I'm a Research Fellow at the Torr Vision Group (TVG), University of Oxford, where I lead safety research. I'm also a Technology and Security Policy Fellow at RAND and the Co-Director and Head of Research at Apart Research, and I'm affiliated with the Centre for the Study of Existential Risk at the University of Cambridge and the Future of Life Institute. Before that, I worked on interpretability at Amazon, on safe recommender systems at Huawei, and on building a financial budget-management tool for economic scenario forecasting at NatWest Group.
I hold a PhD in AI with a focus on safety and interpretability, and previously completed an MSc in Statistics and a BA (Hons) in Economics.
I care about making AI systems safe and beneficial to humanity. My machine learning research interests include interpretability, safety, alignment, and evaluation. I am also interested in how philosophy, psychology, and neuroscience can advance our understanding of how AI models learn, adapt, distribute, and update knowledge.
In the past, I have consulted for think tanks, charities, and education and financial firms.