Fazl Barez

AI Safety Researcher

I'm a Research Fellow at the Torr Vision Group (TVG), University of Oxford, where I lead safety research. I'm also a Technology and Security Policy Fellow at RAND and the Co-Director and Head of Research at Apart Research, and I'm affiliated with the Centre for the Study of Existential Risk at the University of Cambridge and the Future of Life Institute. Prior to that, I worked on interpretability at Amazon, on safe recommender systems at Huawei, and on building a budget-management tool for economic scenario forecasting at NatWest Group.

I hold a PhD in AI with a focus on safety and interpretability, and previously earned an MSc in Statistics and a BA (Hons) in Economics.

  • 2023-12-09 I presented Measuring Value Alignment at NeurIPS 2023


I care about making AI systems safe and beneficial to humanity. My machine learning research interests include interpretability, safety, alignment, and evaluation. I am also interested in how knowledge from philosophy, psychology, and neuroscience can advance our understanding of how AI models learn, adapt, distribute, and update knowledge.


* = equal contribution



  • Future of Humanity Institute PhD affiliate
  • EPSRC PhD scholarship (full tuition and stipend)
  • MSc scholarship (full tuition)
  • BA (Hons) sports performance scholarship (partial tuition and stipend)


  • I co-founded and help run AISHED, I run the AI safety reading group at Oxford, and I mentor students from disadvantaged backgrounds (if you are one, please get in touch).
  • I have been a reviewer for NeurIPS, ICLR, ICML, ACL, EMNLP, EACL, and NAACL.
  • I have also organised workshops at EACL.


In the past, I have consulted for think tanks, charities, and education and financial firms.