AI Safety · Alignment · Reasoning

Reza Aghajani

Research Scientist at Meta Superintelligence Lab, working on safety, alignment, and reasoning capabilities of agentic AI systems. Previously at Google DeepMind and YouTube, where I built LLM-based agents and models to combat misinformation at scale. Ph.D. in Applied Mathematics from Brown University, with 8+ years of experience developing and serving large-scale AI models.

Agentic AI Safety · Chain-of-Thought · LLM Post-Training · Reinforcement Learning · Stochastic Processes
AI Safety
Agentic AI Safety via Chain-of-Thought Monitoring
Developing robust safety frameworks for autonomous AI agents by monitoring intermediate reasoning steps. Extends the Deliberative Alignment method to agentic safety through SFT- and RL-based post-training.
Publications
1
MSL Preparedness & Red Teaming & Alignment Team (incl. R. Aghajani), Meta
Meta Technical Report · May 2026
LLM Post-Training
Post-Training LLM Agents via Collaborative Self-Play
New paradigms for improving the capabilities of LLM-based assistant agents by training them in collaborative game settings, covering both social capabilities and steerable clarification policies.
Publications
1
J. Berant, M. Chen, A. Fisch, R. Aghajani, F. Huot, M. Lapata, J. Eisenstein
arXiv:2512.04068 · December 2025
2
Don't Lie to Your Friends: Learning What You Know from Collaborative Self-Play
J. Eisenstein, R. Aghajani, A. Fisch, D. Dua, F. Huot, M. Lapata, V. Zayats, J. Berant
Submitted · arXiv
Reasoning
Enhanced Reasoning Capabilities for AI Assistants
Improved reasoning and tool use via advanced RL techniques. Designed AI Raters to facilitate RLAIF by providing machine feedback in domains lacking objective ground truth.
GenAI Detection
GenAI Image Detection
Led end-to-end development and launch of a GenAI image detector using Vision Transformers and diffusion models. Deployed for Google Image Search and Google Ads Safety to enforce content policies on AI-generated images.
Trust & Safety
Reducing Harmful Misinformation on YouTube
Developed and served deep neural network and NLP-based models to reduce the prevalence of harmful misinformation in YouTube's recommendation systems. Designed a novel paradigm for unbiased evaluation of model changes.
Math / Theory
Analysis of Large-Scale Networks in Heavy Traffic
Analyzed load-balancing algorithms for large-scale Stochastic Networks using SPDEs. Solved a 25-year-old open problem on quality-of-service of many-server queueing networks under heavy traffic.
Publications
1
The Limit of Stationary Distributions of Many-Server Queues in the Halfin-Whitt Regime
R. Aghajani, K. Ramanan
Mathematics of Operations Research, Vol. 45, No. 3 · 2020
2
The Hydrodynamic Limit of a Randomized Load Balancing Network
R. Aghajani, K. Ramanan
Annals of Applied Probability, Vol. 29, No. 4 · 2019
3
Ergodicity of an SPDE Associated with a Many-Server Queue
R. Aghajani, K. Ramanan
Annals of Applied Probability, Vol. 29, No. 2 · 2019
4
The PDE Method for the Analysis of Randomized Load Balancing Networks
R. Aghajani, X. Li, K. Ramanan
Proc. ACM Meas. Anal. Comput. Syst., Vol. 1, No. 2 · 2018
5
Large Scale Analysis of Unreliable Stochastic Networks
R. Aghajani, P. Robert, W. Sun
Annals of Applied Probability, Vol. 28, No. 2 · 2018
6
Generalized Majorization-Minimization
S. Parizi, K. He, R. Aghajani, S. Sclaroff, P. Felzenszwalb
ICML 2019

Full list on Google Scholar →

2025 – Present
Senior Research Scientist
Meta Superintelligence Lab · Menlo Park, CA
Focusing on safety and alignment of advanced agentic AI systems. Researching novel methods for monitoring and controlling autonomous agent behavior through reasoning analysis.
2023 – 2025
Senior Research Scientist
Google DeepMind · San Francisco, CA
Developed LLM-based assistant agents and multimodal GenAI models, as part of the Foundation Research unit, to protect Google products against misleading information and manipulated media.
2019 – 2023
Senior Software Engineer
YouTube (Google) · San Bruno, CA
Modeling tech lead in YouTube's Search and Discovery. Developed and deployed state-of-the-art NLP and vision models to reduce harmful misinformation on YouTube's recommendation platform.
2016 – 2019
Visiting Assistant Professor
UC San Diego, Dept. of Mathematics · La Jolla, CA
Research on Stochastic Networks. Teaching Statistics, Probability, and Mathematics of Finance.
Summer 2014
Visiting Researcher
INRIA · Paris, France
Research visit at the Institute for Research in Computer Science.