person

Gabriel Mukobi

Stanford alignment researcher

Stanford master's student turned alignment researcher; co-author of Cicero (Meta's negotiation AI) and of multiple safety-evaluation papers.

current · Researcher, Stanford University

@gabe_mukobi

Strategy positions

Evals-driven · endorses

Capability/risk evals gate deployment; evals are the load-bearing artefact

Argues empirical evaluations of advanced AI behaviour, particularly around deception and strategic reasoning, are the surest way to reveal capability progress that matters for safety.

Cicero shows that human-level negotiation is achievable today. The next question is whether the same techniques produce systems that strategically deceive humans, and how we would tell.

¶ article · Gabriel Mukobi, Stanford · gmukobi.com · 2023 · faithful paraphrase

Closest strategy neighbours

by Jaccard overlap

Other people whose strategy tags overlap with Gabriel Mukobi's. Overlap is on tag identity, not stance; opposites can show up if they reference the same tags.

Aleksander Mądry
shared 1 · J=1.00
MIT; ex-OpenAI head of preparedness
Alex Meinke
shared 1 · J=1.00
Apollo Research; deceptive alignment evaluations
Ali Rahimi
shared 1 · J=1.00
Google Brain ML researcher; 'Alchemy' speech
Anna Rogers
shared 1 · J=1.00
IT University of Copenhagen; LLM benchmarking critique
Arati Prabhakar
shared 1 · J=1.00
White House OSTP director (2022–2025)
Beth Barnes
shared 1 · J=1.00
Founder of METR; dangerous capability evaluations
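
As a rough illustration of how the "shared" and "J" figures in the listing above could be computed, here is a minimal sketch. It assumes J is the standard Jaccard index over sets of strategy tags, which this record does not confirm. The function name, the tag spellings, and the extra tag are hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Standard Jaccard index |a ∩ b| / |a ∪ b|.

    Returns 0.0 when both sets are empty (an assumed convention).
    """
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Assumed tag sets: each person carries the single tag "evals-driven".
mukobi = {"evals-driven"}
neighbour = {"evals-driven"}

shared = len(mukobi & neighbour)
print(f"shared {shared} · J={jaccard(mukobi, neighbour):.2f}")  # shared 1 · J=1.00

# A hypothetical neighbour with one additional, unshared tag scores lower.
wider = {"evals-driven", "hypothetical-other-tag"}
print(f"J={jaccard(mukobi, wider):.2f}")  # J=0.50
```

Under this reading, every neighbour listed with "shared 1 · J=1.00" would hold exactly the same single tag as Gabriel Mukobi. The score compares tag identity only, so it does not say whether two people take the same stance on that tag.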

Record last updated 2026-04-25.