AGI Strategies

person

Gabriel Mukobi

Stanford alignment researcher

Stanford master's student turned alignment researcher; co-author of Cicero (Meta's negotiation AI) and of multiple safety-evaluation papers.

Current: Researcher, Stanford University

Strategy positions

Evals-driven · endorses

Capability/risk evals gate deployment; evals are the load-bearing artefact

Argues empirical evaluations of advanced AI behaviour, particularly around deception and strategic reasoning, are the surest way to reveal capability progress that matters for safety.

Cicero shows that human-level negotiation is achievable today. The next question is whether the same techniques produce systems that strategically deceive humans, and how we would tell.
article · Gabriel Mukobi, Stanford · gmukobi.com · 2023 · faithful paraphrase

Closest strategy neighbours

by Jaccard overlap

Other people whose strategy tags overlap with Gabriel Mukobi's. Overlap is on tag identity, not stance; opposites can show up if they reference the same tags.

  • Aleksander Mądry

    shared 1 · J=1.00

    MIT; ex-OpenAI head of preparedness

  • Alex Meinke

    shared 1 · J=1.00

    Apollo Research; deceptive alignment evaluations

  • Ali Rahimi

    shared 1 · J=1.00

    Google Brain ML researcher; 'Alchemy' speech

  • Anna Rogers

    shared 1 · J=1.00

    IT University of Copenhagen; LLM benchmarking critique

  • Arati Prabhakar

    shared 1 · J=1.00

    White House OSTP director (2022–2025)

  • Beth Barnes

    shared 1 · J=1.00

    Founder of METR; dangerous capability evaluations
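
For readers unfamiliar with the score, the "shared" count and J value in the list above can be reproduced with a minimal sketch, assuming J is the standard Jaccard index (size of the intersection divided by size of the union) over each person's set of strategy tags. The function name `jaccard` and the tag sets below are hypothetical illustrations, not the site's actual code or data; only the tag name "evals-driven" comes from this record.

```python
def jaccard(tags_a: set[str], tags_b: set[str]) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B|; defined here as 0.0 if both sets are empty."""
    union = tags_a | tags_b
    if not union:
        return 0.0
    return len(tags_a & tags_b) / len(union)


# Hypothetical tag sets: each person carries exactly one tag, and it is the same one.
person = {"evals-driven"}
neighbour = {"evals-driven"}

shared = len(person & neighbour)
print(f"shared {shared} · J={jaccard(person, neighbour):.2f}")  # shared 1 · J=1.00

# A neighbour with one extra (hypothetical) tag scores lower: 1 shared / 2 in the union.
print(f"J={jaccard(person, {'evals-driven', 'governance-first'}):.2f}")  # J=0.50
```

Under this assumption, "shared 1 · J=1.00" means both people carry exactly one tag and it is the same tag. Because the score compares tag identity only, it says nothing about whether the two people endorse or oppose the tagged strategy.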

Record last updated 2026-04-25.