person

Buck Shlegeris

CEO of Redwood Research; 'AI control' research lead

Researcher behind the 'AI control' research agenda: designing protocols that remain safe even if the AI being supervised is scheming. Frames safety as a defence problem that can be solved by cheaper means than alignment proper.

current CEO, Redwood Research

homepage @bshlgrs

Profile

expertise

Deep technical

Sustained peer-reviewed contribution to ML, alignment, interpretability, or safety techniques. Could review a frontier paper.

CEO of Redwood Research. Publishes actively on AI control, scheming, and scalable oversight.

recognition

Established

Reliable, recognised voice within their specific subfield. Cited and invited but not central to general AI discourse.

Recognised inside the alignment community.

vintage

Scaling era

Worldview formed during GPT-2/3, scaling laws, Anthropic's founding. Pre-ChatGPT but post-deep-learning. The 'scale is all you need' debate is live.

MIRI early 2010s; Redwood Research founded 2021. AI control work matures in the scaling era.

Hand-classified. See the board for the criteria and the full grid.

Strategy positions

Alignment first · mixed

Solve technical alignment before capability thresholds close

Advocates 'AI control', protocol-level safety assuming the worst about model intentions, as a complement to direct alignment.

We should design AI deployment protocols that remain safe even if our AIs are trying to subvert them.

§ paper · The case for ensuring that powerful AIs are controlled · AI Alignment Forum · 2023-12 · faithful paraphrase

Closest strategy neighbours

by Jaccard overlap

Other people whose strategy tags overlap with Buck Shlegeris's. Overlap is on tag identity, not stance; opposites can show up if they reference the same tags.

Aaron Courville
shared 1 · J=1.00
Université de Montréal; Deep Learning textbook co-author
Adam Jermyn
shared 1 · J=1.00
Anthropic; previously astrophysics
Adam Kalai
shared 1 · J=1.00
Microsoft Research; AI fairness and safety
Agnes Callard
shared 1 · J=1.00
University of Chicago philosopher; aspiration theorist
Ajeya Cotra
shared 1 · J=1.00
Open Philanthropy researcher; 'biological anchors' forecaster
Alan Turing
shared 1 · J=1.00
Founder of theoretical computer science (1912–1954)
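
How the J values in this list can arise: a minimal sketch, assuming each person's strategy tags are stored as a plain set of tag names with stance ignored, as the note on tag identity suggests. The function name and the tag name used here are hypothetical and are not taken from the site's actual code.

```python
def jaccard(tags_a: set[str], tags_b: set[str]) -> float:
    """Jaccard overlap |A ∩ B| / |A ∪ B|; returns 0.0 when both sets are empty."""
    union = tags_a | tags_b
    if not union:
        return 0.0
    return len(tags_a & tags_b) / len(union)


# Hypothetical tag sets: one tag each, same tag, stance not compared.
person = {"alignment-first"}
neighbour = {"alignment-first"}

shared = len(person & neighbour)
print(f"shared {shared} · J={jaccard(person, neighbour):.2f}")
```

Under this assumption, any two people who each carry only the same single tag score J=1.00, which would explain why every neighbour listed shows the maximum overlap regardless of stance.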

Record last updated 2026-04-24.