person

Jan Leike
Former head of OpenAI Superalignment; now at Anthropic
Co-led OpenAI's Superalignment team with Ilya Sutskever. Resigned in May 2024, stating that 'safety culture and processes have taken a backseat to shiny products'. Now runs alignment science at Anthropic.
Profile
expertise
Frontier builder
Currently or recently led training, architecture, or safety work on a frontier model. Hands on the loss curve.
Former co-lead of OpenAI's Superalignment team; now leads alignment science at Anthropic. Long publication record in scalable oversight, debate, and recursive reward modelling.
recognition
Field-leading
Widely known inside the AI and AI-safety community. Appears repeatedly in top venues, podcasts, or governance forums. Not a household name to outsiders.
Public resignation from OpenAI in May 2024 was widely covered. Known across the field; not a household name.
vintage
Deep-learning rise
Came up post-AlexNet. ImageNet, AlphaGo, transformer paper. DeepMind, Google Brain, FAIR establish the modern lab template.
DeepMind 2015–2021, OpenAI Superalignment 2021–2024. Came up in the deep-learning era; his framing matured during the scaling era.
Hand-classified. See the board for the criteria and the full grid.
p(doom)
- 10–90% · 2023-08
Definition used: Large uncertainty range cited in interview.
Jan Leike interview (PauseAI citation) · YouTube
Strategy positions
Alignment first · endorses
Solve technical alignment before capability thresholds close.
Argues alignment research must scale with capabilities; publicly resigned when he felt this ratio was violated.
“Over the past years, safety culture and processes have taken a backseat to shiny products.”
Context: Resignation thread on X/Twitter.
Closest strategy neighbours
by Jaccard overlap
Other people whose strategy tags overlap with Jan Leike's. Overlap is on tag identity, not stance; opposites can show up if they reference the same tags.
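
A minimal sketch of that neighbour ranking, assuming plain Jaccard similarity over strategy-tag sets. The tag strings, people, and function are hypothetical illustrations, not the board's actual code or data:

def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap of two tag sets: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical strategy-tag sets; tags are compared by identity, not stance.
tags = {
    "Jan Leike": {"alignment-first", "scalable-oversight"},
    "Person A": {"alignment-first", "pause"},
    "Person B": {"alignment-first", "racing"},  # references the same tag with the opposite stance
}

target = tags["Jan Leike"]
neighbours = sorted(
    ((name, jaccard(target, t)) for name, t in tags.items() if name != "Jan Leike"),
    key=lambda pair: pair[1],
    reverse=True,
)
print(neighbours)  # both neighbours score 1/3, sharing one tag out of three

Note how a tag-identity metric surfaces Person B even though their stance on the shared tag opposes Leike's, which is why opposites can appear in the list.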
Record last updated 2026-04-24.