person
Paul Christiano
Head of the US AI Safety Institute's safety team; former OpenAI alignment lead
Key architect of RLHF and of much of modern alignment theory. Founded the Alignment Research Center; now runs safety at the US AI Safety Institute inside NIST. Publicly estimates ~46% chance of doom.
Profile
expertise
Frontier builder
Currently or recently led training, architecture, or safety work on a frontier model. Hands on the loss curve.
Invented RLHF (the technique behind ChatGPT's instruction tuning). Founded the Alignment Research Center. Now heads safety at the US AI Safety Institute (NIST).
recognition
Field-leading
Widely known inside the AI and AI-safety community. Appears repeatedly in top venues, podcasts, or governance forums. Not a household name to outsiders.
A universally recognized name in AI safety research; his NIST appointment drew policy-press coverage. Not a household name outside the field.
vintage
Scaling era
Worldview formed during GPT-2/3, scaling laws, Anthropic's founding. Pre-ChatGPT but post-deep-learning. The 'scale is all you need' debate is live.
Joined OpenAI in 2017, the same year he introduced RLHF. Founded ARC in 2021. His career is scaling-era applied alignment.
Hand-classified. See the board for the criteria and the full grid.
p(doom)
~46% · 2023-04-27
Definition used: Approximately 46% chance of an extremely bad outcome, in his LessWrong post decomposing takeover and non-takeover catastrophes.
My views on doom · LessWrong
Strategy positions
Alignment first · endorses
Solve technical alignment before capability thresholds close. Canonical modern alignment researcher; works on debate, RLHF, and eliciting latent knowledge.
I'd guess something like a 20% chance of an AI takeover, with many of the humans dead, and a further 30% chance or so of serious irreversible problems short of takeover.
Context: From his LessWrong post 'My views on doom'.
Closest strategy neighbours
by Jaccard overlap. Other people whose strategy tags overlap with Paul Christiano's. Overlap is on tag identity, not stance; opposites can show up if they reference the same tags.
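The neighbour ranking described above is Jaccard overlap on tag identity: the size of the intersection of two tag sets divided by the size of their union. A minimal sketch (the tag names and helper below are illustrative assumptions, not this site's actual data or code):

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap between two tag sets: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 0.0  # convention: two empty sets have zero overlap
    return len(a & b) / len(a | b)

# Hypothetical tag sets for illustration only:
christiano_tags = {"alignment-first", "rlhf", "debate"}
other_tags = {"alignment-first", "debate", "pause"}

# Shares 2 tags out of 4 distinct tags total → 0.5.
print(jaccard(christiano_tags, other_tags))
```

Note that because the metric compares tag identity only, two people who *reference* the same strategy tag with opposite stances still score as neighbours, exactly as the caveat above states.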
Record last updated 2026-04-24.