person
Fabien Roger
Anthropic alignment researcher; control evaluations
Anthropic alignment researcher whose work on AI control (designing evaluations to test whether models can subvert oversight even when they are trying to) has been widely cited in safety circles.
Current role: Member of Technical Staff, Anthropic
Strategy positions
Evals-driven (endorses)
Capability/risk evals gate deployment; evals are the load-bearing artefact. Argues control evaluations (stress-testing whether AIs can subvert their own monitoring) are a load-bearing part of any sensible deployment regime.
AI control is the discipline of designing protocols that catch a model trying to subvert oversight, even when the model is much more capable than its monitors at the relevant tasks.
Closest strategy neighbours
By Jaccard overlap. Other people whose strategy tags overlap with Fabien Roger's. Overlap is on tag identity, not stance; opposites can show up if they reference the same tags.
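A minimal sketch of the Jaccard overlap the neighbour list describes: similarity is the size of the shared tag set divided by the size of the combined tag set. The tag names below are hypothetical illustrations, not the site's real data.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined as 0.0 for two empty sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical strategy-tag sets for two people (illustrative only):
roger_tags = {"evals-driven", "control", "stress-testing"}
other_tags = {"evals-driven", "control", "interpretability"}

# 2 shared tags out of 4 distinct tags total:
print(jaccard(roger_tags, other_tags))  # → 0.5
```

Because the metric compares tag identity only, two people who *disagree* about the same tags (one endorses, one rejects) still score as close neighbours, which is the caveat the note above flags.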
Record last updated 2026-04-25.