person

Stephen Casper
MIT PhD researcher; red-teaming and model auditing
MIT algorithmic alignment researcher focused on red-teaming, auditing, and interpretability. Has documented how safeguards at current frontier labs are reliably broken by determined red-teamers.
current PhD candidate, Algorithmic Alignment Group, MIT CSAIL
Strategy positions
Evals-driven: endorses
Capability/risk evals gate deployment; evals are the load-bearing artefact. Argues that empirical red-teaming reveals current safeguards are not robust, and that auditing must become standard infrastructure.
"Example after example of state-of-the-art safeguards get pretty reliably broken. That's the empirical reality."
Closest strategy neighbours
By Jaccard overlap. Other people whose strategy tags overlap with Stephen Casper's. Overlap is on tag identity, not stance; opposites can show up if they reference the same tags.
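A minimal sketch of how such a tag-based neighbour score might be computed, assuming each person's strategy tags are stored as a set of strings (the function name, tag names, and example sets here are hypothetical, not from the record itself):

```python
def jaccard(tags_a: set[str], tags_b: set[str]) -> float:
    """Jaccard overlap: |A ∩ B| / |A ∪ B|. Returns 0.0 when both sets are empty."""
    if not tags_a and not tags_b:
        return 0.0
    return len(tags_a & tags_b) / len(tags_a | tags_b)

# Hypothetical tag sets; overlap is computed on tag identity only,
# so two people with opposite stances on the same tag still match.
casper_tags = {"evals-driven"}
other_tags = {"evals-driven", "interpretability"}
print(jaccard(casper_tags, other_tags))  # 1 shared tag of 2 total -> 0.5
```

Because the score ignores stance, a person who *rejects* evals-driven strategy would score just as close to Casper as one who endorses it, which is exactly the caveat the record notes.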
Record last updated 2026-04-25.