Strategy positions
Capability/risk evals gate deployment; evals are the load-bearing artefact
Measures empirical trends in autonomous-task capability as the quantitative backbone of deployment-risk reasoning.
The length of autonomous tasks frontier models can complete has been roughly doubling every 4 to 7 months.
Article: METR capability evaluations · METR · 2025 · faithful paraphrase
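As a rough illustration of what a 4 to 7 month doubling time implies over a year, here is a minimal sketch. It assumes clean exponential growth, which is a simplification made for this example rather than a claim from the cited article, and the function name `growth_factor` is hypothetical.

```python
def growth_factor(elapsed_months: float, doubling_months: float) -> float:
    """Multiplicative growth after `elapsed_months`, assuming the quantity
    doubles every `doubling_months` (idealised exponential trend)."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 2.0 ** (elapsed_months / doubling_months)


# Both ends of the quoted 4-to-7-month range, projected over 12 months.
for doubling in (4, 7):
    factor = growth_factor(12, doubling)
    print(f"doubling every {doubling} months -> x{factor:.1f} per year")
```

Under this assumption, the two ends of the range correspond to roughly 8x and 3.3x growth in task length per year, which shows how sensitive the yearly figure is to the doubling-time estimate.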
Closest strategy neighbours by Jaccard overlap
Other people whose strategy tags overlap with Michael Chen's. Overlap is on tag identity, not stance; opposites can show up if they reference the same tags.
Aleksander Mądry
MIT; ex-OpenAI head of preparedness
Evals-driven
shared 1 · J=1.00
Alex Meinke
Apollo Research; deceptive alignment evaluations
Evals-driven
shared 1 · J=1.00
Ali Rahimi
Google Brain ML researcher; 'Alchemy' speech
Evals-driven
shared 1 · J=1.00
Anna Rogers
IT University of Copenhagen; LLM benchmarking critique
Evals-driven
shared 1 · J=1.00
Arati Prabhakar
White House OSTP director (2022–2025)
Evals-driven
shared 1 · J=1.00
Beth Barnes
Founder of METR; dangerous capability evaluations
Deep technical · Field-leading · Scaling era
Evals-driven
shared 1 · J=1.00
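The "shared" and "J" figures in the neighbour list can be reproduced with the standard Jaccard index, |A ∩ B| / |A ∪ B|, computed over tag sets. The sketch below is illustrative only: the tag sets are hypothetical, chosen to match a "shared 1 · J=1.00" row, and treating two empty sets as J = 0 is an assumption rather than something this page specifies.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index |A & B| / |A | B| over two tag sets.

    Returns 0.0 when both sets are empty (a convention assumed here).
    """
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical strategy-tag sets: each person carries the single tag
# "Evals-driven", so the intersection and the union are the same set.
profile_tags = {"Evals-driven"}
neighbour_tags = {"Evals-driven"}

shared = len(profile_tags & neighbour_tags)
print(f"shared {shared} · J={jaccard(profile_tags, neighbour_tags):.2f}")
# prints: shared 1 · J=1.00
```

Because the score depends only on which tags two people share, two people who take opposite stances on the same tag would still score J = 1.00, which is the caveat the section description makes.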
Record last updated 2026-04-25.