person

Beth Barnes

Founder of METR; dangerous capability evaluations

Formerly at ARC Evals; now runs METR, which designs and runs frontier model evaluations for dangerous capabilities. Central figure in the evals-driven governance ecosystem.

current Founder and CEO, METR (Model Evaluation & Threat Research)

homepage @bethmaybarnes

Profile

expertise

Deep technical

Sustained peer-reviewed contribution to ML, alignment, interpretability, or safety techniques. Could review a frontier paper.

Founder/CEO of METR (Model Evaluation & Threat Research). Designed many frontier-model dangerous-capability evals. Earlier OpenAI alignment.

recognition

Field-leading

Widely known inside the AI and AI-safety community. Appears repeatedly in top venues, podcasts, or governance forums. Not a household name to outsiders.

Recognised in evals and safety circles. Less mainstream press.

vintage

Scaling era

Worldview formed during GPT-2/3, scaling laws, Anthropic's founding. Pre-ChatGPT but post-deep-learning. The 'scale is all you need' debate is live.

Earlier OpenAI alignment ~2020. Founded METR 2022. Career maps to the scaling-era evals push.

Hand-classified. See the board for the criteria and the full grid.

Strategy positions

Evals-drivenendorses

Capability/risk evals gate deployment; evals are the load-bearing artefact

Designs autonomous-task evaluations that labs and governments rely on to gauge whether models cross dangerous thresholds.

If we are going to trust safety commitments, we need evaluations that are independent, reproducible, and well-funded.

¶ articleMETR, About· METR· 2024· faithful paraphrase

Closest strategy neighbours

by jaccard overlap

Other people whose strategy tags overlap with Beth Barnes's. Overlap is on tag identity, not stance; opposites can show up if they reference the same tags.

Aleksander Mądry
shared 1 · J=1.00
MIT; ex-OpenAI head of preparedness
Alex Meinke
shared 1 · J=1.00
Apollo Research; deceptive alignment evaluations
Ali Rahimi
shared 1 · J=1.00
Google Brain ML researcher; 'Alchemy' speech
Anna Rogers
shared 1 · J=1.00
IT University of Copenhagen; LLM benchmarking critique
Arati Prabhakar
shared 1 · J=1.00
White House OSTP director (2022–2025)
Bo Li
shared 1 · J=1.00
UChicago / UIUC; AI safety evaluations

Record last updated 2026-04-24.