person

Asma Ghandeharioun

Google DeepMind; 'Patchscopes' for LLM interpretability

Senior research scientist at Google DeepMind; lead author of Patchscopes, a unifying framework for using language models to inspect their own internal representations.

Current: Senior Research Scientist, Google DeepMind

@asmadotgh

Strategy positions

Interpretability bet · endorses

Mechanistic interpretability is necessary and sufficient to know models are safe

Argues language models can be turned into interpretability tools for themselves; reframes mechanistic interpretation as a translation problem between hidden states and natural language.

“Patchscopes leverage the model's own ability to generate text to inspect its hidden representations, unifying many prior interpretability methods.”

§ paper · Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models · arXiv / Google DeepMind · 2024 · direct quote

Closest strategy neighbours

by Jaccard overlap

Other people whose strategy tags overlap with Asma Ghandeharioun's. Overlap is computed on tag identity, not stance, so people with opposing views can appear if they carry the same tags.

Chris Olah
shared 1 · J=1.00
Anthropic co-founder leading interpretability; pioneer of modern mech interp
Cynthia Rudin
shared 1 · J=1.00
Duke professor; interpretable ML pioneer
David Bau
shared 1 · J=1.00
Northeastern; mechanistic interpretability of LLMs
Fernanda Viégas
shared 1 · J=1.00
Harvard; ex-Google PAIR; data visualization
Jacob Andreas
shared 1 · J=1.00
MIT NLP; language models as belief reports
John Wentworth
shared 1 · J=1.00
Independent alignment researcher; natural abstractions
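
The J values listed above are consistent with the standard Jaccard index, J(A, B) = |A ∩ B| / |A ∪ B|, computed over each person's set of strategy tags. A minimal Python sketch, assuming that definition; the tag names and the `jaccard` helper are hypothetical and not taken from this record's actual pipeline:

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B|; treated as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical tag sets: each person carries only the one tag, with no stance attached.
ghandeharioun = {"interpretability-bet"}
neighbour = {"interpretability-bet"}
broader = {"interpretability-bet", "some-other-tag"}

shared = len(ghandeharioun & neighbour)
print(f"shared {shared} · J={jaccard(ghandeharioun, neighbour):.2f}")  # shared 1 · J=1.00
print(f"J={jaccard(ghandeharioun, broader):.2f}")                      # J=0.50
```

When both people carry exactly one tag and it is the same one, the index is 1.00, which matches every neighbour listed. A person with one additional tag would score 0.50 instead.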

Record last updated 2026-04-25.