person

Nicholas Carlini

Anthropic adversarial-ML researcher; ex-Google Brain

Adversarial machine learning researcher widely cited for memorization, jailbreak, and privacy attacks on large models. Argues current LLM safety is unusually fragile compared to mature security fields.

Current: Member of Technical Staff, Anthropic

Past: Research Scientist, Google Brain / DeepMind (2018–2024)

Strategy positions

Security mindset · endorses

Treat safety as adversarial security; assume systems break under attack

Argues ML systems are routinely broken by simple attacks and that the field treats safety claims with insufficient adversarial scrutiny.

I think the difficulty of attacking machine learning models is grossly overestimated, and the difficulty of defending them grossly underestimated.

✍ blog · Why I attack neural networks · nicholas.carlini.com · 2024 · faithful paraphrase

Closest strategy neighbours

by Jaccard overlap

Other people whose strategy tags overlap with Nicholas Carlini's. Overlap is on tag identity, not stance; opposites can show up if they reference the same tags.

Daniel Kang
shared 1 · J=1.00
UIUC; LLM agents and AI security
Nicolas Papernot
shared 1 · J=1.00
U Toronto / Vector Institute; ML privacy and security
Riley Goodside
shared 1 · J=1.00
Scale AI; prompt engineering pioneer
Simon Willison
shared 1 · J=1.00
Independent developer; co-creator of Django; LLM tools
Vitaly Shmatikov
shared 1 · J=1.00
Cornell Tech; ML privacy and security
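
The J values above can be read as the standard Jaccard index over sets of strategy tags: the number of shared tags divided by the number of distinct tags across both people. The sketch below is a minimal illustration under that assumption; the record does not document how the site actually computes the score, and the tag names and the `jaccard` helper are hypothetical. Under this reading, "shared 1 · J=1.00" implies that both people carry exactly the same single tag.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index |a & b| / |a | b|; defined here as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical tag sets; the real tag identifiers are not given in the record.
carlini = {"security-mindset"}
same_single_tag = {"security-mindset"}
one_extra_tag = {"security-mindset", "another-tag"}

for other in (same_single_tag, one_extra_tag):
    shared = len(carlini & other)
    print(f"shared {shared} · J={jaccard(carlini, other):.2f}")
# prints:
# shared 1 · J=1.00
# shared 1 · J=0.50
```

Because the score is computed on tag identity alone, two people who take opposite stances on the same tag would still score J=1.00 in this sketch.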

Record last updated 2026-04-25.