AGI Strategies

person

Nicholas Carlini

Anthropic adversarial-ML researcher; ex-Google Brain

Adversarial machine learning researcher, widely cited for demonstrating memorization, jailbreak, and privacy attacks on large models. Argues that current LLM safety is unusually fragile compared to mature security fields.

current · Member of Technical Staff, Anthropic
past · Research Scientist, Google Brain / DeepMind (2018–2024)

Strategy positions

Security mindset · endorses

Treat safety as adversarial security; assume systems break under attack

Argues that ML systems are routinely broken by simple attacks and that the field subjects safety claims to insufficient adversarial scrutiny.

I think the difficulty of attacking machine learning models is grossly overestimated, and the difficulty of defending them grossly underestimated.
blog · Why I attack neural networks · nicholas.carlini.com · 2024 · faithful paraphrase

Closest strategy neighbours

by Jaccard overlap

Other people whose strategy tags overlap with Nicholas Carlini's. Overlap is computed on tag identity, not stance, so people with opposing views can appear if they reference the same tags. A minimal sketch of the overlap computation follows the list.

  • Daniel Kang

    shared 1 · J=1.00

    UIUC; LLM agents and AI security

  • Nicolas Papernot

    shared 1 · J=1.00

    U Toronto / Vector Institute; ML privacy and security

  • Riley Goodside

    shared 1 · J=1.00

    Scale AI; prompt engineering pioneer

  • Simon Willison

    shared 1 · J=1.00

    Independent developer; co-creator of Django; LLM tools

  • Vitaly Shmatikov

    shared 1 · J=1.00

    Cornell Tech; ML privacy and security
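
The ranking above is by Jaccard overlap between strategy-tag sets. Below is a minimal sketch of that computation, assuming each person is stored as a set of tag identifiers; the names and tag strings are illustrative placeholders, not values taken from the record.

  # Minimal sketch of the neighbour ranking, assuming each person maps to a
  # set of strategy-tag identifiers. Tag strings here are hypothetical.

  def jaccard(a: set[str], b: set[str]) -> float:
      """Jaccard overlap |A intersect B| / |A union B| between two tag sets."""
      if not a and not b:
          return 0.0
      return len(a & b) / len(a | b)

  # Hypothetical tag sets: one shared tag per person, mirroring "shared 1 · J=1.00".
  tags = {
      "Nicholas Carlini": {"security-mindset"},
      "Daniel Kang": {"security-mindset"},
      "Nicolas Papernot": {"security-mindset"},
  }

  me = tags["Nicholas Carlini"]
  neighbours = sorted(
      (
          (name, len(me & t), jaccard(me, t))
          for name, t in tags.items()
          if name != "Nicholas Carlini"
      ),
      key=lambda row: row[2],  # rank by Jaccard score, highest first
      reverse=True,
  )
  for name, shared, j in neighbours:
      print(f"{name}: shared {shared} · J={j:.2f}")

Note that a score of J=1.00 means the two tag sets are identical; with single-tag profiles, any shared tag yields J=1.00, which is why every neighbour above ties at the same score.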

Record last updated 2026-04-25.