person

Jeffrey Ladish
Executive Director of Palisade Research
AI security researcher who demonstrates how easily safety fine-tuning can be removed from open-weight models. Advocates for strict controls on frontier model distribution and for treating weights as hazardous.
Profile
expertise
Applied technical
Technical fluency from an adjacent field (security, robotics, formal methods, statistics) or applied AI work, but not on frontier loss curves or core ML theory.
Executive Director of Palisade Research. Security background; runs frontier-model offensive-cyber evals. Practitioner, not a core ML researcher.
recognition
Established
Reliable, recognised voice within their specific subfield. Cited and invited but not central to general AI discourse.
Recognised in safety/policy community for Palisade demonstrations.
vintage
Scaling era
Worldview formed during GPT-2/3, scaling laws, Anthropic's founding. Pre-ChatGPT but post-deep-learning. The 'scale is all you need' debate is live.
Anthropic security 2021–2023, Palisade Research 2023. Career maps to scaling-era safety operationalisation.
Hand-classified. See the board for the criteria and the full grid.
Strategy positions
Closed weights: endorses
Keep frontier weights closed; treat them as hazardous artefacts.
Empirically demonstrates removing RLHF safety fine-tuning from open-weight models, arguing that releasing weights is irreversible proliferation.
Releasing the weights of a frontier model is a one-way operation. You cannot uninvent a weapon.
Closest strategy neighbours
By Jaccard overlap. Other people whose strategy tags overlap with Jeffrey Ladish's. Overlap is measured on tag identity, not stance, so people with opposing views can appear if they reference the same tags.
Record last updated 2026-04-24.