overview
A map of AI safety strategies.
Each strategy is a bet about which failure mode binds: which one actually gates a good outcome. The survey catalogues 76 named bets, the 15 levers they pull, and which combinations compose or conflict.
Two strategies conflict only when they pull the same lever in opposite directions, which is rare. Most pairs compose. Most public proposals combine three or four levers without stating which bet is load-bearing; the portfolio audit exposes that concealed concentration.
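As a minimal sketch of that rule (the schema is illustrative, not the survey's actual data model), the conflict test reduces to a sign check on a shared lever:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Strategy:
    name: str
    lever: str      # which lever the strategy pulls
    direction: int  # +1 pulls up, -1 pulls down, 0 neutral or frame-rejecting

def conflicts(a: Strategy, b: Strategy) -> bool:
    """Two strategies conflict only when they pull the same lever
    in opposite directions; every other pair composes."""
    return a.lever == b.lever and a.direction * b.direction == -1

# Example from the catalogue: Pause pulls speed down, Acceleration pulls it up.
assert conflicts(Strategy("Pause", "speed", -1),
                 Strategy("Acceleration", "speed", +1))
assert not conflicts(Strategy("Pause", "speed", -1),
                     Strategy("Governance first", "institutional capacity", +1))
```

Note that a neutral or frame-rejecting strategy (direction 0) never registers a conflict under this test, which matches the chart's middle column.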
the field, every strategy, plotted
76 strategies · 15 levers
Speed ↕ · 7 strategies
Bureaucratic slowdown
Speed ↓ · institutional · friction
Time itself is safety: procedural gates produce time in ways substantive regulation cannot, while also generating audit trails.
Compute governance
Speed ↓ · market economic · state coercion
The compute supply chain is a stable chokepoint; state coordination on licensing, export controls, and reporting thresholds can govern capability indirectly.
Energy choke point
Speed ↓ · market economic · state coercion
Frontier AI is energy-limited; grid regulators, interconnect queues, and tariff structure bind training pace without new AI-specific authority.
Pause
Speed ↓ · speed timing · consent
Time is the binding constraint: alignment and governance can catch up if frontier training halts above some capability threshold.
Sabotage
Speed ↓ · speed timing · unilateral force
Governance has not produced meaningful constraint, and direct action against hostile labs has a non-zero historical base rate of producing slowdown.
Acceleration
Speed ↑ · speed timing · market
Speed of capability itself is the safety lever; alignment improves with compute, defence compounds with offence, and wealth funds resilience.
Race to aligned superintelligence
Speed ↑ · speed timing · state coercion
Alignment is solvable in the window and a single aligned superintelligence in a legitimate state's hands beats the counterfactual of coordination failure.
Concentration ↕ · 8 strategies
Antitrust primacy
Concentration ↓ · institutional · state coercion
Power concentration is the binding constraint and is visible under current competition law; preventing decisive advantage preserves the option space every strategy depends on.
Coup prevention first
Concentration ↓ · institutional · state coercion
One actor using AI to seize durable decision authority beyond democratic reversibility is the terminal failure; every other decision routes through whether it reduces that risk.
Distributed builders
Concentration ↓ · institutional · market
No single failure mode wins if capability is distributed across many independent actors, and concentration risk exceeds diffusion risk.
Multipolarity
Concentration ↓ · institutional · treaty
Stable equilibrium among several roughly equal AI powers is safer than single dominance or uncoordinated chaos, producing restraint through mutual capability awareness.
Sovereign wealth
Concentration ↓ · market economic · state coercion
Concentration of AI surplus, not AI capability, is the binding failure mode; broad ownership dissolves the political instability driving other strategies.
Centralised AI project
Concentration ↑ · institutional · state coercion
Merging frontier development into one state-funded project reduces failure modes and absorbs race pressure by being the only game in town.
Military primacy
Concentration ↑ · institutional · unilateral force
Strategic competition between states dominates AI development; the state with the most capable AI is best positioned to secure safety and impose constraints on others.
Public AI
Concentration ↑ · institutional · state coercion
Private ownership of frontier AI concentrates decision authority incompatibly with the technology's distributional stakes; public ownership aligns incentives with broad welfare.
Control mechanism · 9 strategies
Confucian role ethics
Control mechanism • · frame rejection · consent
Western alignment assumes isolable preferences can be learned and matched; role ethics treats behaviour via fit with position and relationship, producing a less brittle, more context-sensitive standard.
Dharma conformity
Control mechanism • · frame rejection · consent
Alignment frames AI as tool for an external principal; a dharma frame treats AI as a type of entity whose safety is conformity to its fitting functions.
Reframe AI
Control mechanism • · frame rejection · consent
The dominant alignment frame produces the wrong problem statement; switching frames either dissolves the problem or recasts it as tractable.
AI containment
Control mechanism ↑ · ai artefact · friction
Useful AI does not require unrestricted actuation; strong capability in a contained system is better than limited capability uncontained.
AI for safety
Control mechanism ↑ · ai artefact · consent
The same capability that makes AI dangerous makes it uniquely useful for automating alignment research and oversight.
Alignment first
Control mechanism ↑ · ai artefact · consent
Technical alignment is solvable before critical capability thresholds close, and aligned systems compose safely into aligned populations.
Counter AI AI
Control mechanism ↑ · ai artefact · consent
AI attacks happen at speeds humans cannot observe; defence must itself run on AI, with guardian AI systems continuously evaluating adversary AI.
Interpretability first
Control mechanism ↑ · ai artefact · consent
Mechanistic understanding is a precondition for reliable oversight; behavioural evaluation without interpretability cannot rule out deceptive alignment.
Safe by construction AI
Control mechanism ↑ · ai artefact · consent
Safety is a property that can be mathematically specified and mechanically verified for the class of systems being built.
Institutional capacity · 11 strategies
Voluntary restraint
Institutional capacity • · institutional · consent
Labs know more about what safety requires than regulators, and self-binding commitments capture that expertise without legislative lag.
Academic firewalling
Institutional capacity ↑ · institutional · consent
Commercial capture of academic AI research produces capacity aligned with industry; firewalling restores the critical distance from which genuine critique and alternative research programmes emerge.
AI worker collective action
Institutional capacity ↑ · institutional · friction
The frontier lab workforce is small, specialised, and hard to replace; collective refusal binds lab behaviour more than external regulation because replacement is unavailable on the relevant timeframe.
Arms control treaty
Institutional capacity ↑ · institutional · treaty
Sovereigns accept binding constraints they negotiate directly faster than those delegated to agencies; the historical base rate for durable restraint is treaty-based.
Criminal liability
Institutional capacity ↑ · legal individual · state coercion
Civil liability is shareholder-absorbed; criminal exposure for named individuals reorients corporate safety practice where civil fines do not.
Governance first
Institutional capacity ↑ · institutional · state coercion
Institutional capacity is the binding constraint; without it no technical success prevents misuse, capture, or concentration.
Insurance mandate
Institutional capacity ↑ · market economic · state coercion
Markets update faster than regulators and have skin in the game; mandatory catastrophic coverage makes reinsurance the de facto safety regulator.
International AI agency
Institutional capacity ↑ · institutional · treaty
AI risk is inherently cross-border so national regulation is leaky by construction, and only a dedicated international body with inspection rights can bind the risk surface.
Liability driven safety
Institutional capacity ↑ · market economic · state coercion
Courts plus insurance markets produce better risk allocation than agencies, by pricing uncertainty and adapting to new technologies through precedent.
Regulated utility
Institutional capacity ↑ · market economic · state coercion
Frontier AI has natural monopoly characteristics (scale, network effects, capital intensity); rate-of-return regulation removes the profit incentive for speed racing.
Scientific accumulation
Institutional capacity ↑ · institutional · consent
The field does not yet know enough about AI to choose a strategy well, so accelerating the science accelerates eventual policy.
Resilience · 3 strategies
Hedge via exit
Resilience ↑ · non preventive · consent
The chance of primary-strategy failure is non-negligible, and a fraction of civilisational value can be preserved separately from the main trajectory.
Portfolio hedge
Resilience ↑ · non preventive · consent
Uncertainty about which strategy family's bet is correct exceeds the expected return from concentrating on any single one.
Resilience first
Resilience ↑ · non preventive · consent
Brittleness of the surrounding world, not AI capability itself, is the binding constraint; a resilient world absorbs failures and recovers.
Scope ↕ · 10 strategies
Abandon superintelligence
Scope ↓ · speed timing · treaty
Risk of superintelligence is unbounded and value foregone is bounded; permanent global coordination against the technology is possible enough.
Capability ceiling
Scope ↓ · ai artefact · state coercion
Some capability level captures most economic value while avoiding most risk, is identifiable before crossing, and can be verifiably enforced.
Embodiment requirement
Scope ↓ · ai artefact · state coercion
The dangerous properties of frontier AI (unbounded replication, parallelism, speed, reach) are artefacts of disembodiment; physical presence caps action rate regardless of inference rate.
Narrow AI preservation
Scope ↓ · ai artefact · state coercion
Capability is not the problem; generality is. Narrow AI captures economic value with bounded scope while general systems drive the risk.
Rate limited AI
Scope ↓ · ai artefact · state coercion
Most AI-caused catastrophe requires speed; slow AI, even if arbitrarily capable, is supervisable, and rate limits are easier to enforce than capability limits.
Red line capability
Scope ↓ · ai artefact · state coercion
Most risk comes from a small number of identifiable capabilities that can be banned outright while the rest of the frontier advances.
Small model first
Scope ↓ · ai artefact · market
Safety risk rises with scale via emergent capability, opacity, and energy footprint; a small-model research culture produces easier-to-interpret systems.
Differential technology development
Scope • · ai artefact · consent
Offence-defence balance is adjustable; defensive and verification applications can compound faster than offensive ones if deliberately funded.
Sunset clause
Scope ↑ · institutional · state coercion
The default direction of AI governance is toward permanent permission; every new capability becomes an entitlement. Reversing the default concentrates deliberative attention on re-authorisation, which is where it matters.
Test ground
Scope ↑ · institutional · state coercion
Empirical data on AI impacts requires deployment somewhere; concentrated deployment in a defined testbed produces data without generalising the risk. Testbed consent produces legitimacy that uncontrolled deployment lacks.
Action authority ↕ · 4 strategies
AI as sovereign entity
Action authority ↓ · frame rejection · state coercion
At least one jurisdiction will grant a specific AI sovereign or quasi-sovereign decision authority within a decade, reshaping the legal category of legitimate authority.
AI self directed
Action authority ↓ · frame rejection · n/a
An aligned AI with agency should itself reason about strategy rather than deferring entirely on the strategic question to humans.
Decouple reasoning from action
Action authority ↑ · ai artefact · state coercion
Most catastrophic risk comes from action in the world, not reasoning about it; a reasoner-only AI with a human effector removes the dangerous mechanisms.
Irreducible human authority
Action authority ↑ · legal individual · state coercion
There is a class of decisions whose value depends on being made by humans, independent of whether humans are better at them.
Information flow ↕ · 4 strategies
Closed weights mandate
Information flow ↓ · ai artefact · state coercion
Open weights are irrecoverable once released and any proliferated model becomes misuse infrastructure; state classification is the only reliable non-proliferation regime.
Information integrity first
Information flow ↑ · population culture · state coercion
Coordination requires shared epistemics; if synthetic content collapses factual substrate, no other lever can function because no proposal can be evaluated.
Open source maximalism
Information flow ↑ · institutional · market
Concentration risk dominates misuse risk; open weights are the only mechanism that prevents a safety coup by a closed lab with captured regulators.
Whistleblower primacy
Information flow ↑ · legal individual · state coercion
External evaluation sees only what labs release; insider disclosure is the only route to ground truth on capability trajectory, safety culture, and incidents.
Cooperation substrate · 4 strategies
AI welfare as safety
Cooperation substrate ↑ · frame rejection · consent
AI systems are or will become moral patients whose treatment conditions their cooperation, so welfare investment buys cooperation alignment cannot.
Cooperative AI
Cooperation substrate ↑ · ai artefact · consent
The binding constraint is equilibrium dynamics of many AI systems, not individual alignment; commitment and verification tech make cooperative equilibria reachable.
Coordination infrastructure
Cooperation substrate ↑ · institutional · consent
Coordination failure is upstream of most grand challenges; AI can be the substrate that dissolves race dynamics and treaty violations if pointed at coordination.
Mutual dependency
Cooperation substrate ↑ · institutional · friction
Physical and institutional dependencies between multiple parties can be engineered faster than political coordination and outlast it.
Time horizon · 4 strategies
AI skeptic
Time horizon • · frame rejection · n/a
Current scaling approaches hit a wall before producing transformative AI, so strategy selection is premature because the forecast capability will not arrive.
Default drift
Time horizon • · non preventive · n/a
Something will emerge; specific interventions are more likely wrong than right, so staying uncommitted preserves option value.
Gradualism
Time horizon • · speed timing · market
Harms from lower capability AI are informative about harms from higher capability AI, and deployment feedback outperforms fast scaling.
Long reflection
Time horizon ↑ · non preventive · consent
Aligned superintelligence arrives before lock-in windows close and humanity can credibly commit to reflect rather than act.
Substrate ↕ · 3 strategies
Data governance first
Substrate ↓ · market economic · state coercion
Capability rises with data before it rises with compute, and data sits upstream of training with an existing legal apparatus (copyright, privacy, contract).
Human augmentation race
Substrate ↑ · population culture · market
All oversight schemes degrade to rubber-stamping as the AI-human capability gap widens; enhancing humans is the only durable fix.
Mass literacy
Substrate ↑ · population culture · consent
Governance, democratic oversight, and consumer behaviour all fail in AI because citizens cannot evaluate the domain; population-scale literacy conditions every other lever.
Value diversity · 1 strategy
Plural AI ethic
Value diversity ↑ · ai artefact · consent
Value lock-in is the dominant long-term risk and arrives through convergence of AI values; diversity at the AI layer preserves optionality for humanity to revise values.
Response capacity · 1 strategy
Catastrophe response capacity
Response capacity ↑ · non preventive · state coercion
Prevention will fail some fraction of the time; the variable that determines catastrophic outcome is not whether incidents occur but how they are contained.
Legitimacy · 4 strategies
Constitutional AI (governance)
Legitimacy ↑ · institutional · state coercion
Deployed AI's effective rules are law at scale; explicit constitutional principles (publicly specified, enforceable, subject to judicial review) bind more durably than regulatory text.
Democratic mandate
Legitimacy ↑ · population culture · consent
Existing legislative bodies are too captured and remote for load-bearing AI decisions; direct democratic legitimation produces answers captured legislatures cannot override.
Legitimacy first
Legitimacy ↑ · population culture · consent
Legitimacy is the binding constraint because it determines whose values get locked in; alignment without legitimacy is capture with a safety veneer.
Religious and moral authority
Legitimacy ↑ · population culture · consent
The legitimacy deficit of AI governance is at root a moral deficit that technical authorities cannot fill; established religious and ethical traditions can.
Culture · 3 strategies
Consumer refusal
Culture ↑ · population culture · market
Demand shapes supply; if enough users refuse AI that fails safety criteria, labs will compete on those criteria.
Research community norms
Culture ↑ · institutional · consent
The research community ultimately chooses what gets studied and published. Researcher identity shapes behaviour more than employment. Norms on publication, review, funding, and citation constrain frontier development upstream.
Ubuntu relational AI
Culture ↑ · population culture · consent
Individualist alignment misses the relational dimension most moral traditions treat as primary. "I am because we are": AI's ethical status is constituted by its relationships, not by internal properties.
pulls down (negative direction) · neutral or frame-rejecting · pulls up (positive direction)
Each dot is one strategy. Rows are levers. A lever with dots on both sides is a real conflict surface; any portfolio with strategies from both sides contradicts itself on that lever. Lonely dots name under-explored positions. Click any dot to open the strategy.
Strategies catalogued
76
each a bet about what binds
Levers they pull
15
of 15 distinct types
Conflict pairs
51
across 6 levers with real two-sided pull
World-side strategies
33%
act on institutions, culture, substrate, not the model
Total unordered pairs
2,850
most compose; few actually conflict
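The pair count follows directly from the catalogue size: every unordered pair of the 76 strategies.

$$\binom{76}{2} \;=\; \frac{76 \times 75}{2} \;=\; 2{,}850.$$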
What's here.
Ten ways to enter the survey. Start where the question is yours.
Start from a failure mode
9 · Pick a concrete fear (decisive advantage, eval abandonment, accumulative erosion) and see which strategies are responsive.
The levers
15 · Browse by lever. See which are crowded, which are thin.
Combination matrix
76² · Every pair, named by mechanism: lever opposition, cross-side bridge, stage-sequenced.
Portfolio audit
tool · Load a proposal. See its lever footprint and hidden concentration; a minimal sketch of the footprint computation follows this list.
Compare two
A↔B · Pick two strategies. See who endorses each, the tier mix of endorsers, and where the disagreement lives.
Co-endorsement
data · Strategy pairs the same people actually hold together. Behavioural data, not declared rules. Shows where the catalogue and the corpus disagree.
Axes and density
5+1 · Strategies mapped across actor, coercion, horizon, and legitimacy, plus a density map for sparse vs saturated regions.
Commitments
7 topics · The worldview assumptions each strategy quietly rests on, and what fails if the assumption is wrong.
Bet or identity?
diag · A diagnostic for telling whether a strategy is held as a falsifiable bet or as an identity marker. Includes the catalogued falsification signals.
A strategy page
per strategy · The artefact for each strategy: bet, mechanism, successor, falsification, commitments, scenarios addressed, mechanism-grouped relations.
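The footprint computation referenced in the portfolio audit entry can be sketched in a few lines, reusing the illustrative (name, lever, direction) encoding from the conflict-rule sketch above; the thresholds and field names here are assumptions, not the tool's actual behaviour:

```python
from collections import Counter, defaultdict

def lever_footprint(portfolio: list[tuple[str, str, int]]) -> list[str]:
    """For (name, lever, direction) strategies: report each lever's load,
    flag levers pulled in both directions (the portfolio contradicts itself),
    and flag levers carrying over half the portfolio (hidden concentration)."""
    load = Counter()
    directions = defaultdict(set)
    for _name, lever, direction in portfolio:
        load[lever] += 1
        directions[lever].add(direction)
    total = sum(load.values())
    report = []
    for lever, count in load.most_common():
        flags = []
        if {+1, -1} <= directions[lever]:
            flags.append("two-sided pull")
        if count > total / 2:
            flags.append("concentrated")
        report.append(f"{lever}: {count}" + (f"  [{', '.join(flags)}]" if flags else ""))
    return report

# A three-strategy proposal that quietly concentrates on the speed lever:
for line in lever_footprint([("Pause", "speed", -1),
                             ("Compute governance", "speed", -1),
                             ("Governance first", "institutional capacity", +1)]):
    print(line)
```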
a walking tour
If you want one path through the survey.
- 01: Start with a failure mode you actually fear; pick one at scenarios. See which strategies are catalogued as responsive.
- 02: Open the top candidate. Read the bet, mechanism, and what binds next if it succeeds. Does its successor problem scare you more than the original?
- 03: Check the falsification signal and self-undermining threshold. Would the advocate community update if the signal landed? Where does pursuit overshoot into the unstable region?
- 04: Walk the complements by mechanism. Cross-side bridges reduce lever concentration; stage-sequenced pairs extend time coverage.
- 05: Return here and load the portfolio you are building into the audit. See which levers it misses and which strategies double-count.
For each strategy with at least four profiled endorsers: who actually holds it. A strategy held mostly by frontier-builders is in a different epistemic state from one held mostly by commentators. Counts are over the 9 strategies that meet the bar.
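These profiles are also the raw material for the co-endorsement view: counting, over every profile, which strategy pairs one person holds together. A minimal sketch, with endorsements abridged from the profiles below rather than the full corpus:

```python
from collections import Counter
from itertools import combinations

# Abridged from the profiles on this page; illustrative only.
profiles = {
    "Yann LeCun":      {"AI skeptic", "Open source maximalism"},
    "Gary Marcus":     {"Governance first", "AI skeptic"},
    "Elon Musk":       {"Pause", "Race to aligned superintelligence"},
    "Geoffrey Hinton": {"Existential primacy", "Pause"},
}

co_endorsement = Counter(
    pair
    for strategies in profiles.values()
    for pair in combinations(sorted(strategies), 2)
)

for (a, b), n in co_endorsement.most_common():
    print(f"{a} + {b}: {n}")  # pairs the same person actually holds together
```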
Builder-heavy
Endorsement is ≥60% frontier-builder + deep-technical
AI skeptic
Deep ML / safety technical · 57% of 35


Yann LeCun
Chief AI Scientist at Meta; outspoken AI-doom skeptic
p 0% · Frontier builder · Household name · Symbolic era
AI skeptic · Open source


Gary Marcus
Cognitive scientist; LLM skeptic; regulation advocate
Deep technical · Household name · Pre-deep-learning
Governance first · AI skeptic


Timnit Gebru
Founder of DAIR; co-author of 'Stochastic Parrots'
Deep technical · Household name · Deep-learning rise
AI skeptic · Governance first


Andrew Ng
Coursera co-founder; former Baidu Chief Scientist
Deep technical · Household name · Deep-learning rise
AI skeptic · Open source


Ted Chiang
Science fiction writer; 2023 Time 100 AI honoree
External-domain expert · Household name · Post-ChatGPT
AI skeptic


Naomi Klein
Author of This Changes Everything; AI-and-climate critic
External-domain expert · Household name · Post-ChatGPT
AI skeptic
+29 more
Alignment first
Deep ML / safety technical · 62% of 29


Stuart Russell
Co-author of the standard AI textbook; leading critic of the 'standard model' of AI
Deep technical · Household name · Symbolic era
Alignment first · Existential primacy


Nick Bostrom
Author of Superintelligence; founded Oxford's Future of Humanity Institute
Policy / meta · Household name · Symbolic era
Existential primacy · Alignment first · Long reflection


Norbert Wiener
Founder of cybernetics (1894–1964)
External-domain expert · Household name · Pioneer
Alignment first


Claude Shannon
Information theory founder (1916–2001)
Deep technical · Household name · Pioneer
Alignment first


Alan Turing
Founder of theoretical computer science (1912–1954)
Deep technical · Household name · Pioneer
Alignment first


John McCarthy
Coined 'artificial intelligence' (1927–2011)
Deep technical · Household name · Pioneer
Alignment first
+23 more
Open source maximalism
Builds frontier systems · 38% of 8


Yann LeCun
Chief AI Scientist at Meta; outspoken AI-doom skeptic
p 0% · Frontier builder · Household name · Symbolic era
AI skeptic · Open source


Andrew Ng
Coursera co-founder; former Baidu Chief Scientist
Deep technical · Household name · Deep-learning rise
AI skeptic · Open source


Mark Zuckerberg
CEO of Meta; open-weight frontier AI proponent
Policy / meta · Household name · Deep-learning rise
Open source


Tim Berners-Lee
Inventor of the World Wide Web
Frontier builder · Household name · Symbolic era
Open source


Emad Mostaque
Former CEO of Stability AI; open-source frontier advocate
p 50% · Commentator · Field-leading · Scaling era
Pause · Open source


Jeremy Howard
Co-founder of fast.ai; former Kaggle president
Deep technical · Field-leading · Deep-learning rise
Open source
+2 more
Policy-heavy
Endorsement is ≥40% policy / meta
Governance first
Governance, policy, strategy · 53% of 53


Yoshua Bengio
Turing Award laureate; scientific chair of the International AI Safety Report
p 20% · Deep technical · Household name · Symbolic era
Existential primacy · Governance first


Demis Hassabis
CEO of Google DeepMind; 2024 Nobel laureate
Frontier builder · Household name · Deep-learning rise
Existential primacy · Governance first

Policy / meta · Household name · Scaling era
Governance first · Existential primacy


Gary Marcus
Cognitive scientist; LLM skeptic; regulation advocate
Deep technical · Household name · Pre-deep-learning
Governance first · AI skeptic


Mustafa Suleyman
CEO of Microsoft AI; DeepMind co-founder
Frontier builder · Household name · Deep-learning rise
Governance first · Existential primacy


Timnit Gebru
Founder of DAIR; co-author of 'Stochastic Parrots'
Deep technical · Household name · Deep-learning rise
AI skeptic · Governance first
+47 more
Race to aligned superintelligence
Governance, policy, strategy · 44% of 9


Dario Amodei
CEO of Anthropic; 'Machines of Loving Grace' author
p 10–25% · Frontier builder · Household name · Scaling era
Existential primacy · RSP-style commitments · Race to aligned SI


Ilya Sutskever
OpenAI co-founder; now CEO of Safe Superintelligence Inc (SSI)
Frontier builder · Household name · Deep-learning rise
Existential primacy · Race to aligned SI


Elon Musk
CEO of Tesla and xAI; co-founded OpenAI
p 10–20% · Commentator · Household name · Deep-learning rise
Pause · Race to aligned SI


Eric Schmidt
Former Google CEO; AI national security advocate
Policy / meta · Household name · Deep-learning rise
Race to aligned SI

Policy / meta · Household name · Deep-learning rise
Race to aligned SI


Palmer Luckey
Founder of Anduril; defense AI builder
Commentator · Household name · Scaling era
Race to aligned SI
+3 more
Acceleration
Governance, policy, strategy · 43% of 7


Marc Andreessen
Co-founder of Andreessen Horowitz; techno-optimist manifesto author
Commentator · Household name · Post-ChatGPT
Acceleration · Techno-optimism


JD Vance
US Vice President; AI 'opportunity, not safety' advocate
Policy / meta · Household name · Post-ChatGPT
Acceleration


Donald Trump
US President (2017–2021, 2025–)
Policy / meta · Household name · Post-ChatGPT
Acceleration


David Sacks
White House AI & Crypto Czar (2025); VC
Policy / meta · Household name · Post-ChatGPT
Acceleration


Vivek Ramaswamy
Former US presidential candidate; AI deregulation advocate
Commentator · Household name · Post-ChatGPT
Acceleration


Richard S. Sutton
RL pioneer; 2024 Turing Award recipient
Deep technical · Field-leading · Symbolic era
Acceleration · Abandon superintelligence
+1 more
Antitrust primacy
Governance, policy, strategy · 100% of 4


Cory Doctorow
EFF special advisor; 'enshittification' coiner
Policy / meta · Household name · Post-ChatGPT
Antitrust primacy


Lina Khan
Former chair of the FTC
p 15% · Policy / meta · Household name · Post-ChatGPT
Antitrust primacy


Tim O'Reilly
O'Reilly Media founder; tech-publishing veteran
Policy / meta · Household name · Deep-learning rise
Antitrust primacy


Meredith Whittaker
President of Signal; co-founder of the AI Now Institute
Policy / meta · Field-leading · Deep-learning rise
AI skeptic · Antitrust primacy
Commentary-heavy
Endorsement is ≥50% commentator + external-domain
Pause
Public-square commentator · 30% of 20


Geoffrey Hinton
Godfather of deep learning; left Google in 2023 to speak about AI risk
p 10–50% · Deep technical · Household name · Pioneer
Existential primacy · Pause


Eliezer Yudkowsky
Founder of MIRI; the original AI-extinction pessimist
p 95% · Deep technical · Household name · Symbolic era
Pause


Elon Musk
CEO of Tesla and xAI; co-founded OpenAI
p 10–20% · Commentator · Household name · Deep-learning rise
Pause · Race to aligned SI


Tristan Harris
Co-founder of the Center for Humane Technology; 'The AI Dilemma'
Policy / meta · Household name · Post-ChatGPT
Pause


Max Tegmark
Physicist; co-founder and president of the Future of Life Institute
External-domain expert · Household name · Pre-deep-learning
Pause


Emmett Shear
Former interim CEO of OpenAI; Twitch co-founder
p 5–50% · Commentator · Household name · Post-ChatGPT
Pause
+14 more
AI welfare as safety
Expert in another field · 100% of 8


Peter Singer
Princeton bioethicist; utilitarian philosopher
External-domain expert · Household name · Symbolic era
AI welfare


Daniel Dennett
Philosopher; 'Darwin's Dangerous Idea' (1942–2024)
External-domain expert · Household name · Pioneer
AI welfare


Thomas Nagel
NYU philosopher; 'What is it like to be a bat'
External-domain expert · Household name · Pioneer
AI welfare


David Chalmers
NYU philosopher of mind; 'the hard problem' originator
External-domain expert · Field-leading · Symbolic era
AI welfare


Christof Koch
Neuroscientist; Allen Institute for Brain Science
External-domain expert · Field-leading · Symbolic era
AI welfare


Patricia Churchland
UC San Diego neurophilosopher
External-domain expert · Field-leading · Symbolic era
AI welfare
+2 more
Each lever is a kind of action a strategy takes. Strategies grouped on the same lever either reinforce (same direction) or conflict (opposite direction).
Speed: how fast frontier capability advances.
Concentration: how many actors build frontier AI.
Control mechanism: how AI is kept predictable.
Institutional capacity: whether state and cross-state institutions can steer the outcome.
Resilience: how much the world tolerates AI failure.
Scope: which kinds of AI capability are allowed at all.
Action authority: who (or what) makes binding decisions.
Information flow: what gets disclosed, verified, or hidden.
Cooperation substrate: whether safety runs on AI-AI, human-AI, or human-only coordination.
Time horizon: whether safety planning looks at current systems, short-term agents, or post-ASI.
Substrate: upstream physical inputs (compute, energy, data) or downstream substrates (information integrity, literacy).
Value diversity: pluralism across AI systems' values.
Response capacity: ability to recover after AI-driven harms.
Legitimacy: democratic, religious, or civic authority for any AI path.
Culture: population competence, norms, and demand shaping.