AI side lever
Control mechanism
How AI is kept predictable.
AI containment
Control mechanism ↑ Useful AI does not require unrestricted actuation; strong capability in a contained system is safer than limited capability in an uncontained one.
AI for safety
Control mechanism ↑ The same capability that makes AI dangerous makes it uniquely useful for automating alignment research and oversight.
Alignment first
Control mechanism ↑ Technical alignment is solvable before critical capability thresholds close, and aligned systems compose safely into aligned populations.
Counter AI AI
Control mechanism ↑ AI attacks happen at speeds humans cannot observe; defence must therefore also run on AI, with guardian AI systems continuously evaluating adversary AI.
Interpretability first
Control mechanism ↑ Mechanistic understanding is a precondition for reliable oversight; behavioural evaluation without interpretability cannot rule out deceptive alignment.
Safe by construction AI
Control mechanism ↑ Safety is a property that can be mathematically specified and mechanically verified for the class of systems being built.
• Neutral
Confucian role ethics
Control mechanism • Western alignment assumes isolable preferences can be learned and matched; role ethics instead evaluates behaviour by its fit with position and relationship, yielding a less brittle, more context-sensitive standard.
Dharma conformity
Control mechanism • Alignment frames AI as a tool serving an external principal; a dharma frame treats AI as a kind of entity whose safety consists in conformity to its fitting functions.
Reframe AI
Control mechanism • The dominant alignment frame produces the wrong problem statement; switching frames either dissolves the problem or recasts it as tractable.