strategy tag
Agent foundations.
Reformulate decision theory and embedded agency before behavioural training can be trusted
also known as: embedded agency
People on the record.
Abram Demski
MIRI researcher; embedded agency
Argues that 'embedded' reasoning, where an agent is a physical computation inside its world rather than a Cartesian observer, is the root of unresolved alignment problems.
Classical decision theory implicitly assumes a Cartesian boundary between agent and environment. Embedded agents lack this boundary, and that is where most alignment problems hide.
Scott Garrabrant
MIRI researcher; logical induction, Cartesian frames
Argues robust alignment requires foundational work on what agency, embeddedness, and self-reference even mean before behavioral training methods can be trusted.
Logical inductors approximate well-calibrated probabilistic reasoning over logical statements: over time, no efficiently computable trader can extract unbounded money from the market.