strategy tag

Agent foundations.

Rework the foundations of decision theory and embedded agency before behavioral training can be trusted.

also known as: embedded agency

People on the record.

Abram Demski

MIRI researcher; embedded agency

endorses

Argues that 'embedded' reasoning, where an agent is a physical computation inside its world rather than a Cartesian observer, is the root of unresolved alignment problems.

Classical decision theory implicitly assumes a Cartesian boundary between agent and environment. Embedded agents lack this boundary, and that is where most alignment problems hide.

✍ blog · Embedded Agency · AI Alignment Forum · 2018 · faithful paraphrase

Scott Garrabrant

MIRI researcher; logical induction, Cartesian frames

endorses

Argues robust alignment requires foundational work on what agency, embeddedness, and self-reference even mean before behavioral training methods can be trusted.

Logical inductors approximate well-calibrated probabilistic reasoning about logical statements: over time, no efficiently computable trader can keep extracting money from the market.

§ paper · Logical Induction · MIRI · 2016 · faithful paraphrase
