person
Yuntao Bai
Anthropic; Constitutional AI co-author
Anthropic researcher; co-lead author of the Constitutional AI paper, which introduced principles-based training and harmlessness from AI feedback (RLAIF).
current Member of Technical Staff, Anthropic
Strategy positions
Constitutional AI: endorses
Principles-based training for value alignment
Argues that principles-based training, where models are trained against an explicit constitution by another AI, scales human oversight in a way RLHF alone does not.
“We propose Constitutional AI: a method for training a harmless AI assistant through self-improvement, without any human labels identifying harmful outputs.”
Closest strategy neighbours
by Jaccard overlap
Other people whose strategy tags overlap with Yuntao Bai's. Overlap is on tag identity, not stance; opposites can show up if they reference the same tags.
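As a rough illustration of how such an overlap score could be computed, the sketch below applies the standard Jaccard index (shared tags divided by all distinct tags) to two tag sets. The tag names and the `jaccard` helper are hypothetical and are not taken from this record; the site's actual implementation may differ.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index: |A & B| / |A | B|. Returns 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical tag sets. Stance (endorses vs. opposes) is deliberately not
# part of the comparison, so people with opposite views can still score high.
person_a = {"constitutional-ai", "rlhf"}
person_b = {"constitutional-ai", "interpretability"}

score = jaccard(person_a, person_b)  # 1 shared tag out of 3 distinct tags
```

Because only tag identity enters the sets, two people who reference the same tags receive the same score regardless of whether they agree.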
Record last updated 2026-04-25.