person
Tri Dao
Princeton; Together AI; FlashAttention and Mamba
Princeton assistant professor and chief scientist at Together AI. Lead author of FlashAttention (2022) and co-author of Mamba (2023). Foundational contributor to efficient transformer training.
current: Assistant Professor of Computer Science, Princeton University; Chief Scientist, Together AI
Strategy positions
Acceleration: endorses (tentative)
Build faster; delay costs more than capability. Argues that throughput and efficiency improvements, not new architectures alone, are doing most of the heavy lifting in capability progress, and positions Together AI's open infrastructure on this thesis.
FlashAttention computes attention exactly, with no approximation, in memory linear in sequence length, by exploiting the GPU memory hierarchy. The same engineering carefulness can keep delivering capability for years.
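A minimal sketch of the idea behind that claim (an illustration, not the actual implementation: the real FlashAttention is a fused CUDA kernel, and the block size, variable names, and single-head layout here are assumptions). Keys and values are processed one tile at a time, with running max and sum statistics per query row, so the full n x n score matrix never materializes, yet the output is exact.

```python
import numpy as np

def tiled_attention(Q, K, V, block=64):
    """Exact softmax(Q K^T / sqrt(d)) V, one K/V tile at a time.

    Keeps only O(n) running statistics (row max and softmax denominator)
    instead of the full n x n score matrix.
    """
    n, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.zeros_like(V)
    row_max = np.full(n, -np.inf)   # running max of each score row
    row_sum = np.zeros(n)           # running softmax denominator
    for start in range(0, K.shape[0], block):
        Kb, Vb = K[start:start + block], V[start:start + block]
        scores = (Q @ Kb.T) * scale            # n x block tile of scores
        new_max = np.maximum(row_max, scores.max(axis=1))
        correction = np.exp(row_max - new_max)  # rescale earlier statistics
        p = np.exp(scores - new_max[:, None])   # tile's softmax numerator
        row_sum = row_sum * correction + p.sum(axis=1)
        out = out * correction[:, None] + p @ Vb
        row_max = new_max
    return out / row_sum[:, None]

# Sanity check against the naive quadratic-memory formula.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((256, 32)) for _ in range(3))
naive = np.exp((Q @ K.T) / np.sqrt(32))
naive = (naive / naive.sum(axis=1, keepdims=True)) @ V
assert np.allclose(tiled_attention(Q, K, V), naive)
```

The final assertion compares against the naive formula that materializes the full score matrix; the two agree to floating-point precision, which is the "no approximation" point.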
Closest strategy neighbours (by Jaccard overlap)
Other people whose strategy tags overlap with Tri Dao's. Overlap is on tag identity, not stance; opposites can show up if they reference the same tags.
Record last updated 2026-04-25.