person

Karthik Narasimhan

Princeton; reasoning, NLP

Assistant professor of computer science at Princeton. Formerly a researcher at OpenAI, before its scaling era. Focuses on language understanding and reasoning, and led work on the SWE-bench benchmark.

current Assistant Professor of Computer Science, Princeton University

homepage @karthik_r_n

Strategy positions

Evals-driven · endorses

Capability/risk evals gate deployment; evals are the load-bearing artefact

Argues evaluations grounded in real-world software engineering tasks reveal capability and safety properties that synthetic benchmarks miss.

SWE-bench evaluates language models in a realistic software engineering setting: resolving real GitHub issues from real codebases. Performance here is closer to deployment reality than performance on synthetic tasks.

§ paper · SWE-bench: Can Language Models Resolve Real-World GitHub Issues? · arXiv · 2023-10 · faithful paraphrase

Closest strategy neighbours

by Jaccard overlap

Other people whose strategy tags overlap with Karthik Narasimhan's. Overlap is on tag identity, not stance; opposites can show up if they reference the same tags.

Aleksander Mądry
shared 1 · J=1.00
MIT; ex-OpenAI head of preparedness
Alex Meinke
shared 1 · J=1.00
Apollo Research; deceptive alignment evaluations
Ali Rahimi
shared 1 · J=1.00
Google Brain ML researcher; 'Alchemy' speech
Anna Rogers
shared 1 · J=1.00
IT University of Copenhagen; LLM benchmarking critique
Arati Prabhakar
shared 1 · J=1.00
White House OSTP director (2022–2025)
Beth Barnes
shared 1 · J=1.00
Founder of METR; dangerous capability evaluations
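
The "shared" and "J" figures in the list above can be illustrated with a minimal sketch. It assumes the standard Jaccard definition, |A ∩ B| / |A ∪ B|, applied to sets of tag identifiers with stance left out; the record does not show how the site actually computes the score, and the tag slug `evals-driven` and the function name `jaccard` are hypothetical.

```python
def jaccard(tags_a: set[str], tags_b: set[str]) -> float:
    """Jaccard overlap |A ∩ B| / |A ∪ B|; returns 0.0 when both sets are empty."""
    union = tags_a | tags_b
    if not union:
        return 0.0
    return len(tags_a & tags_b) / len(union)


# Hypothetical tag sets. Stance (endorses, opposes) is deliberately not part
# of the set, so two people with opposite stances on one tag still match.
narasimhan = {"evals-driven"}
neighbour = {"evals-driven"}

shared = len(narasimhan & neighbour)
score = jaccard(narasimhan, neighbour)
print(f"shared {shared} · J={score:.2f}")  # shared 1 · J=1.00
```

Under this assumption, J=1.00 with one shared tag means both people carry exactly that one tag. A neighbour with additional tags would score below 1.00 even with the same shared count.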

Record last updated 2026-04-25.