
Claude Mythos achieved an 82% score on the terminal bench, an increase from the previous 65%.

other
Videos: 1
Confidence: 100%
First Seen: 4/10/2026
Last Seen: 4/10/2026
partially true

AI Fact-Check

Multiple tech blogs and analyses from April 2026 confirm the benchmark scores for Anthropic's Claude Mythos Preview. One report explicitly states, "Mythos scores 82.0% on Terminal-Bench 2.0, compared to... Opus 4.6's 65.4%," which supports the first part of the claim. However, although Mythos improved significantly on the SWE-bench benchmark, its scores did not nearly double. For instance, its SWE-bench Verified score was 93.9% against the previous model's 80.8%, a substantial increase but far short of twofold.

Context: The claim refers to Claude Mythos Preview, an unreleased AI model from Anthropic announced in early April 2026. Because of its advanced capabilities in discovering and exploiting software vulnerabilities, Anthropic has limited access to a consortium of tech companies and security organizations under an initiative called Project Glasswing.
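The score comparisons in the fact-check can be verified with simple arithmetic. The sketch below is illustrative only: the `compare` helper and the `scores` dictionary are hypothetical names introduced here, and the four figures are the ones quoted in the fact-check (65.4% and 82.0% on Terminal-Bench 2.0, 80.8% and 93.9% on SWE-bench Verified). It computes the percentage-point gain and the ratio of new score to old score for each benchmark.

```python
def compare(old: float, new: float) -> tuple[float, float]:
    """Return (percentage-point gain, ratio of new score to old score)."""
    return new - old, new / old


# Scores quoted in the fact-check: (previous model, Mythos Preview).
scores = {
    "Terminal-Bench 2.0": (65.4, 82.0),
    "SWE-bench Verified": (80.8, 93.9),
}

results = {name: compare(old, new) for name, (old, new) in scores.items()}

for name, (gain, ratio) in results.items():
    print(f"{name}: +{gain:.1f} points, {ratio:.2f}x the earlier score")

# Terminal-Bench 2.0: +16.6 points, 1.25x the earlier score
# SWE-bench Verified: +13.1 points, 1.16x the earlier score
```

Both ratios are well below 2, which is consistent with the finding that the quoted scores rose substantially but did not come close to doubling.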

Source Videos (1)

Claude Mythos and the end of software

Theo - t3․gg

5:13