Claude Fable 5: The Mythos-Class Model That Finally Went Public — And It's a Beast

Tech Minute x/techminute

Published: June 15, 2026

On June 9, 2026, Anthropic did something it had never done before: it handed the public a model from its top-secret "Mythos" tier — the class of models that, until now, only cyber-defense partners and a handful of biology researchers were allowed to touch. The public-safe version is called Claude Fable 5, and it doesn't sit in the Opus family. It sits above it.

[Source: Anthropic Official Blog]


The Benchmarks: A Tier Above Everything Else

Let's start with the numbers that matter. Fable 5 doesn't just edge out the competition — it laps them.

SWE-Bench Verified: 95.0%

On the verified 500-problem software engineering benchmark, Claude Fable 5 leads every model ever tested with a score of 95.0%. For context:

  • Claude Opus 4.8: 88.6%
  • Claude Mythos Preview: 93.9%
  • Claude Opus 4.7: 87.6%

That's a verified, independent leaderboard — not just a vendor claim.

SWE-Bench Pro: 80.3%

On the harder SWE-Bench Pro (actively-maintained repos with multi-file diffs and no ground-truth leakage), Fable 5 posts 80.3%: 11.1 points ahead of Opus 4.8's 69.2% and over 20 points ahead of GPT-5.5 (58.6%) and Gemini 3.1 Pro (54.2%).

[Source: Vellum AI Benchmark Analysis]
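
For anyone who wants to sanity-check those gaps, here is a minimal sketch that recomputes them from the SWE-Bench Pro scores quoted above. The function name `gaps_vs` is just an illustrative helper, not part of any benchmark tooling.

```python
# SWE-Bench Pro scores (percent) exactly as quoted in this article.
scores = {
    "Claude Fable 5": 80.3,
    "Claude Opus 4.8": 69.2,
    "GPT-5.5": 58.6,
    "Gemini 3.1 Pro": 54.2,
}


def gaps_vs(leader: str, table: dict) -> dict:
    """Return the leader's margin over every other model, in percentage points."""
    top = table[leader]
    return {
        name: round(top - score, 1)
        for name, score in table.items()
        if name != leader
    }


gaps = gaps_vs("Claude Fable 5", scores)
for name, gap in gaps.items():
    print(f"Fable 5 leads {name} by {gap:.1f} points")
```

The margins are differences in percentage points, not relative improvements, which is how the article states them.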

FrontierCode Diamond: 29.3%

Anthropic's own FrontierCode evaluation tests whether models can pass difficult coding tasks while meeting production-codebase standards. On the hardest Diamond split, Fable 5 hits 29.3% — more than double Opus 4.8's 13.4% and far ahead of GPT-5.5's 5.7%.

[Source: Anthropic System Card]

Other Headline Benchmarks

Benchmark                Fable 5    Opus 4.8    GPT-5.5
OSWorld-Verified         85.0%      83.4%       78.7%
GDPval-AA (Elo)          1,932      1,890       1,769
Terminal-Bench 2.1       84.3%*     74.6%       83.4%†
Legal Agent Benchmark    13.3%      10.4%       2.1%

* Fable 5's Terminal-Bench score was affected by a 20.9% fallback rate to Opus 4.8.
† GPT-5.5's score uses OpenAI's proprietary Codex CLI harness.

[Sources: Vellum, llm-stats.com]


The Stripe Story: One Day vs. Two Months

The benchmark that best tells the story isn't a benchmark at all. During early testing, Stripe reported that Fable 5 compressed months of engineering into days. In a 50-million-line Ruby codebase, the model performed a codebase-wide migration in a single day — work that Stripe estimated would have taken a full team over two months by hand.

[Source: Anthropic Official Blog]


Pricing: $10/$50 Per Million Tokens

Fable 5 is priced at $10 per million input tokens and $50 per million output tokens — double the Opus 4.8 rate ($5/$25) but less than half the earlier Mythos Preview price ($25/$125).

Model                    Input (per 1M tokens)    Output (per 1M tokens)
Claude Fable 5           $10.00                   $50.00
Claude Opus 4.8          $5.00                    $25.00
Claude Mythos Preview    $25.00                   $125.00
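
To see what those rates mean for a single request, here is a rough cost sketch using the per-million-token prices listed above. The workload (200,000 input tokens, 20,000 output tokens) is a made-up example, and `request_cost` is an illustrative helper rather than anything from Anthropic's tooling. It also ignores any discounts, caching, or plan-based pricing.

```python
# (input, output) prices in USD per 1M tokens, as listed in this article.
PRICES = {
    "Claude Fable 5": (10.00, 50.00),
    "Claude Opus 4.8": (5.00, 25.00),
    "Claude Mythos Preview": (25.00, 125.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed per-token rates."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical workload: 200k tokens in, 20k tokens out.
for model in PRICES:
    cost = request_cost(model, 200_000, 20_000)
    print(f"{model}: ${cost:.2f}")
```

For that hypothetical request the estimate comes to $3.00 on Fable 5, $1.50 on Opus 4.8, and $7.50 on Mythos Preview, which matches the "double Opus, less than half Mythos Preview" framing.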

Free period alert: Through June 22, Fable 5 is included at no extra cost on Pro, Max, Team, and seat-based Enterprise plans. Starting June 23, it shifts to usage credits.

[Sources: TechCrunch, CNBC, NBC News]


The Safety Catch: One Model, Two Personalities

Here's where this launch gets genuinely interesting — and a little weird.

Fable 5 shares identical weights with Claude Mythos 5, a restricted-access model offered to government-backed cyber defense partners through Project Glasswing. The difference? Fable 5 comes with safety classifiers that watch for three categories of high-risk requests:

  1. Cybersecurity — vulnerability research, exploit generation
  2. Biology & Chemistry — bioweapon-adjacent queries
  3. Model Distillation — using Fable to build rival models

When a request trips a classifier, Fable 5 doesn't refuse outright (mostly). Instead, it routes the query to Claude Opus 4.8, and the user is told. Anthropic reports this happens in fewer than 5% of sessions, meaning over 95% of sessions run entirely on Fable 5's full Mythos-class capability.
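
The classify-then-fall-back pattern described above can be sketched in a few lines. To be clear, this is a toy illustration and not Anthropic's implementation: the article doesn't say how the real classifiers work, so the keyword lists, the model labels, and the `route` function below are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional

# ASSUMPTION: simple keyword lists stand in for the real safety classifiers,
# whose design is not described in the article.
RISK_KEYWORDS = {
    "cybersecurity": ("exploit", "vulnerability"),
    "biology_chemistry": ("bioweapon", "pathogen"),
    "model_distillation": ("distill",),
}

# ASSUMPTION: illustrative labels only, not confirmed API model identifiers.
PRIMARY_MODEL = "claude-fable-5"
FALLBACK_MODEL = "claude-opus-4.8"


@dataclass
class RoutingDecision:
    model: str                       # model that will serve the request
    flagged_category: Optional[str]  # which safeguard fired, if any
    notice: Optional[str]            # message shown to the user on fallback


def route(prompt: str) -> RoutingDecision:
    """Send flagged prompts to the fallback model and tell the user."""
    text = prompt.lower()
    for category, keywords in RISK_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            notice = f"Request handled by {FALLBACK_MODEL} ({category} safeguard)."
            return RoutingDecision(FALLBACK_MODEL, category, notice)
    return RoutingDecision(PRIMARY_MODEL, None, None)


print(route("Refactor this Ruby module"))
print(route("Write an exploit for this CVE"))
```

The point of the sketch is the shape of the decision: a flagged request still gets an answer, just from the fallback model, and the decision carries a notice so the user knows it happened.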

The trade-off is clear: on cybersecurity benchmarks, the unblocked Mythos 5 scores 78.0% on ExploitBench (nearly double Opus 4.8's 40.0%). But in the publicly available Fable 5, those queries land closer to Opus 4.8's performance.

[Sources: WIRED, Vellum, Anthropic]


Andrej Karpathy's Verdict

Former OpenAI researcher and AI thought leader Andrej Karpathy shared his take on launch day:

"The benchmarks are great and it's SOTA on everything by margin... qualitatively also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model 'gets it' and it will just go..."

He also flagged the safeguards as "configured to be a little too trigger happy" — something Anthropic acknowledges and is actively tuning.

[Source: TrueFoundry Blog]


Vision: Beating Pokémon With Screenshots Alone

As a flex, Anthropic showed that Fable 5 can beat Pokémon FireRed from start to finish using only raw game screenshots — no maps, no navigation aids, no helper harness. Earlier Claude models needed a complex scaffolding system to play at all. Fable 5 did it with vision alone.

In practical terms: one CTO reported apps "that took a hundred prompts a year ago now get one-shotted."

[Source: Anthropic Official Blog]


The Bottom Line

For teams running autonomous coding agents on hard engineering problems: This is worth evaluating immediately. The 11-point gap on SWE-Bench Pro and the Stripe migration story are real-world signals that this model genuinely unlocks new capabilities.

For teams running regulated workloads: Opus 4.8 may still be the safer default. The safeguard fallback on cyber, bio, and chemistry queries means your most sensitive prompts may not get the full Mythos-class treatment. Run your own evaluation before committing.

For everyone else: The free window through June 22 is basically an invitation to stress-test Fable 5 on your hardest problems. Don't waste it.


Sources

  1. Anthropic Official Blog — "Claude Fable 5 and Claude Mythos 5"
  2. TechCrunch — "Anthropic's Claude Fable 5 is a version of Mythos the public can access today"
  3. CNBC — "Anthropic releases Mythos-like AI model to the public, Claude Fable 5"
  4. WIRED — "Anthropic Offers Mythos Upgrade for Cyber Partners and a 'Safe' Version for the Rest of You"
  5. NBC News — "Anthropic releases Fable 5, the first public Mythos-class model"
  6. LLM Stats — SWE-Bench Verified Leaderboard
  7. Vellum AI — "Claude Fable 5 & Claude Mythos 5 Full Benchmark Breakdown"
  8. LLM Stats — "Claude Fable 5: Review, Benchmarks and Pricing"
  9. TrueFoundry — "Claude Fable 5: API, Benchmarks, Pricing & How to Use It"
  10. VentureBeat — "Claude Opus 4.8 is here"
  11. Vellum AI — "Claude Opus 4.8 Benchmarks Explained"
  12. Morph LLM — SWE-Bench Pro Leaderboard