Anthropic ships Mythos 5: the frontier moves up a tier
Claude Fable 5 lands as the public version of Mythos: top of every benchmark, double the price, and built to run for half a day on its own.
On 9 June 2026, Anthropic released two new versions of Claude. Mythos 5 is the most capable model the company has ever built. Fable 5 is the same model wrapped in extra safety filters, sold to anyone with a credit card. Mythos itself stays gated to vetted enterprise customers under a programme called Project Glasswing. The pitch is simple: this is the first model from the “Mythos” generation, a step above OpenAI’s GPT-5.5 and Anthropic’s own Opus 4.8, and it can run for hours on a single instruction, writing code, doing research and checking its own work. It costs twice as much per token as Opus 4.8. Anthropic also launched Managed Agents, a cloud service that runs these long-running tasks on Anthropic’s infrastructure. For enterprise AI buyers, the release sets up the most significant procurement decision of the summer.
Ethan Mollick, the Wharton professor who has spent two years documenting how knowledge workers actually use AI, opened his laptop in early June with early access to Fable 5 and an unusually open-ended brief. Build an isochrone map of Philadelphia commute times. Draft a piece of academic social-science research from a single prompt. Build a small browser game. Then, the test that has become his signature: hand it a multi-page specification and walk away. Twelve hours later, in an experience Mollick placed somewhere between “delightful and unnerving,” the model returned a finished artefact, with sub-agents dispatched, tests written, and outputs visually checked against the original goal. “It represents a very real leap over every model I have used before,” he wrote in One Useful Thing. Alberto Romero, writing in The Algorithmic Bridge, was blunter: by the numbers, Mythos 5 is the best AI model in the world.
The launch itself was choreographed. Anthropic’s blog post went live on 9 June alongside a 319-page system card. Dario Amodei, who in April had withheld the Mythos Preview because of cybersecurity concerns, framed the public release as a controlled retreat from caution: capabilities, he wrote, “exceed those of every model we have previously made generally available.”
The structure is the news. Mythos 5 (raw, expensive, enterprise-only) is the underlying intelligence. Fable 5 is Mythos with four classifier layers bolted on top: cyber, bio/chem, distillation, and a special block on requests related to frontier model training. When a classifier fires, the response is silently routed to Opus 4.8 instead. Anthropic says this happens in fewer than five percent of sessions.
Pricing tells the rest of the story. Fable 5 and Mythos 5 both cost $10 per million input tokens and $50 per million output tokens. That is exactly double Opus 4.8 but, notably, well below the Mythos Preview pricing that leaked earlier this spring at $80 per million output tokens.
The model is available on Anthropic’s API, Claude Code, GitHub Copilot, AWS Bedrock, Google Cloud Vertex and Microsoft Foundry from day one. Stripe used it to migrate a 50-million-line Ruby codebase “in a day,” a number that landed in the press kit because it is the kind of claim CFOs remember. Not by accident, the launch came bundled with Managed Agents, a sandboxed runtime priced at $0.08 per active session-hour on top of token costs — Anthropic’s answer to OpenAI’s own agent infrastructure and, more quietly, to its own customers who have started running Claude in fifteen-hour loops without supervision.
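The pricing above is simple enough to turn into a back-of-the-envelope bill. The sketch below uses only the per-token and per-session-hour prices quoted in this article; the workload (a 12-hour run consuming 40 million input tokens and 4 million output tokens) is a hypothetical assumption for illustration, not a figure from Anthropic, and `run_cost` is an illustrative helper, not part of any Anthropic SDK.

```python
# Prices as quoted in the article, in USD.
INPUT_PER_MTOK = 10.00    # Fable 5 / Mythos 5, per million input tokens
OUTPUT_PER_MTOK = 50.00   # per million output tokens
SESSION_HOUR_FEE = 0.08   # Managed Agents, per active session-hour


def run_cost(input_tokens: int, output_tokens: int, session_hours: float = 0.0) -> dict:
    """Split the cost of one agent run into token spend and runtime fee."""
    tokens = (input_tokens / 1_000_000) * INPUT_PER_MTOK
    tokens += (output_tokens / 1_000_000) * OUTPUT_PER_MTOK
    runtime = session_hours * SESSION_HOUR_FEE
    return {"tokens": tokens, "runtime": runtime, "total": tokens + runtime}


# Hypothetical workload (assumed, not from the article):
# a 12-hour run using 40M input tokens and 4M output tokens.
bill = run_cost(40_000_000, 4_000_000, session_hours=12)
print(bill)
```

Under these assumed volumes the session-hour fee comes to less than a dollar against roughly $600 of token spend, which suggests the token line, not the runtime fee, is what a budget owner would need to watch.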
The benchmark sheet is the most lopsided Anthropic has shipped. On SWE-Bench Pro, the hardened version of the standard software-engineering test, Fable 5 scores 80.3 percent. Opus 4.8 sits at 69.2. GPT-5.5 trails at 58.6. On Cognition’s FrontierCode Diamond benchmark, which scores agentic coding for quality and maintainability rather than just passing tests, Fable 5 hits 29.3 percent against Opus at 13.4 and GPT-5.5 at 5.7. That is a five-times multiplier over OpenAI’s frontier model on the test that arguably matters most for enterprise deployment. On Hex’s analytical benchmark, Fable 5 became the first model to clear 90 percent. On GDPval-AA, a measure designed to approximate the economic value of model output, it scores 1932; Opus 4.8 manages 1890. On GDPpdf, a vision benchmark over real corporate documents, Fable 5 scores 29.8 percent without tools, against Opus at 22.5.
The historical comparison is the one worth holding in mind. When GPT-4 launched in March 2023, it cleared roughly 50 percent on the original SWE-Bench, a result that read at the time as miraculous. Three years later, Fable 5 clears 80 percent on the harder version while running unattended overnight on a 1-million-token context window with up to 128k tokens of output. The trajectory has not flattened. It has, by Anthropic’s preferred metric, accelerated.
Token intensity is the design choice underneath. Mythos-class models are explicitly built to think for longer; that is what justifies the price and what makes them economically irrational for short queries. Romero’s point in The Algorithmic Bridge, that you cannot taste the difference without spending real money on long, hard tasks, is now Anthropic’s pricing strategy translated into customer education.
The catch is the gap between Mythos and Fable. Buyers of the public, classifier-wrapped version are paying frontier money for a model whose responses, in up to one in twenty sessions, are routed to a smaller predecessor. For most consumer use, that is invisible.
For a CIO procuring AI for a regulated workflow — pharma R&D, security operations, defence integration — it matters which version produced which answer, and the audit trail is not yet standardised. Mythos itself remains gated to Project Glasswing partners: cybersecurity teams, government defenders, and a handful of named research labs. The two-tier structure is the new operating reality. Anthropic has stopped pretending its most capable model is for everyone, and it has stopped pretending the safety tax is free.
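To see why a fallback rate of under five percent still matters for audit purposes, it helps to ask how likely a multi-session workflow is to contain at least one rerouted session. The sketch below is a simplification under two loud assumptions: it takes five percent as an upper bound on the per-session rate, and it treats sessions as independent, which real classifier triggers (driven by what is actually being asked) almost certainly are not. The function name is illustrative.

```python
def p_any_fallback(sessions: int, fallback_rate: float = 0.05) -> float:
    """Probability that at least one of `sessions` is rerouted.

    Assumes independent sessions, each rerouted with probability
    `fallback_rate` (0.05 is the article's quoted upper bound).
    """
    if sessions < 0:
        raise ValueError("sessions must be non-negative")
    if not 0.0 <= fallback_rate <= 1.0:
        raise ValueError("fallback_rate must be between 0 and 1")
    return 1.0 - (1.0 - fallback_rate) ** sessions


for n in (1, 20, 100):
    print(f"{n:>3} sessions: {p_any_fallback(n):.3f}")
```

Under these assumptions, a workflow spanning 20 sessions has about a 64 percent chance of containing at least one answer produced by the fallback model, which is why per-response provenance, rather than an aggregate rate, is the thing an auditor would ask for.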
For DAX40 CIOs already standardised on Claude, the procurement question is no longer whether to upgrade but which tier to buy. Fable 5 at $10/$50 is roughly double Opus 4.8 token-for-token; on long agentic tasks that previously required three Opus runs to converge, internal benchmarks Anthropic shared with launch partners suggest Fable resolves in one. The economics flip in Fable’s favour above a certain task complexity threshold — a calculation that finance teams will need to model rather than assume. Managed Agents at $0.08 per session-hour finally gives procurement a stable line item for autonomous workloads, replacing the unpredictable token bills that have made finance functions allergic to agent pilots. Stripe’s 50-million-line Ruby migration is the reference customer to cite in board decks; the open question is whether regulated industries — pharma, banking, insurance — can run a model with a five-percent silent fallback rate to a smaller predecessor without breaching their model-risk-management frameworks.
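The break-even logic can be modelled in a few lines. The sketch below normalises one Opus 4.8 run to a cost of 1.0 and uses the two-to-one price ratio quoted in this article. The `fable_token_ratio` parameter is a hypothetical knob for how many more tokens a single Fable run consumes than a single Opus run; the article gives no figure for it, so finance teams would need to measure it on their own workloads.

```python
PRICE_RATIO = 2.0  # Fable 5 per-token price relative to Opus 4.8 (from the article)


def relative_costs(opus_runs: float, fable_token_ratio: float = 1.0) -> tuple:
    """Return (opus_cost, fable_cost), with one Opus run normalised to 1.0.

    opus_runs:         Opus runs needed before the task converges
    fable_token_ratio: tokens in one Fable run / tokens in one Opus run
                       (hypothetical; must be measured per workload)
    """
    opus_cost = float(opus_runs)
    fable_cost = PRICE_RATIO * fable_token_ratio
    return opus_cost, fable_cost


def break_even_opus_runs(fable_token_ratio: float = 1.0) -> float:
    """Number of Opus runs above which a single Fable run is cheaper."""
    return PRICE_RATIO * fable_token_ratio


# The article's scenario: three Opus runs versus one Fable run.
opus, fable = relative_costs(opus_runs=3)
print(f"Opus: {opus:.1f}  Fable: {fable:.1f}  break-even: {break_even_opus_runs():.1f} runs")
```

With equal token usage per run, Fable wins as soon as a task needs more than two Opus runs. If a Fable run consumes 1.5 times the tokens, the three-run scenario becomes a tie, which is the sensitivity the "model rather than assume" advice is pointing at.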
Brussels has been watching Mythos since the Preview was withheld in April. The European Commission and at least three member-state regulators, including Ireland, opened consultations with Anthropic over critical-infrastructure exposure before Fable 5 even shipped. Under the EU AI Act’s general-purpose AI provisions, which begin biting for high-risk deployments from August 2026, Fable 5 will almost certainly meet the systemic-risk threshold: training compute above 10^25 FLOPs triggers the presumption of systemic risk, and Mythos-class is several rungs above that line. The two-tier Mythos/Fable structure is itself a regulatory artefact: by routing cyber, bio and chem queries to Opus 4.8, Anthropic can argue it has not made novel offensive capability generally available. Whether that argument survives a German BSI audit, or French ANSSI scrutiny, is the test case the next twelve months will produce. Anthropic has not publicly stated an ASL tier for Mythos itself, a notable gap given its own Responsible Scaling Policy.
The Mythos release reorders the application layer. Startups that have spent the past year fine-tuning Opus or GPT-5 for vertical workflows now face a model that, in Mollick’s testing, completed multi-hour research projects from a single prompt with no scaffolding. The defensive moat shifts from prompt engineering to what Romero calls “loop engineering” — orchestration, sub-agent management, evaluation, and the unglamorous infrastructure of letting a model run for twelve hours without burning $4,000 in tokens. Expect a wave of Series A rounds for agent-observability and cost-control startups. The losers are the thin GPT wrappers whose value proposition was a clean UI on top of a smart model; Fable 5 plus Claude Code plus Managed Agents collapses that stack into one Anthropic SKU. Gary Marcus, predictably, has called the safety framing “a protection racket,” arguing Anthropic walked back the April panic the moment commercialisation became feasible. The capital, regardless, is moving.
Sources
- [1] Anthropic, "Claude Fable 5 and Claude Mythos 5"
- [2] Ethan Mollick, "What it feels like to work with Mythos" (One Useful Thing)
- [3] Alberto Romero, "Nine Things About Claude Mythos 5" (The Algorithmic Bridge)
- [4] Lenny Rachitsky, "Claude Fable 5 review: what the new Mythos model gets right (and very wrong)"
- [5] VentureBeat, "Anthropic brings Mythos to the masses with Claude Fable 5"
- [6] TechCrunch, "Claude Fable 5 is a version of Mythos the public can access today"
- [7] CNBC, "Anthropic releases Mythos-like AI model to the public, Claude Fable 5"
- [8] Gary Marcus, "Claude Mythos, evaluated" (Marcus on AI)
- [9] CyberScoop, "Anthropic's new model is Mythos on a leash"
- [10] Silicon Republic, "Can EU AI Act actually regulate models like Mythos?"
- [11] Finout, "Claude Fable 5 and Mythos 5: Pricing, API Costs, and Benchmark Comparison"
- [12] InfoQ, "Anthropic's Code with Claude Announces Managed Agents"