← Back to Stack Digest

Claude Mythos and the Model Anthropic Won't Release

Anthropic says Claude Mythos Preview has already found thousands of high-severity vulnerabilities, outperformed its previous frontier models on cyber benchmarks, and crossed a capability threshold serious enough that the company will not release it generally. Instead, it is ring-fencing access through Project Glasswing.

By Stack Digest 7 min read

Cybersecurity lock interface representing high-stakes defensive AI

Cybersecurity has become the first major proving ground for Anthropic's most powerful unreleased model. Photo: Fly:D / Unsplash

On 7 April 2026, Anthropic announced Project Glasswing, a new cross-industry initiative built around an unreleased model called Claude Mythos Preview. The launch was unusual not because Anthropic claimed the model was stronger than Claude Opus 4.6. That has become routine in frontier AI. It was unusual because the company made the opposite of a standard product announcement: Mythos is powerful enough that Anthropic says it will not make the model generally available.

That decision matters. In consumer AI, the default pattern has been simple: ship the new model, widen access, then worry about downstream effects. Anthropic is making a different argument here. It says Mythos is a general-purpose frontier model whose coding and agentic capabilities have become strong enough that the same system that can autonomously patch complex software can also autonomously find and exploit serious flaws inside it.

"AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities."

This Is Not a Normal Model Launch

Anthropic describes Mythos Preview as a general-purpose, unreleased frontier model. That framing is important. The company is not presenting it as a narrow cyber tool trained specifically to break software. Rather, it says the model's cyber strengths are downstream consequences of broader gains in reasoning, coding, and autonomy. In other words, Mythos appears dangerous not because Anthropic built a hacker model, but because frontier models are now becoming capable enough that offensive cyber behavior emerges from general intelligence improvements.

Project Glasswing is Anthropic's answer to that problem. The initiative brings together Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks, along with more than 40 additional organizations that build or maintain critical software infrastructure. Anthropic has committed up to $100 million in usage credits and $4 million in donations to open-source security organizations to support the effort.

The core idea is simple: if Mythos-class models are going to exist, defenders need access before attackers get equivalent capability by default.

The Numbers Behind the Warning

Anthropic's public benchmark results make clear why the company is treating this as a threshold event. On CyberGym, its vulnerability reproduction benchmark, Mythos Preview scored 83.1 per cent against 66.6 per cent for Claude Opus 4.6. On coding evaluations, the margins were similarly striking: 77.8 per cent on SWE-bench Pro versus 53.4 per cent for Opus 4.6, and 82.0 per cent on Terminal-Bench 2.0 versus 65.4 per cent.

But the more arresting evidence comes from the red-team write-up Anthropic published alongside the announcement. There, the company says Mythos Preview has already found thousands of high-severity vulnerabilities, including flaws in every major operating system and every major web browser. Among the examples Anthropic chose to describe publicly were a 27-year-old OpenBSD bug, a 16-year-old FFmpeg vulnerability, and a chain of Linux kernel bugs that allowed escalation from ordinary user access to complete machine control.

Anthropic also says Mythos Preview is not merely better at spotting bugs. It is significantly better at turning them into usable exploits. In one internal benchmark derived from patched Firefox vulnerabilities, the company says Opus 4.6 produced working JavaScript shell exploits only twice across several hundred attempts, while Mythos Preview did so 181 times and achieved register control 29 more times.

In another internal evaluation across roughly a thousand OSS-Fuzz repositories, Anthropic says Mythos Preview generated 595 crashes at tiers one and two, additional crashes at tiers three and four, and achieved full control-flow hijack on ten fully patched targets. That is the sort of sentence that changes the temperature of the entire AI safety conversation.

Why Anthropic Is Ring-Fencing Access

Anthropic's public position is not that it can prevent Mythos-class capabilities from spreading forever. It is that the transition period matters. The company says it does not plan to make Mythos Preview generally available, and that its eventual goal is to let users deploy Mythos-class systems safely once stronger safeguards are in place. Until then, access is being limited to vetted defenders through Glasswing and a gated research preview.

Commercially, the structure is also revealing. Anthropic says Mythos Preview will be available to Glasswing participants at $25 per million input tokens and $125 per million output tokens, accessible through the Claude API as well as Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry. That is not consumer pricing, and it is not mass-market positioning. It is enterprise-grade scarcity, aimed at organizations with national-security or infrastructure-level stakes in software security.

Anthropic has also signaled that the model is shaping its wider product roadmap. The company says it plans to launch new safeguards with an upcoming Claude Opus model so it can refine safety systems on a model that does not pose the same level of risk as Mythos Preview. That is a notable admission: Anthropic appears to be treating Mythos not as the next product in line, but as the internal benchmark that forces the rest of the Claude stack to harden.

The Bigger Shift

The most important part of the Mythos announcement may be what it implies about the AI market beyond Anthropic. If the company's claims are even directionally correct, cyber capability is no longer a niche subfield of AI progress. It is becoming one of the clearest real-world expressions of frontier model power.

That has two consequences. First, the familiar race to build the most capable model increasingly overlaps with the race to secure critical infrastructure. Second, the central policy question changes. The issue is no longer simply whether a lab can build a model that writes better code than humans. It is whether that lab can keep a model from turning latent bugs in the global software stack into an industrial-scale attack surface.

Anthropic is trying to get ahead of that reality by placing Mythos inside a controlled defensive coalition before releasing anything broader. That may prove prudent, or insufficient, or both. But it is a meaningful break from the industry's usual logic of maximum distribution first and governance later.

In that sense, Claude Mythos Preview is not just another model announcement. It is a warning shot. The frontier is no longer only about who can build the smartest chatbot. It is about who can manage the most dangerous byproducts of general capability once those systems can read, patch, and exploit the software the modern world runs on.

← Back to Homepage Read the OpenAI Story