Anthropic expands public access to Claude Mythos AI models


Anthropic expects Mythos-level models to be widely available within 6 to 12 months

Matthew J. Schwartz (euroinfosec) • May 26, 2026


More companies will now have access to Claude Mythos, a bug-hunting artificial intelligence model that Anthropic has touted as too dangerous to release to the public.


Anthropic restricts access to the Mythos large language model through Project Glasswing. The project includes approximately 50 carefully selected partners, including technology giants Cisco, Oracle, and Microsoft (see: Anthropic Claims New Model Is Too Dangerous to Release).

“We are working with key partners, including the U.S. and allied governments, to expand Project Glasswing to more partners, and look forward to making the Mythos-class model available through general release once we develop the much stronger protections needed in the near future,” the company said Friday.

What makes Mythos notable is that the LLM not only discovers software vulnerabilities at an unprecedented scale, but in some cases can chain lower-severity flaws into a higher-severity threat, producing a working exploit chain. These capabilities led Anthropic to restrict access, giving some of the world’s largest software vendors and open source projects a head start on finding and patching flaws before threat actors gain the ability to do the same, such as through other emerging LLMs (see: Zero Day for the Masses: Foreshadowing of the Tsunami Myth).

The company said it was only a matter of time before attackers had comparable tools. “We believe Mythos-level models will be widely available within the next 6-12 months,” Anthropic said Friday.

Few of the flaws discovered by Mythos have been fully detailed publicly, and even then they are usually only credited in the release notes for new CVEs.

“The disclosed vulnerabilities are a lagging indicator that the frontier of cyber capabilities for AI models is accelerating. We are not yet at the point where we can use Mythos Preview to fully detail our partner findings without putting end users at risk,” Anthropic said.

Anthropic said its partner organizations have collectively discovered “more than 10,000 high- or critical-severity vulnerabilities across the world’s most systemically critical software.” Of these flaws, 6,202 were present in open source software.

Anthropic said a third-party review of 1,752 vulnerabilities rated high or critical found that 91% were verified as genuine flaws, and about two-thirds were confirmed to have the correct severity rating.

For now, the real-world threat posed by Mythos remains unknown. The U.K. AI Security Institute reported last month that the performance of frontier models, including Mythos, is improving rapidly. “In supervised evaluations where Mythos Preview was explicitly directed and given network access to do so, we observed that Mythos Preview was able to perform multi-stage attacks against vulnerable networks and autonomously discover and exploit vulnerabilities, a task that would take a human expert several days,” the institute found.

However, the institute did not simulate a typical corporate environment with extensive cybersecurity defenses or active defenders. “This means we cannot say with certainty whether Mythos Preview is capable of attacking well-defended systems,” the institute reported.

New benchmarks

The direction seems clear: the ability of frontier LLMs to exploit vulnerabilities and orchestrate attacks is rapidly improving.

Academics are developing new benchmarks to measure the ability of LLMs to exploit vulnerabilities. These include ExploitBench, from Carnegie Mellon University and Bugcrowd, and ExploitGym, developed by the University of California, Berkeley, the Max Planck Institute for Security and Privacy, the University of California, Santa Barbara, and Arizona State University, with input from security researchers at Anthropic, Google, and OpenAI.

Glasswing participant Cloudflare said one of its main lessons from Mythos so far is that using the model inside a vulnerability discovery harness, rather than as a “chat interface,” and working in more bite-sized pieces has led to much better results.

Cloudflare’s approach is to issue very specific instructions, use a second agent to check the work of the first agent, and put questions about different parts of the attack chain to many different agents in parallel.

“Rather than asking a single agent exhaustive questions, coverage is improved when many agents work on a narrow range of questions and later deduplicate results,” Cloudflare CSO Grant Bourzikas said in a May 18 blog post.
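The workflow Cloudflare describes can be illustrated with a minimal Python sketch. This is not Cloudflare's or Anthropic's tooling: `finder_agent`, `checker_agent`, `run_harness` and the canned findings are hypothetical stand-ins, assumed here only to show the shape of the approach (narrow questions asked in parallel, results deduplicated, a second agent checking the first).

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """A candidate vulnerability; frozen so it is hashable and can be deduplicated."""
    file: str
    line: int
    issue: str


# Canned answers standing in for real model output (assumption for illustration).
CANNED = {
    "parser: bounds checks": [Finding("parser.c", 42, "out-of-bounds read")],
    "parser: integer handling": [
        Finding("parser.c", 42, "out-of-bounds read"),  # overlaps with the first agent
        Finding("parser.c", 88, "integer overflow"),
    ],
    "auth: session handling": [Finding("auth.c", 7, "style nit")],
}


def finder_agent(question: str) -> list[Finding]:
    """Hypothetical first agent: answers one narrow question about the code."""
    return CANNED.get(question, [])


def checker_agent(finding: Finding) -> bool:
    """Hypothetical second agent: rejects findings with no security impact."""
    return finding.issue != "style nit"


def run_harness(questions: list[str], max_workers: int = 4) -> list[Finding]:
    # Fan out: each agent gets one narrow question, all asked in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        batches = list(pool.map(finder_agent, questions))
    # Deduplicate overlapping findings reported by different agents.
    unique = {finding for batch in batches for finding in batch}
    ordered = sorted(unique, key=lambda f: (f.file, f.line, f.issue))
    # A second agent checks the first agents' work before anything is reported.
    return [finding for finding in ordered if checker_agent(finding)]


if __name__ == "__main__":
    for finding in run_harness(list(CANNED)):
        print(f"{finding.file}:{finding.line} {finding.issue}")
```

In this sketch the duplicate report is collapsed before verification, so the checking agent reviews each distinct finding only once. A real harness would replace the stubs with model calls and would need a looser notion of "duplicate" than exact equality.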

Anthropic said the new tools it will provide to vetted security teams will include “custom instructions for repetitive tasks” such as these, developed with Project Glasswing participants, as well as “harnesses that help Claude map the codebase, launch scanning subagents, prioritize results, and generate reports.” The tools also include a “threat model builder” that scans the codebase to identify likely attack targets, helping researchers prioritize their efforts.

Cloudflare said one of the benefits of working with Mythos is a “clear improvement” in the LLM’s ability to chain vulnerabilities into proof-of-concept exploits, which reduces the time needed to remediate flaws. “The findings from a PoC are actionable findings, which means we spend a lot less time asking, ‘Is this really true?’” Bourzikas said.

Big questions remain. Code maintainers have already reported a flood of heavily overlapping bug reports from AI-assisted vulnerability researchers who are using the same LLMs.

Microsoft-owned GitHub tweaked its bug bounty program earlier this month in response to a “surge in posts that demonstrate no real-world security impact” and too often do not include proof-of-concept exploits. “This is not unique to GitHub; programs across the industry are grappling with the same challenges, and some have stopped altogether,” the company said.

The company said it is also seeing an increase in “legitimate reports,” for which it still offers cash rewards. But for “submissions that have not been shown to have a significant security impact but result in code or documentation modifications,” future rewards will only include “GitHub goods.”

Another challenge is speed. As more vulnerabilities are discovered and reported and software developers issue more patches, end-user organizations must test and deploy these fixes. Vulnerability management experts said there are theoretical limits to how quickly this can be done, at least in an effective manner.

Given these challenges, Cloudflare said that “what the architecture around the vulnerability should look like” remains an open question.

“The principle is that even if a bug exists, it should be difficult for attackers to exploit, so that the gap between when a vulnerability is disclosed and when it is patched becomes less significant,” Bourzikas said. What that will actually look like is still being developed.




