Anthropic’s safety warning may have backfired — the government shut down its most powerful AI

The U.S. government on Friday ordered Anthropic to immediately block access to two of its most powerful AI models, Claude Fable 5 and Claude Mythos 5, citing national security concerns. Anthropic announced that it had complied with X, but the government has made clear that it believes this was a mistake.

The directive, which Anthropic announced Friday at 5:21 p.m. ET, forces the company to disable both models for all users around the world, not just foreigners who are nominally targeted by the government’s export control order. Access to Anthropic’s other models is not affected.

Why does this matter? Mythos is Anthropic’s most capable AI model, the one the company previewed in early April, but has since been severely limited by what Anthropic described as its extraordinary ability to discover security vulnerabilities in software. According to Anthropic, Mythos identified the flaw in every major operating system and web browser it tested, so rather than publicly disclosing it, it launched a moderated program called Project Glasswing to share with about 50 vetted organizations, including Amazon, Apple, Google, Microsoft, and CrowdStrike, to use for defensive cybersecurity work.

Released just three days ago, Fable 5 was Anthropic’s answer to obvious commercial pressures. The company claimed it was a version of Mythos with guardrails to block responses in high-risk areas such as cybersecurity and biology, making it secure enough for general release. Benchmark tests from Vals AI, a company that tracks the performance of AI technology, quickly found it to be the highest performing AI model available to the public.

The government directive is part of export control measures and restricts foreigners’ access to the models. However, Anthropic said in a lengthy blog post that it is its understanding that the underlying concern is the alleged Fable 5 jailbreak. So far, the government has provided only verbal evidence of a “potential limited and non-universal jailbreak,” the company said. As Anthropic explains, it prompts the model to read a specific codebase to identify flaws in the software. Incidentally, the company adds that this is a “level of functionality” that is already widely available in other publicly accessible models, such as OpenAI’s GPT-5.5. Anthropic says cybersecurity experts also routinely use it for defensive purposes.

Anthropic’s broader argument is that its strongest safeguards work through an independent classifier system that operates separately from the model itself, meaning that even if someone were to convince Fable to keep talking over the rejection, the fundamental protection against the most dangerous outputs would remain.

Clearly, none of this is enough to stop the government from taking action, and Anthropic has made no secret of its displeasure. “We do not agree that the discovery of a narrow jailbreak possibility should be cause for a recall of a commercial model that has been deployed to hundreds of millions of people,” the company wrote. “If this standard were applied industry-wide, we believe it would effectively halt the rollout of all new models to all Frontier model providers.”

Anthropic is widely expected to seek an IPO this year, betting much of its public identity on being a safety-focused alternative to rivals. The irony is lost on observers that Anthropic’s extreme caution in restricting Mythos, which it promoted as a model too dangerous to release publicly, now appears to be inviting the very government scrutiny that could most disrupt its business.

OpenAI’s Sam Altman must at least be having fun with this. In April, he told podcaster Ashley Vance that Anthropic’s response to Mythos amounted to “fear-based marketing.” “It’s obviously incredible marketing to say, ‘We built a bomb, we were about to drop it on your head, and we’re going to sell you a bomb shelter for $100 million,'” Altman said. Altman, whose company is also widely expected to seek an IPO as soon as possible, did not predict a government shutdown, but noted that so far it has come back to hurt Anthropic. That means that if you spend months telling the world that your AI is uniquely dangerous, the world, including the US government, is more likely to listen.

If you buy through links in our articles, we may earn a small commission. This does not affect editorial independence.

Source link

binance registrering commented on Global Industrial Automation Services Market Size to Reach: Your point of view caught my eye and was very inte
binance commented on WestMetric Defends Controversial On-Page SEO Services for the Era of AI: I don't think the title of your article matches th
创建个人账户 commented on AI in CMO Strategy: Transforming Marketing Leadership: Can you be more specific about the content of your
binance account creation commented on The rise of Artificial Intelligence in Film & TV: Thank you for your sharing. I am worried that I la
最佳gate io推荐代码 commented on Building more cyber-resilient satellites begins with a strong network: Can you be more specific about the content of your

Anthropic’s safety warning may have backfired — the government shut down its most powerful AI

RECENT POSTS

Anthropic disables Fable 5 and Mythos 5 following US export control order

ChatSee.ai raises $6.5 million led by True Ventures to tackle the growing problem of AI agent failure

Only 1 in 4 employees feel fully prepared to use AI

Related Posts