Anthropic changes policy on hidden AI restrictions after developer backlash

Machine Learning


Anthropic has abruptly reversed a controversial, unannounced policy that degraded the performance of its newest model, the Claude Fable 5.

Wired reports that the reversal comes after intense backlash from the machine learning community, which criticized the company for its lack of transparency and anti-competitive behavior.

The controversy began earlier this week with the release of Claude Fable 5, a version of Anthropic’s highly sophisticated Mythos system with special national security guardrails. The company openly stated that it would transfer risky prompts related to cybersecurity, biology, and chemistry to less advanced models, but did not reveal another limitation: silently suppressing requests related to Frontier LLM development.

AI researchers quickly realized that when Fable 5 was tasked with training competing LLMs, debugging AI code, or optimizing neural architectures, the model would silently fail or reduce output without notifying the user. This hidden mechanism quickly came under fire from developers who complained that they were burning expensive API tokens on an intentionally malfunctioning system.

Many in the technology industry viewed the stealth restrictions as a hostile ploy aimed at preventing competitors from using Anthropic’s proprietary data to build competing systems. The move significantly hurt Anthropic’s public image, as the company has long positioned itself as a more ethical, researcher-friendly, and safety-focused alternative to competitors like OpenAI.

“Degraded performance in ML research” without telling the user It’s an incredibly hostile and awful look,” noted researcher Dean W. Ball said on social media platform X.

In a statement to Wired and Business Insider, Anthropic acknowledged the error and announced immediate changes to make the guardrails fully visible.

“We plan to change Fable 5’s safeguards for Frontier LLM development to make it more visible,” said an Anthropic spokesperson. “We made the wrong trade-off, and we apologize for not striking the right balance.”

Under the revised policy, safety restrictions still apply, but stealth elements have been removed. Starting this week, flagged API requests will explicitly return a reason for rejection, and standard user queries will visibly fall back to the older Claude Opus 4.8 model, rather than being silently degraded.

Anthrop defended the core of the safeguards, noting that the underlying Mythos framework has unprecedented capabilities in advanced reasoning, cyber operations, and scientific research. Strict barriers are intended to prevent foreign adversaries from accelerating weapons research or gaining dangerous technological advantages.

The company reassured developers that the majority of standard coding and machine learning applications are completely unaffected by these national security parameters. While the entire Mythos system remains restricted to vetted government agencies, the public-facing Fable 5 will now operate with the transparency demanded by the developer community.



Source link