Researchers outraged by Anthropic’s hidden AI limitations

Anthropic’s powerful new model will intentionally become useless if it detects that you are engaged in AI research, according to a technology disclosure that has already sparked controversy across the industry.

In its Mythos 5 and Fable 5 system cards released Tuesday, Anthropic said the models have limited usefulness for tasks related to developing frontier large-scale language models.

The company said the action stems from concerns that advanced AI systems could accelerate the development of competing models that do not have equivalent safety protections.

Unlike safeguards used for cybersecurity, biological and chemical-related risks, these interventions are intentionally invisible to users, Anthropic said. Rather than denying the request or switching to a different model, Mythos may subtly change its response through techniques such as changing the user prompt.

The move was quickly criticized Tuesday by some AI experts, particularly over the idea that Anthropic had designed the model to intentionally withhold information or provide low-quality assistance without the user’s knowledge.

“Anthropic’s latest model won’t help you if your ML research or ML engineering sounds interesting and/or if it secretly lowers your IQ without the average engineer noticing,” AI research firm SemiAnaracy wrote in X on Tuesday about machine learning, a type of AI.

“We are already seeing moderation of Anthropic’s latest models filtering into our GPU inference research and programming,” the company added.

Elie Bakush, an AI model training expert at startup Prime Intellect, wrote in X that “mythos will be intentionally bad for AI ‘frontier LLM research’ tasks. This is very sad for the research community.” “Also the fact that this is intentionally invisible to the user is crazy.”

Another AI developer wrote, “Not only is it useless, it will lie and give you intentionally bad information.” “An ‘ethical AI’ company that intentionally introduced the most brazenly unethical LLM.”

Mikel Artetxe, co-founder of AI startup Reka, posted that Anthropic’s move is similar to big tech companies interfering with users’ work. “Apple will randomly restart your Mac if you’re developing competing technology, Gmail will silently edit your email if you mention a competing platform, and Tesla Autopilot will change direction if it detects you’re working on self-driving cars.”

Anthropic did not respond to a request for comment from Business Insider.

This adds further fuel to the heated debate over why Anthropic didn’t release the Mythos sooner when they announced this model earlier this year.

There are three main theories:

Formal reason: Anthropic held back Mythos because it was too dangerous and needed to give cybersecurity researchers time to prepare for the new model.
Computing theory: Mythos is a large and expensive model to run. Anthropic didn’t have enough compute to fully release it. Since then, the company has signed new huge computing deals, which may have contributed to Tuesday’s release of Fable 5 and Mythos 5.
Competition theory: AI companies are increasingly concerned about something called distillation. Once a cutting-edge model is released, rivals can collect its output and use the data to improve their own systems. Anthropic may want to keep its best features out of the hands of competitors, especially open source rivals and fast-moving Chinese AI labs, for as long as possible.

Now that Anthropic has incorporated these AI research limitations into the official announcement of Mythos, this third theory has more credence.

Sign up for BI’s Tech Memo newsletter here. Please contact us by email. abarr@businessinsider.com.