Agentic AI, artificial intelligence and machine learning, governance and risk management
Anthropic myth leak points to pattern of failure and sloppy practices at AI lab
Rashmi Ramesh (@rashmiramesh_) •
April 1, 2026

Anthropic has spent years establishing itself as the company that cares most seriously about what’s at stake in artificial intelligence. But over the past two weeks, the AI company accidentally announced its own new product and then handed detailed blueprints for its most widely deployed tool to a competitor.
Neither incident involved an adversary. The first was a misconfigured content management system, and the second was a debug file that wasn’t meant to be shipped. In Anthropic’s own words, both failures were human error.
What makes this a bad development for the broader AI sector is that it fits a pattern seen over the past three years. Meta’s Llama model escaped its restricted research release within a week of being shared with approved academics. Microsoft’s AI research team exposed 38 terabytes of internal data through a cloud credential with incorrect permissions. OpenAI has acknowledged that prompt injection, a class of attack that manipulates AI agents through the content they process, may never be completely contained. The mechanism is different each time, but the structure is the same: the failure occurs inside the organization, not at the perimeter, and the organizations relying on these tools get no advance notice.
This raises the standard vendor due diligence question: “Has this vendor been compromised?” But that question doesn’t paint the whole picture. None of these incidents required a breach. Data slipped through cracks that day-to-day operational management should have caught. A more useful question for security and technology leaders evaluating AI platforms is whether the vendor’s internal development and data handling processes are mature enough for what it is building and for the scale at which enterprises rely on it.
The human-error incidents arrived one after another. Last Thursday, security researchers Roy Paz of LayerX Security and Alexandre Pauwels of the University of Cambridge discovered a draft blog post announcing the unreleased model in a publicly searchable data cache, along with an internal PDF and an itinerary for an invite-only executive retreat. Anthropic’s content management system made uploaded files public by default; only an explicit manual change kept them private.
Anthropic reportedly restricted access and acknowledged that “draft content was accessible due to an issue with one of our external CMS tools,” and attributed the cause to “human error.” Nearly 3,000 undisclosed assets were exposed. Anthropic separately confirmed to Fortune that the model, internally known as Mythos, is a “step change” in AI performance and is “the most capable we’ve ever built.”
Security researcher Chaofan Shou on Tuesday spotted an anomaly in a routine update of Claude Code distributed through npm, the registry developers use to share and update software packages. The update included source map files, debugging artifacts that link compressed production code back to the original readable source. One of those files pointed to a zip archive on Anthropic’s own cloud storage containing the complete Claude Code codebase, approximately 512,000 lines across 1,900 files. Shou posted a link on X. Within hours, the code had been mirrored across GitHub and forked tens of thousands of times. Anthropic called it “a release package issue caused by human error, not a security breach.”
“It only takes one misconfigured .npmignore or files field in package.json to expose everything,” said Gabriel Anheia, a software engineer who analyzed the breach. The .npmignore file lists what packaging tools should exclude from the published package, and the files field in package.json is the allowlist of what to include. If an entry is missing or too broad, debug files ship with the product. This is the kind of check that belongs on a pre-release audit checklist, but it appears not to have been done this time.
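As an added illustration (not code from Anthropic or from any of the researchers quoted), the following Node script approximates npm’s inclusion rules for a `files` allowlist and a `.npmignore` denylist, then flags any source map that would ship and any map whose `sources` point at external storage. The package layout, the storage URL and the map contents are constructed for the example.

```javascript
// Added example (not code from Anthropic or the article's sources): a pre-publish
// audit that approximates npm's file-inclusion rules and flags debug artifacts.
// All package data below is constructed for illustration.

// Simplified pattern matcher: "*.ext" suffix wildcards and directory prefixes.
function matchesPattern(file, pattern) {
  const p = pattern.replace(/^\.?\//, "").replace(/\/$/, "");
  if (p.startsWith("*.")) return file.endsWith(p.slice(1));
  return file === p || file.startsWith(p + "/");
}

// Approximation of npm's rules: a "files" field in package.json is an allowlist;
// without it, .npmignore patterns act as a denylist. package.json, README and
// LICENSE are always shipped.
function publishedFiles(files, packageJson, npmignore) {
  const always = new Set(["package.json", "README.md", "LICENSE"]);
  return files.filter((f) => {
    if (always.has(f)) return true;
    if (Array.isArray(packageJson.files)) {
      return packageJson.files.some((p) => matchesPattern(f, p));
    }
    return !npmignore.some((p) => matchesPattern(f, p));
  });
}

// Flags every source map that would ship, and any map whose "sources" entries
// point at external storage, the chain that led to the full source tree.
function auditPackage(files, packageJson, npmignore, mapContents = {}) {
  const findings = [];
  for (const f of publishedFiles(files, packageJson, npmignore)) {
    if (!f.endsWith(".map")) continue;
    findings.push({ file: f, issue: "source map shipped" });
    const sources = (mapContents[f] && mapContents[f].sources) || [];
    for (const src of sources) {
      if (/^https?:\/\//i.test(src)) {
        findings.push({ file: f, issue: `external source reference: ${src}` });
      }
    }
  }
  return findings;
}

// Constructed sample: "files": ["dist"] is too broad and pulls in the map file;
// "files": ["dist/cli.js"] ships only the bundle.
const sampleFiles = ["package.json", "README.md", "dist/cli.js", "dist/cli.js.map", "src/index.ts"];
const broadPkg = { name: "example-cli", files: ["dist"] };
const narrowPkg = { name: "example-cli", files: ["dist/cli.js"] };
const mapData = { "dist/cli.js.map": { sources: ["https://storage.example.invalid/src.zip"] } };

console.log(auditPackage(sampleFiles, broadPkg, [], mapData));   // two findings
console.log(auditPackage(sampleFiles, narrowPkg, [], mapData));  // []
```

Running the same audit against the real tarball that `npm pack` produces (rather than a constructed file list) is the check a release pipeline would gate on.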
Paz, whose company first discovered the Mythos documentation, also examined the source code. “Typically, large companies have rigorous processes and multiple checks before code reaches production, like a safe that requires several keys to open,” he told Fortune. “At Anthropic, it appears that this process was not followed properly, and with a single misconfiguration or click, the complete source code was suddenly exposed.” His concern was not only that competitors would have access to the code, but that the leaked files could reveal non-public details about how Anthropic’s systems work, which could help attackers look for weaknesses. For organizations integrating Claude Code into their development pipelines, this is a supply chain reliability issue.
Developer Alex Kim, who published a breakdown of the code leak on the day it occurred, addressed the argument that the leak was minor because Google’s and OpenAI’s competing tools are already open source. “These companies open sourced their agent SDKs rather than fully internally wiring their flagship products,” he said. “The real damage is not the code. It’s the feature flags. KAIROS, the anti-distillation mechanism: These are product roadmap details that competitors can see and react to. The code can be refactored. Strategic surprises can’t be divulged.” In short, the internal privilege enforcement logic, agent orchestration design, and system prompts that control Claude Code’s behavior within its environment were exposed.
Not an outlier
Anthropic’s week is the latest entry in this pattern, not an outlier. In February 2023, after Meta released its Llama language model to a limited group of authorized researchers under a non-commercial license, a torrent of the model appeared on 4chan within a week, posted by someone who had been granted access. Meta filed takedown requests, but the model spread anyway. As Vice reported at the time, this was the first time a major tech company’s proprietary AI model had leaked to the public, not through an intrusion but because an authorized participant shared the file. The access control model failed.
Six months later, cloud security company Wiz revealed that Microsoft’s AI research team had exposed 38 TB of internal data while publishing open source training data on GitHub. The team shared access through an Azure storage credential, a URL-based token that, when configured correctly, restricts access to specific folders. This one instead granted full-control permissions over the entire storage account and was set to expire in 2051. The exposed data included employee workstation backups, internal passwords, private keys, and over 30,000 internal Teams messages. Wiz reported the issue to Microsoft in June 2023; Microsoft revoked the credential two days later and said no customer data had been compromised. The exposure had been live and publicly accessible for months before outside researchers found it. Microsoft’s own monitoring failed to catch it.
In December 2025, OpenAI acknowledged that attacks which embed malicious instructions in content processed by AI agents, overriding what the agents were configured to do, are a threat that cannot be completely eliminated. “Like fraud and social engineering on the web, prompt injection is unlikely to be completely resolved,” the company wrote. For companies running AI agents that interact with email, documents or web pages, this is a live exposure in production today, and one the vendor itself recognizes as persistent.
Each of these incidents has a clear technical explanation: CMS defaults, a packaging oversight, an overly permissive credential, an architectural characteristic of how language models process instructions. Viewed individually, each looks like a one-off. Taken together over three years, the world’s four most prominent AI companies show a consistent gap between the speed at which they build and ship AI products and the maturity of the processes surrounding that work.
When security or technology leaders decide which AI vendor to trust with core enterprise workflows, the key question is not whether that vendor has been compromised. It’s whether the vendor’s own pipeline reflects the same standard of care they expect from their customers.
