Meta freezes AI data work after breach that puts training secrets at risk

In short: Meta has suspended its collaboration with Mercor, a $10 billion AI data startup, after a supply chain attack exposed not only personal data but also, potentially, some of the AI industry’s most closely guarded secrets, including the training methodologies behind the world’s leading large language models. The breach was carried out via a poisoned version of the open source LiteLLM library. It has sparked investigations at OpenAI and Anthropic and led to a class action lawsuit on behalf of more than 40,000 people.

When hackers contaminated a widely used open source library last month, they didn’t just steal personal data. As Wired reports, they may have walked away with a blueprint for how to build the world’s most powerful AI models.

Meta has suspended its work with Mercor, a San Francisco-based AI data company that generates bespoke training datasets for artificial intelligence giants, after a cyberattack leaked sensitive information about how Meta, and possibly several other Mercor customers, actually train their models. The suspension is indefinite, and the incident has sent ripples of anxiety through an industry that has spent billions of dollars developing proprietary methods it hopes to keep secret.

Startup behind the curtain

Mercor may not be a household name, but it sits at a critical juncture in the AI economy. Founded in 2023 by Bay Area high school friends Brendan Foody, Adarsh Hiremath, and Surya Midha, who competed together on the Bellarmine College Preparatory speech and debate team, the company employs a network of human contractors, including engineers, lawyers, doctors, bankers, and journalists, to generate high-quality, proprietary training data for AI labs. Its clients include Meta, OpenAI, Anthropic, and Google.

The startup’s rise is astonishing even by Silicon Valley standards. In October 2025, Mercor completed a $350 million Series C round valuing the company at $10 billion, making all three founders the world’s youngest self-made billionaires at the age of 22. By September 2025, the company’s annualized revenue had reached $500 million, up from $100 million just six months earlier. Its business model, generating the fine-tuning and reinforcement learning data that AI labs rely on but rarely discuss publicly, has made it one of the most valuable private companies in the AI supply chain.

That same positioning now accounts for its vulnerability.

A poisoned package, a cascade of exposures

The attack that reached Mercor began several steps upstream. A threat actor group known as TeamPCP compromised the CI/CD pipelines for LiteLLM, an open source Python library that millions of developers use to connect applications to AI services. The library has 97 million monthly downloads and is present in an estimated 36% of cloud environments, according to analysis by Wiz, Snyk, and Datadog Security Labs.

TeamPCP had previously used a supply chain attack against Trivy, a widely used security scanner, to obtain credentials belonging to LiteLLM maintainers. On March 27, 2026, the group used these credentials to publish two malicious versions of the LiteLLM package, 1.82.7 and 1.82.8, directly to the Python Package Index (PyPI). The poisoned packages remained available for approximately 40 minutes before being identified and removed.
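For readers who want to check their own environments, the version numbers above can be turned into a quick local test. The following is a minimal sketch, not an official detection tool: the helper names are hypothetical, and it assumes the two versions named in the article are the only affected releases. It only inspects the package installed in the current Python environment.

```python
from importlib import metadata
from typing import Optional

# Versions named in the article as malicious (assumed to be the full list).
COMPROMISED_VERSIONS = {"1.82.7", "1.82.8"}


def is_compromised(version: str) -> bool:
    """Return True if the version string matches a release named in the article."""
    return version.strip() in COMPROMISED_VERSIONS


def installed_litellm_version() -> Optional[str]:
    """Return the installed litellm version, or None if it is not installed."""
    try:
        return metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_litellm_version()
    if version is None:
        print("litellm is not installed in this environment")
    elif is_compromised(version):
        print(f"WARNING: litellm {version} is one of the versions reported as malicious")
    else:
        print(f"litellm {version} is not one of the versions named in the article")
```

A clean result here says nothing about whether a compromised version was installed and later replaced, so any credentials that were present on a machine during the exposure window would still need separate review.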

The payload was sophisticated. In version 1.82.7, base64-encoded malware was embedded directly in the library’s proxy server code and executed on import. Version 1.82.8 used a malicious path configuration (.pth) file that was triggered automatically every time a Python process started. Both variants were designed to collect environment variables, API keys, SSH keys, cloud credentials for AWS, Google Cloud, and Azure, Kubernetes configurations, CI/CD secrets, and database credentials, and to exfiltrate everything to servers at models.litellm[.]cloud.
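The startup-time trigger works because Python’s standard `site` module executes any line in a path configuration file that begins with `import` followed by a space or tab. The sketch below lists such lines in the current environment’s site-packages directories so they can be reviewed by hand. It is an illustration under stated assumptions: the function name is hypothetical, and it does not reproduce or detect the specific payload described in the article.

```python
import site
from pathlib import Path


def executable_pth_lines(directory):
    """Return (path, line_number, line) for .pth lines Python would execute at startup."""
    hits = []
    for pth in sorted(Path(directory).glob("*.pth")):
        try:
            text = pth.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: skip rather than crash the scan
        for number, line in enumerate(text.splitlines(), start=1):
            # The site module executes lines starting with "import" plus space or tab.
            if line.startswith(("import ", "import\t")):
                hits.append((pth, number, line))
    return hits


if __name__ == "__main__":
    directories = list(site.getsitepackages()) + [site.getusersitepackages()]
    for directory in directories:
        for pth, number, line in executable_pth_lines(directory):
            # Truncate long lines so an encoded payload does not flood the terminal.
            print(f"{pth}:{number}: {line[:120]}")
```

Some legitimate tools also ship executable .pth lines, so a hit is a prompt for manual inspection rather than proof of compromise.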

Mercor acknowledged that it was “one of thousands” of companies affected by the attack, but later discovered that the breach had exposed approximately 4 terabytes of data. The stolen cache includes 939 gigabytes of platform source code, 211 gigabytes of user databases, and approximately 3 terabytes of video interview recordings and identification documents, according to court filings and claims by the hacking groups involved. The exposed information could include the names and Social Security numbers of more than 40,000 current and former Mercor contractors and customers.

The most important secret

A leak of personal data would be serious enough. But it is an entirely different category of information that has alarmed Meta and drawn the attention of other AI labs.

Because Mercor sits in the data pipelines of multiple AI companies simultaneously, the breach could have exposed details of the data selection criteria, labeling protocols, and training strategies that those companies have spent years and billions of dollars developing. A competitor can duplicate a dataset. Training methodologies are harder to replicate, and that makes them a true competitive moat. The Wired report notes that the scale of the potential exposure has prompted multiple AI labs to investigate what exactly was exposed.

OpenAI, which also uses Mercor’s services, said it is investigating the incident but has not suspended any current projects with the company. Anthropic, which raised $3 billion in early 2026 and has been aggressively expanding its research infrastructure, has not publicly commented on its exposure. Google, which maintains similar relationships with rival data vendors, is also understood to be assessing the scope of the breach.

The incident illustrates a structural risk that the AI industry has rarely had to face before. When multiple competitors rely on the same third-party data supplier, a single breach can expose all of their secrets at once.

Blackmail and legal fallout

The threat group Lapsus$, previously responsible for high-profile attacks against major companies, subsequently claimed responsibility for the Mercor breach and began auctioning stolen data on dark web forums. Security researchers believe Lapsus$ is acting in concert with TeamPCP, which has emerged as a systemic threat across the AI and enterprise software ecosystems. The same group is believed to be behind a wave of supply chain breaches: the earlier Trivy attack affected more than 1,000 enterprise SaaS environments, including a breach at the European Commission that CERT-EU attributed to the same campaign.

On April 1, 2026, plaintiff Lisa Gil of Wahiawa, Hawaii, filed a class action lawsuit against Mercor.io Corp. in the United States District Court for the Northern District of California. The lawsuit alleges that Mercor failed to maintain adequate cybersecurity protections, exposing more than 40,000 people to identity theft and fraud. The complaint states that the March 27 LiteLLM incident was the entry point, and that Mercor’s reliance on compromised open source dependencies without adequate oversight created the conditions for the breach.

Meta, for its part, has said nothing publicly, and its silence speaks volumes. The company signed a $27 billion AI infrastructure deal with Nebius Group in March 2026, and its annual capital expenditures are expected to be between $115 billion and $135 billion, making its AI training pipeline one of its most strategically sensitive assets. A decision to suspend a relationship with a data vendor, even a critical one, is made only when the risk to proprietary methodology outweighs the operational cost of stopping work.

AI supply chain warning

The Mercor breach is, in some ways, a traditional supply chain attack. Attackers discovered weak links in open source dependencies and used them to steal credentials and leak data. In other ways, it is newer and more disturbing. The AI industry has built its most valuable intellectual property on an interconnected web of data vendors, open source tools, and shared infrastructure that now constitutes an attack surface that no single company has complete control over.

Security companies have been warning about precisely this dynamic. Aikido Security, which achieved unicorn status in January 2026, built its business on the premise that open source dependency risk pervades enterprise software. The Mercor incident suggests that the same logic applies, perhaps more acutely, to AI training pipelines.

For the three young founders who built one of the technology industry’s fastest-growing companies, the coming months will test whether Mercor’s extraordinary momentum can withstand a breach that exposed not only users’ data but also its customers’ most carefully guarded secrets. The AI industry’s dramatic expansion in 2025 was built on the premise that the infrastructure supporting it is secure enough to be trusted. That assumption is now under review.
