A group of authors has accused Microsoft of using nearly 200,000 pirated books to train an artificial intelligence model. It is the latest allegation in a long legal battle over copyrighted works between creative professionals and technology companies.
Kai Bird, Jia Tolentino, Daniel Okrent and several other authors allege that Microsoft used pirated digital versions of their books to teach its Megatron AI to respond to human prompts. Their lawsuit, filed Tuesday in federal court in New York, is one of several high-stakes cases brought by authors, news outlets and other copyright holders against tech companies over the alleged misuse of their work in AI training.
The authors requested a court order blocking the alleged infringement, as well as statutory damages of up to $150,000 for each work that Microsoft allegedly misused.
Generative artificial intelligence products such as Megatron produce text, music, images and videos in response to user prompts. To create these models, software engineers amass enormous databases of media and program the AI to generate similar output.
The writers allege that Microsoft used a collection of nearly 200,000 pirated books to train Megatron, an AI product that gives text responses to user prompts. According to the complaint, Microsoft used the pirated dataset to create a computer model that is not only built on the work of thousands of creators and authors, but also built to generate a wide range of expression that mimics the syntax, voice and themes of the copyrighted works on which it was trained.
A Microsoft spokesperson did not immediately respond to a request for comment on the lawsuit. An attorney for the authors declined to comment.
The complaint against Microsoft came a day after a federal judge in California ruled that Anthropic made fair use, under U.S. copyright law, of authors' material to train its AI systems, but that it may still be liable for pirating their books. It was the first U.S. decision on the legality of using copyrighted materials without permission for generative AI training. On the day the complaint against Microsoft was filed, another California judge sided with Meta in a similar dispute over the use of copyrighted books to train AI models, but he attributed his ruling more to the plaintiffs' weak arguments than to the strength of the tech giant's defense.
The legal battle over copyright and AI began soon after ChatGPT's debut and spans several types of media. The New York Times has sued OpenAI for copyright infringement over its archive of articles. Dow Jones, the parent company of the Wall Street Journal and the New York Post, has filed a similar lawsuit against Perplexity AI. Major record labels are suing companies that make AI music generators. Photography company Getty Images has filed a lawsuit against Stability AI over the startup's text-to-image product. Last week, Disney and NBCUniversal sued Midjourney, which offers a popular AI image generator, alleging misuse of some of the world's most famous film and television characters.
Tech companies have argued that they make fair use of copyrighted material to create new, transformative content, and that being forced to pay copyright holders for their work could hobble the burgeoning AI industry. OpenAI CEO Sam Altman has said that creating ChatGPT would have been “impossible” without the use of copyrighted works.
