Hiltzik: AI companies face a billion-dollar copyright invoice

The artificial intelligence camp loves big numbers. The sum OpenAI raised in its latest funding round: $40 billion. Projected investment in AI this year by Meta, Amazon, Alphabet and Microsoft: $320 billion. Market value of Nvidia Corp., the chip supplier for AI companies: $4.2 trillion.

AI advocates take all of these numbers as validating the promise and potential of the new technology. But here is a number that points in the opposite direction: $1.05 trillion.

That is how much the AI company Anthropic could be on the hook for if a jury decides that it willfully pirated six million copyrighted books in the process of “training” its AI bot Claude, and if the jury decides to hit it with the maximum statutory damages of $150,000 per work.

Anthropic faces at least the possibility of business-ending liability.

– Edward Lee, Santa Clara University Law School

That places Anthropic in a “legal battle for its very existence,” says Edward Lee, an intellectual property law expert at Santa Clara University Law School.

The threat arose when U.S. District Judge William Alsup certified as a class action a copyright infringement lawsuit brought against Anthropic by several published authors.

I wrote about this case last month. At the time, Alsup had rejected the plaintiffs' copyright infringement claim, finding that Anthropic's use of copyrighted material to develop its AI bot fell within a copyright exemption known as “fair use.”

However, he also found that Anthropic had downloaded copies of seven million books from online “shadow libraries.”

“We will have a trial on the pirated copies and the resulting damages,” he ominously advised Anthropic. He put flesh on those bones in a subsequent order, defining the class as the copyright owners of books Anthropic downloaded from the shadow libraries LibGen and PiLiMi. (Some of my own books have wound up in Books3, another such library, but Books3 is not part of this case and I don't know whether my books are in the other libraries.)

Class certification could significantly streamline the Anthropic litigation, which will proceed “in lieu of millions of separate cases with millions of juries,” as Alsup wrote in his original ruling.

Class certification adds another wrinkle, potentially a major one, to the ongoing legal disputes over the use of published works to “train” AI systems. The process involves feeding those systems enormous quantities of published material. Some of it is scraped from the web. Some is drawn from digitized libraries that contain both copyrighted content and material in the public domain.

The goal is to give AI bots enough data that they can pick up patterns in language, which they can then reflect back when asked a question.

Authors, musicians and artists have filed numerous lawsuits claiming that this process infringes their copyrights, because in most cases they have neither given permission nor been compensated for the use.

One of the latest cases was filed last month in federal court in New York by authors including Kai Bird, a co-author of “American Prometheus,” the book that became the authorized source for the film “Oppenheimer.” The suit accuses Microsoft of downloading “about 200,000 pirated books” from Books3 to train its own AI bot, Megatron.

As in many of the other copyright cases, Bird and his fellow plaintiffs contend that the company could have trained Megatron on works in the public domain or on works obtained under license. “But either of them will take longer and cost more than the options Microsoft has chosen,” the plaintiffs state: training its bots “without permission and compensation as if there were no laws protecting copyrighted works.”

I asked Microsoft for a response, but I haven't received a reply.

Among the judges who have considered the issue, the tide appears to be building in favor of viewing the training process as fair use. Indeed, Alsup himself reached that conclusion in the Anthropic case, ruling that the use of the downloaded material for AI training was fair use. But he also heard evidence that Anthropic had kept the downloaded material for other purposes. That is not fair use, he found, and it exposes Anthropic to accusations of copyright infringement.

Not only was Alsup's ruling unusual, it was also “Solomonic.” His finding of fair use delivered a “partial victory” for Anthropic, but his finding of possible copyright infringement put Anthropic in “a very difficult place,” says Lee. That is because the financial penalties for copyright infringement can be gargantuan. They range from $750 to $150,000 per work, with the upper figure available when a jury finds that the infringer engaged in deliberate infringement.

According to the lawsuit, Anthropic may have downloaded as many as 7 million works, but an undetermined number of those works may be duplicated across the two shadow libraries the company used, or duplicated among copyrighted works the company actually paid for. The number of works will not be known until at least September 1, the deadline Alsup has set for the plaintiffs to provide a list of all the allegedly infringed works downloaded from the shadow libraries.

If, after duplicates are subtracted, the total number of individual infringed works comes to 7 million, a bill of $150,000 per work would total $1.05 trillion. That would swamp Anthropic financially. The company's annual revenue is estimated at about $3 billion, and its private market value is estimated at about $100 billion.
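For readers who want to check the arithmetic, the exposure can be worked out in a few lines of Python. This is only a sketch built on the figures cited above: the 7 million count is the upper estimate before duplicates are removed, and the function and variable names are illustrative, not drawn from any court filing.

```python
# Statutory damages range per infringed work, as cited in this column.
MIN_PER_WORK = 750        # dollars, low end of the range
MAX_PER_WORK = 150_000    # dollars, high end (deliberate infringement)


def damages_exposure(num_works: int, per_work: int) -> int:
    """Return total statutory damages in dollars for num_works at per_work each."""
    if not MIN_PER_WORK <= per_work <= MAX_PER_WORK:
        raise ValueError("per-work award is outside the cited statutory range")
    return num_works * per_work


# Assumption: 7 million distinct works, the upper estimate before de-duplication.
WORKS = 7_000_000

ceiling = damages_exposure(WORKS, MAX_PER_WORK)
floor = damages_exposure(WORKS, MIN_PER_WORK)

print(f"Maximum exposure: ${ceiling:,}")  # $1,050,000,000,000
print(f"Minimum exposure: ${floor:,}")    # $5,250,000,000
```

On these assumptions, even an award at the $750 minimum would come to $5.25 billion, more than the company's estimated annual revenue. The real figure depends on how many works remain once duplicates are removed.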

“In fact,” Lee wrote on his blog, “ChatGPT Is Eating the World,” class certification means “Anthropic faces at least the possibility of business-ending liability.”

Anthropic did not reply to my request for comment on its outlook. However, in a motion asking Alsup either to send his ruling to the 9th Circuit Court of Appeals or to reconsider his own finding, the company pointed to the blow his position would deal to the AI industry.

If his position were widely adopted, Anthropic said, then “training by companies that downloaded works from third-party websites such as LibGen and Books3 could constitute copyright infringement.”

That was an implicit acknowledgment that the use of shadow libraries is widespread in the AI camp. It was also a suggestion that, because it was the shadow libraries that committed the copyright infringement, the AI companies that used them should not be punished.

Anthropic also pointed out in its motion that the plaintiffs in the case had not raised this copyright infringement issue themselves. Alsup came up with it on his own, by treating the training of AI bots and the creation of a research library as two separate uses, the former permitted under fair use and the latter not. That deprived the company of the opportunity to respond to the theory in court.

The company observed that Vince Chhabria, a fellow federal judge in Alsup's San Francisco court, reached a contradictory conclusion just two days after Alsup, absolving Meta Platforms of a copyright infringement claim on similar facts, based on the fair use exemption.

Alsup's class certification could shake up both the plaintiff and defendant camps in the ongoing disputes over AI development. Plaintiffs who have not made a comparable infringement claim in their lawsuits may be encouraged to add one. Defendants will come under great pressure to head off lawsuits by scrambling to reach licensing deals with writers, musicians and artists. That will be especially so if another judge accepts Alsup's argument about copyright infringement. “It might encourage other litigation,” says Lee.

For Anthropic, the challenge is to “try to convince the jury that the damages award should be $750 per work,” says Lee. With class certification, Alsup's ruling has made this one of the rare cases where “plaintiffs have the upper hand.” “All these companies will be under great pressure to negotiate settlements with the plaintiffs. Otherwise they are at the mercy of the jury, and you can't bank on anything in terms of what the jury will do.”


