Content creators are alleging that their videos have been used without their permission by tech giants including Apple, Nvidia, and Anthropic to train artificial intelligence (AI) systems.
In response to those lawsuits, defendants including Meta, OpenAI, and Bloomberg have argued that their actions constitute fair use. Notably, the plaintiffs voluntarily dropped their claims against EleutherAI, the organization that originally scraped and published the books.
The remaining lawsuits are still in their early stages, leaving questions about permission and compensation unresolved. The Pile dataset has been removed from its official download site, but it can still be accessed through file-sharing services.
“Tech companies have acted with impunity, and people are right to be concerned that they don't have a say in this issue. That's the crux of the issue,” said Amy Keller, a consumer protection attorney and partner at DiCello Levitt.
Creators now face an uncertain future: professional YouTubers already monitor for unauthorized use of their content and frequently issue takedown notices, but many fear that AI could soon generate content similar to theirs, or even copy it outright.
David Pakman, creator of “The David Pakman Show,” recently encountered the power of AI while scrolling through TikTok. He stumbled across a video labeled as a Tucker Carlson clip, but on closer inspection it turned out to be a replica of something Pakman himself had said on his YouTube show: the voice, the words, everything was the same. What troubled Pakman was that only one of the commenters on the video realized it was fake, a cloned Carlson voice reading Pakman's script.
“This is going to be a problem,” Pakman warned in a YouTube video addressing the issue. “Basically, anyone can copy it.”
EleutherAI co-founder Sid Black has said that he created “YouTube Subtitles” using a script that downloads captions from YouTube's API, just as a viewer's browser does when watching a video.
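Black's actual script isn't reproduced here, but the approach he describes, pulling a video's caption track from the same endpoint a browser requests when captions are turned on, can be sketched roughly as follows. The timedtext URL, the language parameter, the example video ID, and the fetch_captions helper are illustrative assumptions rather than details taken from his code, and YouTube may block or change such requests at any time.

```python
# Rough sketch of the caption-fetching approach described above.
# Assumptions (not from Black's script): the public "timedtext" endpoint
# that browsers have historically requested captions from, a fixed
# language code, and an example video ID.
import requests
import xml.etree.ElementTree as ET


def fetch_captions(video_id: str, lang: str = "en") -> str:
    """Download a video's caption track and return it as plain text."""
    resp = requests.get(
        "https://www.youtube.com/api/timedtext",
        params={"v": video_id, "lang": lang},
        timeout=30,
    )
    resp.raise_for_status()
    if not resp.text.strip():
        return ""  # no published caption track for this language
    root = ET.fromstring(resp.text)  # captions come back as XML <text> nodes
    return " ".join((node.text or "").strip() for node in root.iter("text"))


if __name__ == "__main__":
    # Hypothetical usage: print the English captions for one video ID.
    print(fetch_captions("dQw4w9WgXcQ"))
```

Run in a loop over video IDs gathered from search results, a script along these lines is enough to accumulate a large transcript corpus, which is why the terms-of-service question discussed below matters.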
Black's search terms, 495 in total, included phrases such as “funny vloggers,” “Einstein,” “black protestant,” “protective social services,” “Infowars,” “quantum chromodynamics,” “Ben Shapiro,” “Uyghurs,” “fruitarian,” “cake recipes,” “Nazca lines” and “flat Earth.”
YouTube's terms of service prohibit accessing its videos by “automated means,” yet more than 2,000 GitHub users have starred the code.
Jonas Depoix, the machine learning engineer who published the code on GitHub, noted that “if YouTube wanted to, there were ways to prevent this module from working,” but that this has not happened.
Google spokesperson Jack Malon said the company has taken steps over the years to prevent unauthorized scraping, but he declined to address whether other companies had used such material to train their AI models.
The dataset also includes 146 videos from the channel “Einstein Parrot,” which has around 150,000 subscribers.
The parrot's caretaker, Marsha (who asked that her last name not be published out of concern for the bird's safety), found it amusing that AI models had absorbed the words of a bird that itself mimics human speech.