Research shows that around 50,000 Spanish works are being used illegally to train AI. Photo credit: Emiliano Vittoriosi via Unsplash
The global spread of AI has raised many new issues that people must learn to navigate now that text, images, videos, and music can be generated from anywhere, affecting concepts such as copyright law, originality, and primitive systems. In particular, AI systems such as language models must learn from human media and texts to keep developing and improving. In other words, they need to be trained. In Spain, however, recent research has found that books, including illegally pirated copies, have been used to train AI.
According to a representative of Cedro, Spain's intellectual property management entity, the organization has found that approximately 50,000 books and other works by at least 41,000 authors and from 1,100 different publishers have been used to train AI.
The claim is based on a report released by the Danish Rights Alliance in September 2024, which states that Libgen, a Russian piracy website, has been used to train AI models from companies such as OpenAI (the maker of ChatGPT) and Meta, including on pirated Spanish works. Libgen was ordered to shut down and its domain has been blocked in most of Europe, but it remains accessible through mirror sites and through VPNs, which let users appear to be in another country. The data was gathered from documents and statements made public during legal battles against well-known technology companies in the United States. Companies that built their services using Libgen may therefore have committed copyright infringement.
Among the authors affected are Almudena Grandes, Arturo Pérez-Reverte, Fernando Aramburu, Dolores Redondo, Lorenzo Silva, María Dueñas, and Eduardo Mendoza. The Spanish publishers affected include Grupo Planeta, Acantilado, Anagrama, Libros del Asteroide, and parts of the RAE.
“They have been stealing from us for a long time,” said Daniel Fernández, president of the Federation of Publishers' Guilds of Spain. Fernández added that the spread of AI can be dangerous, especially for the publishing industry: if no regulations are in place, AI-generated works could displace works created by humans. Cedro director Jorge Corrales called the government's current regulations on AI “a complete flaw,” adding, “I don't think the law is as fast as technology. There's no lack of attempts here.”
Nevertheless, governments, and the EU in particular, have taken several legislative steps in response to the global spread of AI. Last May, the first AI regulation law was passed, and more may follow. However, rather than turning a blind eye to the international use of AI, it is essential to find a long-term solution for managing copyright licenses that benefits publishers and other creatives.
Cedro spokesperson Carmen Cuartero said the government is responsible for providing authors and publishers with the tools and resources to manage their own intellectual property and to counter the so-called “copying culture,” which appears to keep growing to this day.
Tags: AI, copyright infringement
