When tech companies first rolled out their generative-AI products, some critics immediately feared a media collapse. Every piece of writing, every image, and every video became suspect. But for news publishers and journalists, another disaster was on the horizon.
Chatbots have proved adept at keeping users locked into conversations. They do so by answering every question, often by summarizing articles from news publishers. Suddenly, fewer people are traveling outside the generative-AI sites, a development that poses an existential threat to the media and to the livelihood of journalists everywhere.
According to one comprehensive study, Google's AI Overviews (a feature that summarizes web pages above the site's regular search results) has reduced traffic to external websites by more than 34 percent. The CEO of Dotdash Meredith, which publishes People, Better Homes & Gardens, and Food & Wine, recently said the company is preparing for a possible “Google Zero” scenario. Some speculate that traffic declines resulting from chatbots are part of the reason that outlets such as Business Insider and the Daily Dot have recently had layoffs. “Business Insider […],” one former staff member told the media reporter Oliver Darcy.
Not all publishers are at equal risk. Those that rely primarily on general-interest readers arriving from search engines and social media may be in worse shape than specialized publishers with dedicated subscribers. But no one is completely safe. AI Overviews, released in May 2024, joined ChatGPT, Claude, Grok, Perplexity, and other AI-powered products that together have replaced search for more than 25 percent of Americans, according to one survey. As my previous reporting has shown, companies train chatbots on huge numbers of stolen books and articles, and they scrape news articles to generate responses with up-to-date information. Large language models also train on abundant public-domain material, but much of what is most useful to these models, especially as users seek real-time information from chatbots, is news that sits behind a paywall. Publishers are creating the value, while AI companies are intercepting their audiences, subscription fees, and advertising revenue.
I asked Anthropic, xAI, Perplexity, Google, and OpenAI about this issue. Anthropic and xAI did not respond. Perplexity did not directly comment on the issue. Google claimed it was sending “high quality” traffic to publishers' websites, meaning that users supposedly spend more time on a site once they click through, but it declined to provide data to support this claim. OpenAI pointed me to an article showing that ChatGPT is sending more traffic to websites overall than it did before, but the raw numbers are rather modest. The BBC, for example, reportedly received 118,000 visits from ChatGPT in April, which is practically nothing compared with the hundreds of millions of visitors it receives each month. The article also shows that traffic from ChatGPT has actually declined for some publishers.
Over the past few months, I have spoken with several news publishers, all of whom see AI as a near-term existential threat to their business. Rich Caccappolo, the vice chair of media at the company that publishes the Daily Mail, the U.K.'s largest newspaper by circulation, told me that all publishers “can see that Overviews can unravel the traffic obtained from search and undermine a key foundational pillar of the digital revenue model.” AI companies have claimed that chatbots will keep sending readers to news publishers, but they have not cited any evidence to support this claim. I asked Caccappolo whether he thought AI-generated answers could put his company out of business. “It's absolutely terrifying,” he told me. “And my concern is that it won't happen in three or five years. I joke that it'll happen next Tuesday.”
Book publishers, particularly publishers of nonfiction and textbooks, also told me they expect a significant drop in sales, because chatbots can summarize books and give detailed explanations of their contents. Publishers have tried to fight back, but my conversations revealed how heavily the deck is stacked against them. The world is changing rapidly, perhaps irreversibly. The institutions that make up our country's free press are fighting for their survival.
Publishers have responded in two ways. First: legal action. At least 12 lawsuits involving more than 20 publishers have been filed against AI companies. Their outcomes are far from certain, and the cases may be decided only after irreparable damage has been done.
The second response is to make deals with AI companies, allowing their products to summarize articles or train on editorial content. Some publishers, such as The Atlantic, are pursuing both strategies (the company has a corporate partnership with OpenAI and is suing Cohere). Over the past two years, at least 72 licensing deals have been made between publishers and AI companies. But figuring out how to approach these deals is not easy. Caccappolo told me he has “felt an incredible imbalance at the negotiation table.” One problem is that there is no standard price for training an LLM on a book or an article. AI companies know what kind of content they want, and because they have already demonstrated both the ability and the willingness to take it without paying, they have extraordinary leverage in negotiations. I've learned that books are sometimes licensed for only a few hundred dollars each, and that publishers who ask for too much can be turned down only to have their material taken anyway.
Another problem is that different content appears to have different value for different LLMs. The digital-media company Ziff Davis has studied web-based AI training data sets and observed that content from prominent sources, such as major newspapers and magazines, appears to be more desirable to AI companies than blog and social-media posts. (Ziff Davis is suing OpenAI for training on its articles without paying a licensing fee.) Microsoft researchers have also written publicly about the “importance of high-quality data,” suggesting that textbook-style content may be particularly desirable.
But beyond a few specific studies like these, there is little insight into what kind of content most improves an LLM, which leaves many questions unanswered. Are biographies more or less important than histories? Does high-quality fiction matter? Are old books worth anything? “A solution that can help determine the fair value of specific human-authored content within the active marketplace of LLM training data would be extremely beneficial,” Amy Brand, the director and publisher of the MIT Press, told me.
A publisher's negotiating power is also limited by the extent to which it can stop AI companies from using its work without consent. There is no sure way to prevent AI companies from scraping news websites. Even the Robots Exclusion Protocol, the standard opt-out method available to news publishers, is easily circumvented. Because AI companies generally keep their training data secret, and because there is no easy way for publishers to see which chatbots are summarizing their articles, publishers have difficulty figuring out which AI companies to sue or to strike deals with. Some experts, such as Tim O'Reilly, have suggested that laws should require the disclosure of copyrighted training data, but no existing law requires companies to reveal the specific authors or publishers whose work was used as AI training material.
Of course, all of this raises a question. AI companies appear to have taken publishers' content already. Why would they pay for it now, especially since some of these companies have argued in court that training LLMs on copyrighted books and articles is fair use?
Perhaps the deals are merely a hedge against an unfavorable court ruling. If AI companies are prevented from training on copyrighted work for free, organizations with existing publisher deals may be ahead of the competition. Publisher deals are also a way to settle without litigation, which may be a more desirable path for publishers that are risk-averse or otherwise uncertain. But the legal scholar James Grimmelmann told me that AI companies could also respond to complaints like Ziff Davis's by arguing that the deals involve more than training on a publisher's content: They may also include access to cleaner versions of articles, ongoing access to a daily or real-time feed, or a release from liability for a chatbot's plagiarism. Tech companies could argue that the money exchanged in these deals is exclusively for the non-licensing elements, and that they are therefore not paying for training material. It is worth noting that tech companies almost always call these deals partnerships rather than licensing deals, likely for this reason.
In any case, the modest income from these arrangements is not going to save publishers. Even with such a deal, one publisher told me, it would not come close to recovering the revenue lost from a decline in readership. Publishers that figure out how to survive the generative-AI assault may need to invent different business models and find new revenue streams. There may be viable strategies, but none of the publishers I spoke with has a clear idea of what they are.
Publishers have become accustomed to technological threats over the past 20 years, most notably perhaps the loss of ad revenue to Facebook and Google. But the rise of generative AI may spell doom for the Fourth Estate: With AI, the tech industry can take away even publishers' audiences.
Some journalists will be able to survive a mass extinction of publishers. The so-called creator economy shows that it is possible to provide high-quality news and information via Substack, YouTube, and even TikTok. But not all reporters can simply move to these platforms. Investigative journalism, which uncovers corruption and fraud by powerful people and businesses, carries a serious risk of legal consequences and requires resources (such as time and money) that tend to be scarce for freelancers.
If news publishers start going out of business, won't AI companies suffer too? Their chatbots need access to journalism to answer questions about the world. Doesn't the tech industry have an interest in the survival of newspapers and magazines?
In fact, there are signs that AI companies believe publishers are no longer needed. In December, at The New York Times' DealBook Summit, OpenAI CEO Sam Altman was asked how writers should feel about their work being used for AI training. “I think we do need a new deal, standard, protocol, whatever you want to call it, for how creators are going to get rewarded,” he said. He described an “opt-in” regime in which authors could receive “micropayments” when their names, likenesses, and styles are used. But this could not be further from OpenAI's current practice, in which products are already being used to mimic the styles of artists and writers, without compensation or even an effective opt-out.
Google CEO Sundar Pichai was also asked about writer compensation at the DealBook Summit. He suggested that a market solution would emerge, possibly one that would not involve publishers in the long run. This is typical. As in other industries they have “disrupted,” Silicon Valley moguls seem to regard established institutions as middlemen to be removed for the sake of greater efficiency. Uber enticed drivers to work for it, crushed the traditional taxi industry, and now manages pay, benefits, and workloads algorithmically. This has meant greater convenience for consumers, as AI arguably does too, but it has proved catastrophic for many people who once managed to make a living from professional driving. Pichai seemed to imagine a future that could have a similar outcome for journalists. “I think there will be a market in the future. I think there will be creators who will create for AI,” he said. “People will figure it out.”