AI will learn what you say on Reddit, Stack Overflow, and Facebook. Is that a good thing? – News-Herald

Matt O'Brien

CAMBRIDGE, Mass. — By posting a comment on Reddit, answering a coding question on Stack Overflow, editing a Wikipedia entry or even sharing a baby photo on your public Facebook or Instagram feed, you can also help train the next generation of artificial intelligence.

Not everyone is comfortable with that, especially as the same online forums where people have posted for years are increasingly flooded with AI-generated comments that mimic what real humans would say.

Some longtime users have tried to delete old posts or rewrite them into gibberish, but the protests have had little effect. Several governments have also tried to intervene, including Brazil's privacy regulator on Tuesday.

“A significant percentage of the population feels powerless,” said Sara Gilbert, a volunteer Reddit moderator who also studies online communities at Cornell University. “They have nowhere to go except to go completely offline or to not contribute in a way that brings value to themselves or others.”

Platforms are responding, but with mixed results. Take Stack Overflow, a popular site for computer programming tips: The platform first banned replies written by ChatGPT because of frequent errors, but it is now partnering with AI chatbot developers and has punished some users who tried to delete old posts in protest.

It's one of many social media platforms grappling with user wariness — and occasional rebellion — as they try to adapt to changes brought about by generative AI.

Andy Loeterling, a software developer from Bloomington, Minnesota, said he has used Stack Overflow daily for 15 years and is concerned the company may be “inadvertently hurting one of its greatest resources: the community of contributors who donate their time to help other programmers.”

“It's paramount that we continue to incentivize contributors to provide comments,” he said.

Stack Overflow CEO Prashanth Chandrasekar said the company is trying to balance the growing demand for instant coding assistance from chatbots with the desire for a community “knowledge base” where people want to be “recognized” for the contributions they make.

“Five years from now, there's going to be all kinds of machine-generated content on the web,” he said in an interview. “There's going to be very few places where you can find truly authentic, original human thought, and we're one of those places.”

Chandrasekar described Stack Overflow's challenge as similar to one of the “case studies” he learned about at Harvard Business School, on how a company survives, or fails to survive, a disruptive technological change.

For over a decade, users would typically type their coding question into Google, then visit Stack Overflow to find the answer they wanted to copy and paste. The answers they saw were likely from volunteers who had accumulated credibility points that could sometimes help them land a job.

Now, programmers can simply put their questions to AI chatbots, some of which have already been trained on everything posted on Stack Overflow and can instantly return answers.

ChatGPT's debut in late 2022 threatened to put Stack Overflow out of business, so Chandrasekar assembled a 40-person task force at the company to fast-track the launch of its own AI chatbot, called Overflow AI. The company then struck deals with Google and ChatGPT developer OpenAI that allow the AI developers to tap into Stack Overflow's archive of questions and answers to further improve their large language models.

Maria Roche, an assistant professor at Harvard Business School, said the strategy makes sense but may be too late. “I'm surprised Stack Overflow didn't do this sooner,” she said.

After the partnership with OpenAI was announced, some Stack Overflow users tried to delete their old comments, and the company responded by terminating their accounts, citing terms under which users “permanently and irrevocably license” all posts to Stack Overflow.

“We responded immediately and said, 'This is unacceptable behavior,'” Chandrasekar said, adding that those protesting were a tiny minority of “a few hundred” of the platform's 100 million users.

Brazil's national data protection authority on Tuesday banned social media giant Meta Platforms from training AI models on Brazilians' Facebook and Instagram posts and set a fine of $8,820 per day for noncompliance.

In a statement, Meta called it a “setback to innovation” and said it was more transparent than industry peers who train similar AI on public content and that its practices comply with Brazilian law.

Meta has also faced resistance in Europe, where it recently put on hold plans to start training its AI systems on people's public posts. The training had been due to begin last week. In the United States, which has no national laws protecting online privacy, such training is likely already happening.

“The vast majority of people have no idea their data is being used,” Gilbert said.

Reddit has taken a different approach, partnering with AI developers like OpenAI and Google while making clear that commercial entities acting “without regard for users' rights or privacy” cannot scrape content in bulk without the platform's approval. The deals helped give Reddit the capital it needed to go public on Wall Street in March, with investors valuing the company at nearly $9 billion seconds after it began trading on the New York Stock Exchange.

Reddit hasn't tried to punish users for speaking out, and it wouldn't be easy to do so given that volunteer moderators have a big say over what happens in the specialized forums called subreddits. But Gilbert, who moderates the “AskHistorians” subreddit, worries about the rise of AI-generated comments that moderators must decide whether to allow or ban.

“People come to Reddit because they want to talk to humans, not to talk to bots,” Gilbert said. “If you want to talk to a bot, there are apps. But historically, Reddit has been a place to connect with humans.”

She said it was ironic that the AI-generated content threatening Reddit was based on comments from millions of human Reddit users, and that “there's a real risk that it's going to end up pushing people out.”

——

Associated Press writer Eleonore Hughes in Rio de Janeiro contributed to this report.

——

The Associated Press and OpenAI have a licensing and technology agreement that gives OpenAI access to portions of the AP's text archive.
