GPTZero, a detector of AI output, has once again discovered that scientists are undermining their credibility by relying on unreliable AI assistance.
The New York-based business has identified 100 hallucinations in more than 51 papers accepted by the Conference on Neural Information Processing Systems (NeurIPS). This follows the company’s earlier discovery of 50 hallucinated citations in papers under review by the International Conference on Learning Representations (ICLR).
In a blog post, GPTZero’s senior machine learning engineer Nazar Shumatko, head of machine learning Alex Adam, and academic writing editor Paul Esau argue that the use of generative AI tools has fueled a “tsunami of AI slop.”
“Between 2020 and 2025, applications to NeurIPS increased by more than 220%, from 9,467 to 21,575,” they observe. “In response, organizers have had to recruit more reviewers than ever before, resulting in problems with oversight, coordination of expertise, negligence, and even fraud.”
These hallucinations primarily consist of authors and sources invented by generative AI models, as well as texts purportedly created by AI.
The legal community has grappled with similar issues. More than 800 incorrect legal citations attributed to AI models have been noted in various court filings, often impacting the lawyers, judges, or plaintiffs involved.
Although academics may not face the same misconduct sanctions as legal professionals, the careless application of AI can have consequences beyond squandered integrity.
The surge in AI paper submissions has coincided with an increase in substantive errors in academic papers, such as incorrect formulas, miscalculations, and wrong figures, as distinct from citations of non-existent source material.
A preprint paper published in December 2025 by researchers from Together AI, NEC Labs America, Rutgers University, and Stanford University examines papers from three major machine learning venues: ICLR (2018-2025), NeurIPS (2021-2025), and TMLR (Transactions on Machine Learning Research) (2022-2025).
The authors found that “published papers contain a non-negligible number of objective errors, and the average number of errors per paper increases over time: from 3.8 in NeurIPS 2021 to 5.9 in NeurIPS 2025 (a 55.3 percent increase), from 4.1 in ICLR 2018 to 5.2 in ICLR 2025,” with TMLR papers following the same upward trend from an average of 5.0 in 2022/23 through TMLR 2025.
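For readers who want to check the arithmetic, the growth figures can be reproduced in a few lines of Python. This is only an illustrative sketch: the helper name `percent_increase` is our own and does not come from the preprint, and the inputs are simply the per-paper averages quoted above.

```python
def percent_increase(old: float, new: float) -> float:
    """Relative change from old to new, expressed as a percentage."""
    return (new - old) / old * 100.0


# Average errors per paper as quoted in the article: (first year, last year).
# The dictionary keys are illustrative labels, not names used by the preprint.
averages = {
    "NeurIPS 2021 -> 2025": (3.8, 5.9),
    "ICLR 2018 -> 2025": (4.1, 5.2),
}

for venue, (first, last) in averages.items():
    print(f"{venue}: {percent_increase(first, last):.1f}% increase")
```

The NeurIPS result matches the 55.3 percent figure cited by the authors. The ICLR rise works out to roughly 26.8 percent, measured over a longer span of years.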
Correlation is not causation, but when the average number of errors per NeurIPS paper has risen 55.3 percent following the introduction of OpenAI’s ChatGPT, the rapid adoption of generative AI tools cannot be ignored. For scientists, the risk of unchecked AI use is not just reputational. It may invalidate their work.
A NeurIPS spokesperson did not immediately respond to a request for comment. We will update this story if we hear back after publication.
GPTZero argues that its Hallucination Check software should be part of a publisher’s AI detection toolkit. While that may help in determining whether a citation refers to actual research, there are also countermeasures that claim to make AI authorship harder to detect. For example, a Claude Code skill called Humanizer says it “removes signs of AI-generated writing from text, making it sound more natural and human.” There are many other anti-forensic options available.
A recent report from the International Association of Scientific, Technical and Medical Publishers (STM) seeks to address the integrity challenges facing the academic community. According to the report, the volume of scholarly communications rose from 3.9 million five years earlier to 5.7 million in 2024. It argues that publishing practices and policies need to adapt to the reality of AI-assisted and AI-fabricated research.
“Academic publishers are certainly aware of this issue and are taking steps to protect themselves,” Adam Marcus, co-founder of Retraction Watch, which has documented many AI-related retractions, and editor-in-chief of Gastroenterology & Endoscopy News, said in an email to The Register. “Whether they will be successful remains to be seen.
“We are in an AI arms race, and it is unclear whether the defenders will be able to withstand the siege. But it is also important to recognize that publishers are making themselves vulnerable to these attacks by adopting business models that prioritize quantity over quality. They are far from innocent victims.” ®
