Backlash against the use of hidden prompts to trap AI peer reviews

The organizers of a prominent artificial intelligence (AI) conference are facing a backlash on social media after adding hidden prompts to papers to catch reviewers who use generative AI to write their reviews.

The 40th Annual Conference on Neural Information Processing Systems (NeurIPS), scheduled to be held in Sydney, Australia, in December 2026, has banned reviewers from uploading the papers they are reviewing to AI chatbots, on the grounds that doing so violates confidentiality. Peer reviewers can still use AI chatbots for background research, according to the policy outlined in the conference handbook.

To enforce the policy and catch illicit AI use in peer review, the conference organizers deliberately hid instructions aimed at large language models (LLMs) in papers submitted for peer review.

The instructions tell the LLM to use telltale phrases, such as “This study addresses the central question” and “Paper Claim,” in the peer review report. Some researchers have already been caught sneaking hidden messages into their papers to get AI tools to produce favorable reviewer reports. Many publishers prohibit the use of AI in peer review.

Multiple researchers who are peer reviewing NeurIPS papers took to social media to express concern about the indirect prompt injections inserted into the papers.

“Designing traps with malicious intentions in mind undermines the relationships on which the entire system depends,” Soren Auer, a computer scientist at Leibniz University Hannover, wrote on LinkedIn. “You cannot build a healthy peer review culture if you treat reviewers as suspects.”

However, some see merit in this approach. A similar prompt injection effort caught hundreds of reviewers misusing LLMs on submissions to the 43rd International Conference on Machine Learning (ICML 2026), which takes place next week in Seoul, South Korea, according to Nihal Shah, a computer scientist at Carnegie Mellon University and scientific integrity chair of that conference.

In a statement to The Transmitter, the NeurIPS organizing committee said it cannot discuss the injected prompts in detail “without compromising the effectiveness of this intervention.”

Auer told The Transmitter that he was assigned to review eight NeurIPS papers. He said he sometimes converts PDF files to Microsoft Word documents when conducting peer reviews, and the conversion exposed some of the hidden prompts.

Auer said he initially flagged the first paper he was reviewing for rejection because he believed the prompt had been inserted by the study’s author. But after discovering a hidden prompt in the second paper and seeing researchers discussing the issue in a Reddit thread, he removed the flag.

More papers may be rejected because reviewers don’t know the prompts were inserted by conference organizers, he says. “Personally, I don’t think it’s a good idea to ban the use of AI,” Auer added. “Of course, we need to discuss how to use it.”

The NeurIPS committee will respond directly to reviewers who notice the hidden prompts, informing them not to penalize individual papers, the statement said.

Like Auer, Sarah Atit, an AI researcher at the University of Surrey, told The Transmitter that she found the same prompts in all four papers she reviewed for NeurIPS. She says she also discovered one in a version of her own paper that NeurIPS organizers had prepared before sending it out for peer review.

Atit called the hidden prompt an “inadequate mechanism” and argued that while it may filter out some problematic submissions, it doesn’t solve the larger problem with peer review. “We blame reviewers too much because they are a visible point of failure,” she says.

But Shah says of the hidden-prompt approach: “It is doable and achievable.” Shah led a similar effort at ICML 2026, which included a hidden prompt in every submitted paper.

In doing so, Shah says, he and his team identified hundreds of reviewers who used AI when they shouldn’t have, resulting in their reviews being rejected. ICML 2026 desk-rejected fewer than 500 papers, which represents approximately 2 percent of the total number of submissions the conference received this year.

Researchers expressed “overwhelming support” for the strategy, Shah said, adding that he and his team shared their methodology with the NeurIPS team. “I’ve been working on conference peer reviews for several years now, and I’ve rarely seen such strong support,” he says. “People were really tired of reviewers copying and pasting AI-generated reviews without any effort.”


