More than half of researchers are now using AI for peer review, often against guidance.




Our findings suggest that reviewers are increasingly paying attention to AI. Credit: Panther Media Global/Alamy

More than 50% of researchers have used artificial intelligence during manuscript review, according to a survey of nearly 1,600 academics from 111 countries conducted by publisher Frontiers.

Almost a quarter of respondents said the use of AI for peer review has increased in the past year. The findings, posted by the Lausanne, Switzerland-based publisher on December 11, confirm what many researchers have long suspected, given the proliferation of tools powered by large language models, such as ChatGPT.

“It’s good to face the reality that people are using AI in their peer review work,” says Elena Vicario, Director of Research Integrity at Frontiers. However, she adds that the survey suggests that researchers are using AI for peer review “contrary to many external recommendations not to upload papers to third-party tools”.

AI assistance. Chart showing the aspects of peer review for which researchers used AI tools, based on a Frontiers survey.

Source: Unlocking AI's untapped potential, Frontiers

Some publishers, including Frontiers, allow limited use of AI in peer review but require reviewers to disclose that use. Like most other publishers, Frontiers prohibits reviewers from uploading unpublished manuscripts to chatbot websites because of concerns about confidentiality, sensitive data, and infringement of authors' intellectual property.

The report calls on publishers to respond to the increasing use of AI across scientific publishing and introduce policies that are better suited to the “new reality.” Frontiers itself has launched an internal AI platform for all journal reviewers. “AI should be used responsibly in peer review, with very clear guidance, human accountability, and proper training,” says Vicario.

“We agree that publishers can and should proactively and reliably communicate best practices, particularly disclosure requirements that enhance transparency to support the responsible use of AI,” a spokesperson for Hoboken, New Jersey-based publisher Wiley said in a statement. The spokesperson added that a similar survey Wiley published earlier this year found “relatively low interest and confidence in AI use cases for peer review” among researchers. “I don't see anything in our portfolio that contradicts this.”

Investigate, search, and summarize

The Frontiers survey found that among respondents who use AI for peer review, 59% use it to write peer review reports. Some 29% said they use it to summarize manuscripts, identify gaps, and check references. Additionally, 28% use AI to flag potential signs of fraud, such as plagiarism or image duplication (see 'AI assistance').

Mohammad Hosseini, who studies research ethics and integrity at Northwestern University Feinberg School of Medicine in Chicago, Illinois, said the survey was “a good attempt to gauge the acceptability of using AI in peer review and the prevalence of its use in different settings.”

Some researchers are running their own tests to determine how well AI models support peer review. Last month, Mim Rahimi, an engineering scientist at the University of Houston in Texas, designed an experiment to test whether the large language model (LLM) GPT-5 could review a Nature Communications paper¹ that he co-authored.

He used four different setups, ranging from a basic prompt asking the LLM to review the paper without additional context, to providing the LLM with a research paper from the literature to help it assess the novelty and rigor of the paper. Rahimi then compared the AI-generated output with the actual peer review reports he received from the journal and discussed his findings in a YouTube video.

His experiments showed that although GPT-5 can mimic the structure of peer review reports and use sophisticated language, it fails to generate constructive feedback and makes factual errors. Using advanced prompts did not improve the AI's performance. In fact, the most complex setups produced the weakest peer reviews. Another study found that AI-generated reviews of 20 manuscripts tended to match human reviews, but fell short of providing detailed critiques.


