Scientists may be familiar with the feeling of opening their email inbox to find messages from journals they have never heard of, offering to publish their research.
In today's "publish or perish" culture, such an offer can sound appealing – until none of the services an author would expect (peer review or editing) materialize, and a considerable (often hidden) fee lands instead.
Journals with these low editorial standards, known as "predatory" journals, are an increasing problem for scientists. Even when their actions are not strictly predatory in the financial sense, their unethical practice of publishing large quantities of unreviewed research can make it difficult for scientists to find reliable information.
Establishing whether a journal is a reputable publication can be a challenging task. The website of a journal with questionable editorial practices may look completely professional on the surface, but dig deeper and you may uncover a web of self-citation and other poor practices.
Now, computer scientists have developed a new AI-powered tool that can automatically flag suspicious journals.
In a paper published in Science Advances, the team used AI to sift through a list of nearly 15,200 open-access journals available online. From these, the tool flagged approximately 1,400 journals as potentially problematic. Subsequent audits by human experts determined that, while the AI made mistakes, more than 1,000 of the flagged journals did indeed show some form of questionable practice.
To learn more about this tool and the dangers such journals pose, Technology Networks spoke with the study's lead author, Dr. Daniel Acuna, associate professor of computer science at the University of Colorado Boulder.
Alexander Beadle (AB): How do you define a "questionable" journal?
Daniel Acuna (DA): This can be a rather delicate subject, so I really wanted to be careful about how I presented this study. There has been a lot of controversy around calling journals "predatory," as that is clearly a very loaded term.
We wanted to stick to a definition that relies on a highly respected institution, the Directory of Open Access Journals (DOAJ). They don't call them questionable journals, but essentially, if DOAJ removes a journal from its list and the reason given is a certain suspicious practice – such as not providing information about the editorial board – we label that journal "questionable."
More broadly, a questionable journal is one that does not serve the intended purpose of a journal: to be a repository of human knowledge that has gone through some filtering and evaluation by peers, all transparently and free from conflicts of interest.
AB: What risks do questionable journals pose?
DA: I'm a scientist, so I'm always looking things up. Indexing services such as Google Scholar do not really filter where they pull information from, so these journals risk effectively contaminating other people's research.
As the saying goes, science is built "on the shoulders of giants." But in reality, it is built on more than giants. It is built on the shoulders of many people all over the world, on what they discover, and on the trust that the research you are reading has passed through some kind of filter.
Of course, that's not always the case. Sometimes we choose to rely on a preprint, so we have to be a little more careful. But you at least assume that there is good intent behind what you are reading, and that people are not hiding their methods.
That is not always true of publications in questionable journals. It doesn't mean that every paper in them is problematic, but they did not go through what the scientific community considers an appropriate process for publishing that knowledge.
AB: The paper presents an AI-driven tool that can help flag questionable journals. How does this tool evaluate whether a journal follows best practices?
DA: There have been previous attempts to evaluate journals with AI. The problem we saw with those earlier approaches is that they act as "black boxes" – it is not clear how the AI is making its decisions.
I wanted to connect the AI to real criteria, such as the best practices outlined by DOAJ. When our AI says that a journal has a 99% chance of being questionable, we can understand why it thinks that.
DOAJ, for example, says that a journal must have an advisory body or editorial board, that it must list the members of that board, and that those members must have some connection to the relevant field of science. So our tool visits the journal's editorial board page and scrapes it. Next, it checks whether those board members actually exist in databases of publications, authors and scientists.
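The check described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the page markup, the `KNOWN_RESEARCHERS` set and the scoring rule are all hypothetical stand-ins (the real tool scrapes live journal sites and queries large author databases).

```python
from html.parser import HTMLParser

# Hypothetical stand-in for a real author/publication database.
KNOWN_RESEARCHERS = {"Jane Doe", "Wei Zhang", "Amara Okafor"}

class BoardMemberParser(HTMLParser):
    """Collect the text inside <li> elements of an editorial-board list."""
    def __init__(self):
        super().__init__()
        self.in_item = False
        self.members = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.in_item = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_item = False

    def handle_data(self, data):
        if self.in_item and data.strip():
            self.members.append(data.strip())

def verify_board(page_html: str) -> float:
    """Return the fraction of listed board members found in the database."""
    parser = BoardMemberParser()
    parser.feed(page_html)
    if not parser.members:
        return 0.0  # no editorial board listed at all: a DOAJ red flag
    hits = sum(name in KNOWN_RESEARCHERS for name in parser.members)
    return hits / len(parser.members)

page = "<ul><li>Jane Doe</li><li>John Fakename</li></ul>"
print(verify_board(page))  # 0.5: only half the board is verifiable
```

A low verification fraction would then become one input feature among many, rather than a verdict on its own.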
The tool uses three main feature sets. The first is the website itself: whether the appropriate content is there (e.g., a page listing the editorial board) and how it looks aesthetically (i.e., whether it looks professional). The second is the code behind the website – how it was built. There is research suggesting that some of these journals are churned out by journal-generating software, and that when operators get caught, they start something new using the same underlying code.
The final set of features concerns what are called "bibliometric patterns": frequent self-citation, who the authors are, whether many citations come from the same institution or journal, and how often the journal is cited.
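To make the three feature groups concrete, here is a toy sketch of how they might be flattened into one vector for a classifier. Every field name and value below is hypothetical; the paper's actual feature engineering is far richer.

```python
# Illustrative only: combine the three feature groups described above
# (website content/appearance, website code, bibliometric patterns)
# into one flat vector. All field names are made-up examples.
def journal_features(site: dict, code: dict, biblio: dict) -> list:
    return [
        # 1. Website content and appearance
        float(site["has_editorial_board_page"]),
        site["aesthetics_score"],            # e.g., 0-1 professionalism rating
        # 2. Website source code
        float(code["matches_known_template"]),
        # 3. Bibliometric patterns
        biblio["self_citation_rate"],
        biblio["same_institution_citation_rate"],
        biblio["citations_per_article"],
    ]

x = journal_features(
    {"has_editorial_board_page": False, "aesthetics_score": 0.2},
    {"matches_known_template": True},
    {"self_citation_rate": 0.6,
     "same_institution_citation_rate": 0.5,
     "citations_per_article": 0.1},
)
print(x)  # [0.0, 0.2, 1.0, 0.6, 0.5, 0.1]
```

A classifier trained on vectors like this can report which features drove a prediction, which is what keeps the approach out of "black box" territory.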
AB: How accurately does the tool perform?
DA: Surprisingly well – I had assumed we would need all of the feature sets to get good results. But in practice, you never get all of them: website scraping is not always possible because of CAPTCHAs, blocked content and so on. Focusing on the bibliometrics and dropping the hard-to-capture features did not perform quite as well as having everything together, but it allowed us to take the model and apply it to a much larger dataset that had not been analyzed before.
In the grand scheme of things, there is a false positive rate of around 24%. So, out of the roughly 15,000 journals we looked at, when the AI predicted that around 1,400 were questionable, it turned out that over 300 of those were not actually a problem. Of course, that means some wasted time for human auditors, but we also caught around 1,000 genuinely questionable journals.
The false positive rate is something you can tweak, in the sense that the model tells you how likely it is that something is wrong, and users choose how strict they want the bar to be. Say you want to be like the Transportation Security Administration (TSA) at the airport – you want to catch every bad actor, so you are very strict. Like an airport with an incredibly low threshold, you will accidentally flag many ordinary people, but the advantage is that you catch all the bad actors.
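The threshold trade-off described above can be shown with a toy example. The scores and ground-truth labels here are invented for illustration; the point is only how moving the cutoff trades false positives against missed catches.

```python
# Toy illustration of the tunable threshold: the model outputs a
# "probability of being questionable" per journal, and the user
# decides how strict to be. All numbers below are made up.
scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]   # model outputs
truth  = [1,    1,    0,    1,    0,    0]       # 1 = truly questionable

def flag_counts(threshold):
    """Return (questionable journals caught, legitimate journals flagged)."""
    flagged = [t for s, t in zip(scores, truth) if s >= threshold]
    caught = sum(flagged)
    false_positives = len(flagged) - caught
    return caught, false_positives

print(flag_counts(0.9))   # (1, 0): strict bar, no false alarms, 2 missed
print(flag_counts(0.25))  # (3, 2): "TSA mode" catches all 3, flags 2 extra
```

Lowering the threshold from 0.9 to 0.25 catches every questionable journal at the cost of two false alarms – exactly the airport-security trade-off in the quote.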
AB: Why is it useful to have this kind of tool, rather than relying solely on human auditors?
DA: It's certainly a volume issue, but the other problem is that these journals constantly adapt and change their strategies to game the system. It helps to have an AI that can adapt just as quickly and catch journals that mimic legitimate websites. Naturally, humans make errors from time to time, and some things are simply very hard to judge: a website may not look great, yet be a perfectly legitimate journal on a very obscure topic.
Another important point is that DOAJ is a community-driven organization, which means it can only systematically analyze a fairly small number of journals each year. It is a painstaking task.
AB: Are there any plans to make the tool more widely available one day? How might this technology be applied in the long term?
DA: Since starting this research many years ago, I have founded a company that helps publishers with research integrity issues – ReviewerZero AI – and all of this is disclosed in the paper. But yes, [widening availability] is one of the things we want to do. At the moment, the tool is available through ReviewerZero AI's services to specialists in the field, such as research integrity officers.
We have been updating the tool with additional signals, and it should get better over time. Once it reaches a state where it can be shared with authors, an interesting application would be helping them rule out journals that are not suitable for their work. Many of the authors who appear in these questionable journals are from the Global South, so the tool could be shared with universities there to help scientists vet the journals they submit their work to.
About the interviewee:
Dr. Daniel Acuna is an associate professor of computer science at the University of Colorado Boulder, currently on entrepreneurial leave. He is the founder of ReviewerZero AI, an AI-driven platform that identifies issues such as image tampering, citation manipulation and statistical anomalies to protect the integrity of research.
In academia, Acuna has conducted extensive research on research integrity and published over 70 peer-reviewed articles. His work has been supported by the U.S. Office of Research Integrity, the National Science Foundation, the Sloan Foundation and DARPA. In 2021, he organized the Computational Research Integrity Conference (cri-conf.org), which brought together researchers, ethicists and technical experts to tackle real-world problems in research integrity.
Reference: Zhuang H, Liang L, Acuna DE. Estimating the predictability of questionable open-access journals. Sci Adv. 2025;11(35):eadt2792. doi:10.1126/sciadv.adt2792
