AI could discover new physics faster, but there are surprising pitfalls

Artificial intelligence is already playing a major role in helping cosmologists study the universe. Now, new research suggests that a machine learning technique called transfer learning could make exploring new physics much faster and cheaper. But the study also revealed some surprising shortcomings. AI can become too reliant on what it has already learned and have a hard time recognizing anything truly new.

The research, published in the Journal of Cosmology and Astroparticle Physics (JCAP), examined how transfer learning can help researchers study theories beyond the standard cosmological model.

Exploring AI and new physics

The current standard cosmological model, known as ΛCDM, successfully explains many large-scale features of the universe, such as the expansion of the universe and the distribution of galaxies. But scientists believe this model is not the final answer.

Recent observations have raised questions that could point to new physics, including the effects of massive neutrinos, modified gravity, and evolving dark energy. Exploring these possibilities requires researchers to generate vast numbers of detailed computer simulations, each representing a virtual universe constructed using different physical assumptions.

These simulations are computationally expensive to produce.

Reducing simulation costs with transfer learning

Researchers investigated whether transfer learning could make this process more efficient.

Transfer learning allows AI systems to apply knowledge gained from one task to another, related task. Instead of training the entire neural network on the most complex and computationally expensive simulations, the team first trained it on simpler simulations based on ΛCDM. This initial stage, known as pre-training, was followed by additional training on more sophisticated models that include potentially new physics.

“It’s basically a shortcut,” explains Adrian Bayer, a cosmologist at the Flatiron Institute and Princeton University and co-author of the study. “Typically, people train AI directly on the most computationally expensive simulations. What we do instead is first use simple, low-cost ΛCDM simulations to make the AI aware of what’s going on, and then move on to more complex models.”

Bayer compares the approach to learning from textbooks.

“We start by reading basic books to understand the knowledge, and then we move on to very complex books,” Bayer says.

This strategy eliminates the need for the AI to “digest everything at once,” said lead author Veena Krishnaraj, an undergraduate student at Princeton University.

The results were striking. In some cases, transfer learning reduced the number of expensive simulations required by more than a factor of 10.
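The two-stage recipe described above can be sketched in a few lines. This is only a toy illustration under stated assumptions, not the method used in the paper: a linear model stands in for the neural network, and the names `w_lcdm`, `w_new`, `simulate`, and `train` are hypothetical. The "cheap" task supplies many samples, the "costly" task only a handful, and the costly task is assumed to differ only slightly from the cheap one.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 20  # number of input features in this toy problem (assumption)


def simulate(weights, n_samples):
    """Stand-in for a simulation suite: random inputs and noiseless targets."""
    x = rng.normal(size=(n_samples, DIM))
    return x, x @ weights


def train(weights, x, y, steps=500, lr=0.05):
    """Plain gradient descent on mean squared error, starting from `weights`."""
    w = weights.copy()
    for _ in range(steps):
        grad = 2.0 / len(x) * x.T @ (x @ w - y)
        w -= lr * grad
    return w


# Hypothetical "true" relations: the new-physics task is a small shift
# away from the standard-model task.
w_lcdm = rng.normal(size=DIM)
w_new = w_lcdm + 0.1 * rng.normal(size=DIM)

x_cheap, y_cheap = simulate(w_lcdm, 500)   # many low-cost simulations
x_costly, y_costly = simulate(w_new, 10)   # few expensive simulations

# Stage 1: pre-train on the cheap task. Stage 2: fine-tune on the costly one.
pretrained = train(np.zeros(DIM), x_cheap, y_cheap)
fine_tuned = train(pretrained, x_costly, y_costly)

# Baseline: train on the few costly simulations only.
from_scratch = train(np.zeros(DIM), x_costly, y_costly)

err_fine_tuned = np.linalg.norm(fine_tuned - w_new)
err_scratch = np.linalg.norm(from_scratch - w_new)
print(f"error after fine-tuning:   {err_fine_tuned:.3f}")
print(f"error training from scratch: {err_scratch:.3f}")
```

In this toy setup the fine-tuned model ends up much closer to the target than the model trained from scratch on the same ten samples, because pre-training has already placed it near the right answer. How large the saving is in a real application depends on how closely the two tasks are related.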

When prior knowledge matters

The study also uncovered a less obvious challenge known as negative transfer.

To extend Bayer’s textbook comparison, imagine learning medicine from an introductory text and then encountering a rare disease whose symptoms closely resemble those of a common one. Existing knowledge is usually helpful, but in some cases it can lead to the wrong conclusion.

The same problem can occur with AI systems.

In some cases, signs of new physics may resemble patterns that the AI already associates with standard cosmological models. The pre-trained network may then interpret unknown information through the lens of known information, making it difficult to recognize truly novel effects.

The researchers observed this effect while studying simulations involving massive neutrinos. Some of the observable signatures of neutrino mass closely resemble changes in an existing ΛCDM parameter called σ8, which measures how strongly matter clusters throughout the universe.

Because of this similarity, pre-trained neural networks initially had difficulty distinguishing between the two effects.

“Negative transfer is not random; it is caused by the underlying physical degeneracy of the model,” says Krishnaraj.

In other words, different physical processes can produce very similar observable signatures, making it difficult for AI to pinpoint which parameter is responsible.

“So this is something that we need to be aware of and try to mitigate,” she concluded.
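The degeneracy described above can be illustrated with a toy calculation. This is a sketch under loud assumptions, not the analysis from the paper: the two response curves below are invented numbers chosen only so that the curves have nearly the same shape, and the names `response_sigma8` and `response_mnu` are hypothetical. The point is that when two parameters change the observable in almost the same way, any fit to that observable leaves them strongly correlated.

```python
import numpy as np

# Toy range of scales at which the observable is measured (arbitrary units).
k = np.linspace(0.1, 1.0, 50)

# Invented responses of the observable to each parameter (assumptions):
# raising sigma8 lifts the signal uniformly, while raising the neutrino mass
# lowers it by an amount that varies only mildly with scale.
response_sigma8 = 2.0 * np.ones_like(k)
response_mnu = -(0.5 + 0.1 * k)

# Cosine similarity between the two responses: close to -1 means one effect
# is almost exactly a rescaled, sign-flipped copy of the other.
cosine = response_sigma8 @ response_mnu / (
    np.linalg.norm(response_sigma8) * np.linalg.norm(response_mnu)
)

# Fisher matrix for a linear model with unit noise, and the parameter
# covariance it implies.
jacobian = np.column_stack([response_sigma8, response_mnu])
fisher = jacobian.T @ jacobian
covariance = np.linalg.inv(fisher)
correlation = covariance[0, 1] / np.sqrt(covariance[0, 0] * covariance[1, 1])

print(f"cosine similarity of responses: {cosine:.4f}")
print(f"correlation between parameters: {correlation:.4f}")
```

A correlation near 1 here means that an increase in one parameter can be almost perfectly offset by an increase in the other, so the data alone barely distinguish the two. A network pre-trained to explain such signals with σ8 has little in the data pushing it to consider neutrino mass instead.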

Expectations and risks for future cosmology

The findings highlight both the potential benefits and the limitations of applying foundation model concepts to physics. These approaches are broadly similar in spirit to the techniques behind modern generative AI systems and large language models.

As the researchers point out in their paper, pre-training can speed up inference, “but it can also impede learning new physics.”

So far, this approach has been tested only on simulations. The next step is to apply it to actual astronomical observations.

The research team believes that transfer learning has the potential to become an important tool for future cosmological investigations, particularly as an unprecedented amount of highly accurate data about the universe is expected to be collected within the next few years.

The paper “Transfer Learning Beyond the Standard Model” by Veena Krishnaraj, Adrian E. Bayer, Christian Kragh Jespersen, and Peter Melchior is published in JCAP.
