Tomorrow's Physics Test: Machine Learning


When Radha Mastandrea started the physics program at MIT in 2015, she knew she would need new computing skills.

As she progressed through her courses, she took classes in the programming language Python and taught herself to code. But when she landed an internship at MIT's Lincoln Laboratory, her bosses asked her to take on a job she wasn't expecting. They asked her to train a neural network, a type of machine learning that teaches computers to process data the way the brain processes information, to identify types of stars.

Mastandrea had to learn on the fly. She spent hours searching for tutorials online and researching problems in her code. The work was frustrating and difficult.

At the end of the project, she felt relieved to go back to learning about space from textbooks and equations. She thought she would never want to use machine learning again.

“That was very wrong,” Mastandrea says. “Now we use machine learning every day.”

As a doctoral student at the University of California, Berkeley, Mastandrea is currently creating algorithms that can search for signatures of unknown new physics in data from the Large Hadron Collider. Without machine learning, she says, this type of search, called anomaly detection, would be nearly impossible.

As Mastandrea moved from machine learning skeptic to everyday user, others in physics did the same. That meant a constant stream of interesting new challenges. “It’s very easy to get excited when everyone is working on it,” she says.

Kazuhiro Terao, a staff scientist at the U.S. Department of Energy's SLAC National Accelerator Laboratory, said this change has occurred over the past five to 10 years. The increasing use of machine learning, a subset of artificial intelligence, is changing what today's students need to study, what opportunities they will have upon graduation, and what must be done to educate future physicists.

“The maturity level is still in its early stages,” Terao said. “We need to raise the bar for machine learning and statistics much higher than they are today.”

The need for data science

Physics students have always been required to be data scientists to some extent. Unlike scientists in many other fields, particle physicists and astrophysicists often see their research only as data points on a chart or computer screen. As the tasks involved in analyzing data become more complex, so do the tools that scientists need to use. One of those tools is machine learning.

Unlike traditional algorithms, machine learning algorithms devise their own rules based on the data they are given, so they can improve over time. Terao said the improvements could speed up the analysis process, which currently takes several years, and expand what physics students can accomplish in their limited time at school.

For example, when the neutrino experiment MicroBooNE began operations in 2015, scientists initially expected to get the first results quickly, Terao said. However, although the experiment succeeded in collecting the data scientists needed, the collaboration did not publish its results until 2021 because the data were so difficult to analyze.

For Terao, who was already an established physicist, the delay meant more time wasted developing analytical techniques. But for graduate students, whose years of research coincided with that downtime, the delay meant missing out on opportunities to contribute to new discoveries.

These long schedules “really limit the way we think” about the structure of physics education, Terao said. Faster analysis powered by machine learning could help. “I want to accelerate what we can do to help students experience more things.”

From physics to industry

Even when progress is slow, gaining experience with machine learning can open new opportunities for students outside science.

Kylie Ying majored in computer science and physics as an undergraduate. After graduating, she took a job in web development and was able to focus her time on her passion: figure skating. But when the pandemic hit, she decided to complete a one-year master's program and return to physics.

She joined a lab that collides heavy ions to recreate the incredibly hot and dense environment that existed just after the Big Bang, known as the quark-gluon plasma (QGP). To learn more about the QGP's properties, scientists track particle jets as they move through it. Even though Ying had little machine learning experience, she worked on a project to develop an algorithm to decipher the type of particle that started the jet.

Although she loved science, she was frustrated that her work felt far removed from meaningful results. “Physics is really cool and really inspiring,” she says. “But for me, the impact of my hard work felt very small.”

She still found a way to use that experience. She took the computing skills she had learned to Wing, an autonomous drone delivery company owned by Google's parent company Alphabet, where she currently researches computer vision for drones.

Physicist Lucas Borgna also found a way to apply his research experience outside the lab.

He majored in engineering physics as an undergraduate and spent a year working on the ATLAS experiment at the LHC through a work experience program. Later, as a graduate student at University College London, he contributed to the development of a neural network that could distinguish jets originating from bosons from those originating from quarks in ATLAS data.

After moving to the CMS experiment as a postdoc, Borgna did a six-week internship at a startup called Faculty, which matches science and engineering students with companies that need expertise in artificial intelligence. Borgna was matched with a London housing company.

Borgna helped the company use natural language processing to analyze data about the use of its buildings in order to improve the standard of living for its tenants. This experience opened his eyes to the possibilities of machine learning beyond physics.

“I love physics, I found it fun, and I think everything that's going to happen will be really interesting,” Borgna says. “But at the same time, other industries are using similar techniques and tools, and they are challenging and yielding many interesting results.”

Rethinking physics training

Like Ying and Borgna, Andrew Hurd first encountered machine learning at work. As a graduate student, Hurd worked with a team that created a tool to search for signatures of the Higgs boson in ATLAS data by identifying the particle's specific decay into two photons.

Starting in 2011, the team developed neural networks for photon identification and event classification. Suddenly, Hurd's analysis became much more sensitive to Higgs decays. “It was just an incredibly powerful algorithm,” Hurd says. “That gave me the idea that all this talk about neural networks meant something.”

Like most physics students at the time, Hurd had no formal training in machine learning or coding and taught himself the programming languages Python and C++. “It was exciting, but it was really tough,” Hurd says. “That was probably the biggest gap in my physics education.”

As part of his PhD, Borgna ended up taking courses in both machine learning and data science, which he found beneficial. Having formal course requirements is “very helpful given the prevalence and frequency of use of these algorithms,” Borgna says.

To ensure students receive the training they need, SLAC's Terao advocates for all physics programs to take a more structured approach to machine learning education.

Students can learn only so much through experimentation and practice. To truly take advantage of machine learning, Terao says, they also need a deep understanding of the statistics and mathematics behind the algorithms. “We as a field need to offer these courses,” he says. “They are essentially becoming a requirement for doctoral research.”

In the summer of 2023, Terao co-hosted the annual SLAC Summer Institute, whose theme that year was “Artificial Intelligence in Fundamental Physics.” Unlike in previous years, when the institute was organized around other themes, many of the speakers were in the early stages of their careers, Terao said. The main reason is that the latest developments in machine learning are new even to senior scientists.

Terao says the workshop was a good start to providing students with a structured opportunity to learn about physics and machine learning. But more needs to be done in this area.

“The important thing for us is to give them more opportunities to challenge themselves,” Terao says. “We need to provide access to many research questions, share many datasets, and set the stage for new people.”

Other major physics organizations have begun offering training as well. In 2022, the American Physical Society's Data Science Group began hiring physics PhD students to write introductory machine learning tutorials based on Google Colab, a shareable, cloud-hosted notebook platform. Mastandrea said scientists are creating similar resources at CERN.

“People really want undergraduates to actually use these tools,” Mastandrea said. “The sooner you start using it, the more fluent you'll become.”

Various opportunities

As a PhD student working on ATLAS, Sean Gasiorowski used machine learning to better characterize the background noise in a search for pairs of Higgs bosons and to prepare for the signal analysis.

He didn't intend to focus on machine learning. However, the more he used it, the more he was attracted to it. “Once the pipeline is integrated and we get the results we want, it's very satisfying,” he says.

Machine learning skills can help students land positions in physics labs, especially now that funding agencies are offering grants to physicists to encourage research in machine learning. But physics labs aren't the only ones hiring.

Gasiorowski is currently a postdoctoral fellow in SLAC's Machine Learning Group, working on projects that span the multi-program national laboratory. Part of his research involves collaborating with Terao on analytical tools for deep underground neutrino experiments, but he also supports researchers in areas such as materials science.

Whether or not they stay in academia, students gain unique expertise by developing machine learning skills in the context of particle physics and astrophysics.

Part of that is the sheer amount of data they have to work with. In some research fields, data scientists may have only hundreds or thousands of data points to train their models. With telescopes that take hundreds of high-resolution images of the sky each night and accelerator experiments that observe hundreds of millions of collisions every second, physicists have access to far more information. “How good a model is is usually fairly directly related to the amount of data required for training,” Gasiorowski says.

Many of the skills physicists need to learn can easily be transferred to other industries, Ying says. For example, physicists must learn how to identify high-quality information. Researchers must learn how to formulate research questions accurately and ensure that the available data answer those questions. And physicists need to learn to understand the mathematics behind the analysis, which helps them understand the principles behind neural networks, Hurd says.

Mastandrea is eyeing a new kind of career path. She hopes to become a joint professor of physics and computing, an unusual kind of position that universities have begun advertising in recent years.

At the end of 2023, Mastandrea attended a relatively new workshop called “Hammer and Nails.” This workshop explores how current problems in physics (the “nail”) can be addressed with machine learning architectures (the “hammer”). The program was founded in 2017 and has only been held four times so far.

“It's a really exciting time to be on the scene,” Mastandrea said. “We have more tools than we know what to do with.”


