How DeepMind’s AlphaTensor AI Invents Faster Matrix Multiplication



After developing an artificial intelligence capable of superhuman play in games like chess and Go, along with another AI that could predict how proteins fold in three-dimensional space, DeepMind researchers have done it again: they trained an AI model to solve a fundamental math problem more efficiently and, in the process, break a 50-year-old record.

In a blog post earlier this month, the DeepMind team introduced AlphaTensor, an AI system designed to discover new, more efficient algorithms for important mathematical operations, in this case matrix multiplication.

Matrix multiplication underpins much of modern computing, including processing and compressing images and videos, recognizing voice commands, and running simulations for weather forecasting.

So it’s no wonder professionals and companies around the world are constantly looking for more efficient algorithms for the mathematical operations behind such tasks.

Matrix multiplication is one of the simplest mathematical operations in algebra, where individual numbers placed in a grid (or matrix) are multiplied and added in a specific way to produce a new matrix.
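To make that “specific way” concrete, here is a minimal sketch of the textbook method in Python (our own illustration, not code from the DeepMind post): every entry of the result is a row-times-column sum of products.

```python
def matmul(A, B):
    """Textbook matrix multiplication: C[i][j] = sum over k of A[i][k] * B[k][j]."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                C[i][j] += A[i][k] * B[k][j]
    return C

# Multiplying two 2x2 matrices this way uses 8 scalar multiplications.
print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```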

Such matrices are used to represent various kinds of data, such as sets of pixels in an image or internal features of artificial neural networks.

For centuries, mathematicians used what they believed to be the most efficient method, but in 1969 the German mathematician Volker Strassen rocked the math world by showing that a pair of 2×2 matrices can be multiplied using only 7 multiplications instead of the standard 8.
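To show the trick concretely, here is a minimal sketch of Strassen’s seven-product scheme for a single pair of 2×2 matrices (the recursive extension to larger matrices, which is where the asymptotic savings come from, is omitted):

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar multiplications (Strassen, 1969)."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # The four entries of the product are recombined from the seven products by additions only.
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4,           m1 - m2 + m3 + m6]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

Applied recursively to the blocks of larger matrices, that one saved multiplication per level is what makes Strassen’s method asymptotically faster than the schoolbook algorithm.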

Strassen’s record stood for over 50 years, but DeepMind’s AlphaTensor has now shown that it can find an even more efficient method.

In fact, the team approached the matrix multiplication problem like a game, and AlphaTensor built on lessons learned from its game-playing predecessor, AlphaZero.

Both models use a type of machine learning called reinforcement learning together with the Monte Carlo Tree Search (MCTS) technique, so the system receives feedback on its previous “moves” as it plays the “game” and gradually learns to improve, whether that game is chess or matrix multiplication.

For AlphaTensor, the team reformulated the problem of finding an efficient matrix multiplication algorithm as a single-player game in which the “board” is represented as a 3D array of numbers.

The model’s goal is to zero out all of the numbers in the fewest possible steps by choosing from a collection of allowed moves. Doing so ultimately yields what the team describes as “a provably correct matrix multiplication algorithm for any pair of matrices, whose efficiency is captured by the number of steps taken to zero out the entries.”
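One way to picture those moves: each move removes a simple rank-one “piece” from the 3D array, and the pieces collected by the time everything hits zero spell out a multiplication algorithm, with one scalar multiplication per move. The NumPy sketch below illustrates the idea on the small tensor that encodes 2×2 matrix multiplication; the framing follows the tensor-decomposition setup behind AlphaTensor, but the code and names are our own.

```python
import itertools
import numpy as np

n = 2  # 2x2 matrices for illustration

# The "board": a 3D array encoding n x n matrix multiplication.
# T[p, q, r] = 1 exactly when entry p of A times entry q of B contributes to entry r of C.
T = np.zeros((n * n, n * n, n * n), dtype=int)
for i, j, k in itertools.product(range(n), repeat=3):
    T[i * n + k, k * n + j, i * n + j] = 1

# A "move" picks three vectors u, v, w and subtracts the rank-one piece u x v x w.
# Playing the eight obvious moves of the schoolbook algorithm (one per scalar product
# a[i,k] * b[k,j] sent to c[i,j]) zeroes the board in eight steps.
residual = T.copy()
basis = np.eye(n * n, dtype=int)
moves = 0
for i, j, k in itertools.product(range(n), repeat=3):
    u, v, w = basis[i * n + k], basis[k * n + j], basis[i * n + j]
    residual -= np.einsum('p,q,r->pqr', u, v, w)
    moves += 1

print(moves, bool(residual.any()))  # 8 False: eight moves, board fully zeroed
# Strassen's choices of u, v, w zero the same board in only seven moves,
# which is exactly one scalar multiplication saved.
```

Finding a set of moves that zeroes the board in fewer steps than the schoolbook method is precisely what Strassen did by hand for 2×2 matrices, and what AlphaTensor searches for automatically at larger sizes.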

Each time the system succeeds, it updates its internal parameters, increasing its chances of succeeding again. At the same time, Monte Carlo tree search helps predict how promising different paths toward possible solutions are, so more favorable paths can be prioritized and game results can be fed back into the network to improve the system further.
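As a heavily simplified caricature of that loop (the game, names, and numbers below are our own illustration, not DeepMind’s code), the same three ingredients appear: a value estimate that improves with experience, a lookahead search that uses it to pick moves, and game outcomes fed back to sharpen the estimate.

```python
import random

# Toy "game": drive a number to zero in as few moves as possible.
# The "network" is a table of learned step counts, and a handful of random
# lookahead rollouts stands in for Monte Carlo tree search.
MOVES = [1, 2, 5]   # amounts a move may subtract
values = {}         # learned estimate of steps-to-go from each state

def legal_moves(state):
    return [m for m in MOVES if m <= state]

def rollout(state):
    """Finish the game with random moves and report how many steps it took."""
    steps = 0
    while state > 0:
        state -= random.choice(legal_moves(state))
        steps += 1
    return steps

def choose_move(state, n_rollouts=8):
    """Stand-in for the tree search: score each move by the estimated number
    of steps remaining afterwards, preferring the learned table when available."""
    def estimate(nxt):
        if nxt == 0:
            return 0
        if nxt in values:
            return values[nxt]
        return sum(rollout(nxt) for _ in range(n_rollouts)) / n_rollouts
    return min(legal_moves(state), key=lambda m: 1 + estimate(state - m))

for episode in range(200):
    state, visited = 23, []
    while state > 0:
        visited.append(state)
        state -= choose_move(state)
    # Feed the result back: keep the best step count ever achieved from each state.
    for i, s in enumerate(visited):
        values[s] = min(values.get(s, float("inf")), len(visited) - i)

print(values[23])   # settles at 6, the fewest moves from 23 with these amounts
```

The real system replaces the lookup table with a deep neural network and the random rollouts with a full Monte Carlo tree search, but the feedback cycle is the same in spirit.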

“We trained the AlphaTensor agent using reinforcement learning and played the game without any knowledge of existing matrix multiplication algorithms,” the team explained.

“Through learning, AlphaTensor gradually improved over time, rediscovering historical fast matrix multiplication algorithms such as Strassen’s, eventually surpassing the realm of human intuition and discovering algorithms faster than those previously known.”

The team emphasized the difficulty of the seemingly simple problem of multiplying two matrices.

“Compared to Go, which remained a challenge for AI for decades, the number of possible moves at each step of this game is 30 orders of magnitude greater (more than 10^33 for one of the settings considered). Essentially, to play this game well, you have to identify the tiniest needle in a gigantic haystack of possibilities.”

During experiments testing input matrices up to 5×5 in size, the team found that AlphaTensor not only “rediscovered” matrix multiplication shortcuts previously demonstrated by mathematicians, but also discovered new ways to perform these computations efficiently.

For example, AlphaTensor found an algorithm for multiplying a 4×5 matrix by a 5×5 matrix using only 76 multiplications, outperforming the previous best algorithm, which required 80.
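For context, the schoolbook method uses one scalar multiplication for every row-column-inner-index combination, so this product would ordinarily take 4 × 5 × 5 = 100 multiplications; the 80 and 76 figures count the multiplications in the cleverer algorithms. A quick sanity check of the naive count (the improved counts come from the announcement, not from this code):

```python
def schoolbook_multiplications(m, n, p):
    """Scalar multiplications the schoolbook method uses for an m x n times n x p product."""
    return m * n * p

print(schoolbook_multiplications(4, 5, 5))  # 100, versus 80 previously and 76 for AlphaTensor
```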

With a larger pair of 11×12 and 12×12 matrices, AlphaTensor reduced the number of required multiplications from 1,022 to 990. AlphaTensor can also optimize matrix multiplication for specific hardware: the team trained the system on two different processors so that the resulting algorithms were tuned to the performance of each.

Ultimately, the team believes the new research could have a major impact on fields ranging from mathematical research to computing.

“From a mathematical point of view, our results can guide further work in complexity theory, which aims to determine the fastest algorithms for solving computational problems. By exploring the space of possible algorithms, AlphaTensor helps develop a better understanding of the richness of matrix multiplication algorithms.

Understanding this space may lead to new results that help determine the asymptotic complexity of matrix multiplication, one of the most fundamental open problems in computer science. And because matrix multiplication is a core component of many computational tasks, spanning computer graphics, digital communications, neural network training, and scientific computing, the algorithms discovered by AlphaTensor could make computation in these fields significantly more efficient.”
