- Since October, AI tools have helped move around 100 of Paul Erdős's mathematical problems into the “solved” category.
- Large language models serve as powerful research assistants that can find existing mathematical results and combine them in new ways.
- In the First Proof competition, 11 top mathematicians are challenging AI with unpublished proofs; the results are currently being evaluated.
100 problems resolved since October
The legendary mathematician Paul Erdős left behind 1,179 unsolved mathematical conjectures. According to a summary by mathematician Terence Tao, approximately 100 of these problems have been moved into the “solved” category since October of last year with the help of AI tools.
It started when Columbia University mathematician Mehtaab Sawhney entered one of Erdős's problems into ChatGPT. The model quickly found references to an existing solution. Sawhney and his colleague Mark Sellke then used ChatGPT to unearth forgotten solutions to nine other Erdős problems, as well as partial solutions to 11 more.
Most of the AI assistance amounted to a form of advanced literature search. However, language models often combine existing theorems to create new or improved solutions. In at least two cases, a language model constructed an entirely new, valid proof with minimal human input.
More than a search engine
Google’s Gemini discovered a statement buried deep in a 1981 paper that had unwittingly solved Erdős problem number 1089. However, the capabilities of language models are not limited to pure literature search.
Massachusetts Institute of Technology mathematician Andrew Sutherland says language models are useful research assistants. He believes that mathematicians who have only used older versions of the models have not yet realized how much more capable they have become. Sutherland himself has had interactions in which a model showed him results that allowed him to prove something he had been stuck on.
The First Proof competition
Eleven top mathematicians have launched First Proof, a new test of AI's mathematical ability. They selected individual pieces of proofs that had been completed but not yet published, and posed these as challenges to AI. The problems are wide-ranging and vary in complexity. According to University of Toronto mathematician Daniel Litt, a system that could solve all of them would be extremely useful to professional mathematicians.
The language models were given one week to produce proofs for 10 problems. That time limit was shorter than the time the mathematicians on the team had needed to solve each problem.
By Monday, the team’s inboxes and social media pages were flooded with claimed solutions. A Discord server hosting discussions about the challenge quickly attracted hundreds of members.
Verification is a challenge
A familiar problem quickly arose. First Proof aims to go beyond pure literature search, so the team tested the questions with a language model to ensure that no answers existed in the training data. But a solution to Fields Medal winner Martin Hairer’s problem still surfaced online, because he had overlooked a partial proof on his website that had been archived by the Wayback Machine.
Validating the submitted solutions is resource intensive. Although the models produce convincing-looking answers in about 90% of cases, Daniel Litt examined many of the proofs in circulation and found that most were wrong. Some, however, may be correct.
Mathematicians join technology companies
In January, Ravi Vakil, the current president of the American Mathematical Society, published a preprint together with two other mathematicians and two Google researchers. They documented how Google’s language model helped them arrive at their proof.
Several mathematicians predict that 2026 will be the first year in which results that credit AI as an explicit contributor pass peer review in major mathematics journals. Sawhney has taken academic leave from Columbia University to work at OpenAI. Carlo Pagano, who worked with Google’s DeepMind team on several Erdős problems, has accepted a position at Google DeepMind.
