Mathematicians Declare an AI Turning Point in Mathematics

This article surveys recent observations from leading mathematicians, who report that AI advances, from solving five of the six IMO problems in 2025 to accelerating research with systems such as AlphaEvolve, mark a turning point in how mathematics is explored, proved, and taught.

Data Party THU

AI performance on competition problems

In July 2025 multiple AI models solved five of the six problems from the International Mathematical Olympiad (IMO). The problems had known solutions, but the speed and accuracy of the models surprised the community.

Shift in research practice

Mathematicians who had previously doubted AI began using it as a research tool. Terence Tao described 2025 as the year AI began to contribute to many different kinds of mathematical tasks, noting that work that once took weeks or months could now be completed in a single day.

First Proof challenge

In February 2026 the “First Proof” competition presented ten research‑level problems deliberately chosen to be unlikely to appear in training data. AI models solved more than half of the problems autonomously, suggesting a shift from Olympiad‑level toward research‑level capability.

AlphaEvolve system

In January 2025 Tao and Javier Gómez‑Serrano partnered with DeepMind researchers Adam Wagner and Bogdan Georgiev on AlphaEvolve. The workflow uses Google Gemini to generate Python programs that can be hundreds of lines long; a genetic‑algorithm loop then mutates the programs and selects the ones that perform best on a given mathematical problem. Between January and late May the team evaluated AlphaEvolve on 67 problems across several domains. The system improved on the best known result for 23 problems, matched existing results on 36, and fell short on the remaining 8.
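
The mutate-and-select loop described above can be illustrated with a minimal sketch. This is not AlphaEvolve's code: the candidates here are lists of numbers rather than Python programs, the `mutate` function is a random nudge standing in for the Gemini proposal step, and the toy objective (spread 5 points in [0, 1] so the smallest gap is as large as possible) and all names are assumptions chosen for illustration.

```python
import random

def score(points):
    """Toy objective: the smallest gap between any two points in [0, 1]."""
    pts = sorted(points)
    return min(b - a for a, b in zip(pts, pts[1:]))

def mutate(points, rng, step=0.05):
    """Stand-in for the LLM proposal step: nudge one coordinate at random."""
    child = list(points)
    i = rng.randrange(len(child))
    child[i] = min(1.0, max(0.0, child[i] + rng.uniform(-step, step)))
    return child

def evolve(n_points=5, population_size=20, generations=200, seed=0):
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_points)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Selection: keep the better-scoring half of the population.
        population.sort(key=score, reverse=True)
        survivors = population[: population_size // 2]
        # Variation: refill the population with mutated copies of survivors.
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
    return max(population, key=score)

best = evolve()
print(round(score(best), 3))
```

Because the survivors are carried over unchanged, the best score never decreases from one generation to the next. For this toy problem the score can never exceed 0.25, the value reached by evenly spaced points.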

Optimization case study

Ernest Ryu (UCLA) applied ChatGPT to a long‑standing optimization problem originally posed by Yurii Nesterov in 1983. Initial model outputs contained many incorrect proof steps, but also useful fragments. Ryu iteratively extracted the correct parts, refined prompts, and within roughly 12 hours produced a simplified proof. After a few more days he completed a full convergence proof for Nesterov’s method, a result he judged publishable in a top optimization journal.
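
For readers unfamiliar with Nesterov's method, the following is a minimal sketch of the textbook accelerated gradient scheme for a smooth convex function. It is not the specific problem or proof Ryu worked on, which the article does not spell out; the function name `nesterov`, the momentum weight (k - 1)/(k + 2), and the quadratic test function are assumptions taken from the standard presentation.

```python
def nesterov(grad, x0, lipschitz, steps):
    """Textbook accelerated gradient descent for an L-smooth convex function."""
    x_prev = list(x0)
    y = list(x0)
    for k in range(1, steps + 1):
        g = grad(y)
        # Gradient step from the extrapolated point y with step size 1/L.
        x = [yi - gi / lipschitz for yi, gi in zip(y, g)]
        # Momentum: extrapolate along the most recent displacement.
        beta = (k - 1) / (k + 2)
        y = [xi + beta * (xi - pi) for xi, pi in zip(x, x_prev)]
        x_prev = x
    return x_prev

# Example: f(x) = 0.5 * (x1^2 + 10 * x2^2), whose gradient is 10-Lipschitz.
def f(x):
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)

def grad_f(x):
    return [x[0], 10.0 * x[1]]

x_final = nesterov(grad_f, [1.0, 1.0], lipschitz=10.0, steps=100)
print(f(x_final))
```

The classical guarantee for this scheme is that the objective gap after k steps is at most 2L‖x0 − x*‖²/(k + 1)², compared with the 1/k rate of plain gradient descent.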

Discovery of hidden structure in permutation groups

Using AlphaEvolve, researchers examined Bruhat intervals in permutation groups. The AI generated about 50 lines of Python code; when the permutation size was a power of two, the code collapsed to five lines, revealing that the corresponding Bruhat intervals form hypercube structures. This pattern had gone unnoticed for fifty years.
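
To make "a Bruhat interval that looks like a hypercube" concrete, here is a brute-force sketch for small permutations. It is not the code AlphaEvolve produced and does not reproduce the power-of-two construction; it only enumerates the interval below a permutation using the standard tableau criterion and checks a necessary (not sufficient) condition for hypercube structure. All function names are illustrative.

```python
from itertools import permutations
from math import comb

def length(w):
    """Number of inversions of a permutation given in one-line notation."""
    n = len(w)
    return sum(1 for i in range(n) for j in range(i + 1, n) if w[i] > w[j])

def bruhat_leq(u, w):
    """Tableau criterion: compare sorted prefixes of u and w entrywise."""
    return all(
        a <= b
        for k in range(1, len(u))
        for a, b in zip(sorted(u[:k]), sorted(w[:k]))
    )

def lower_interval(w):
    """All permutations u with u <= w in Bruhat order."""
    return [u for u in permutations(range(1, len(w) + 1)) if bruhat_leq(u, w)]

def has_hypercube_rank_counts(w):
    """Necessary (not sufficient) test: rank sizes match binomial coefficients."""
    ell = length(w)
    counts = [0] * (ell + 1)
    for u in lower_interval(w):
        counts[length(u)] += 1
    return counts == [comb(ell, r) for r in range(ell + 1)]

print(has_hypercube_rank_counts((2, 3, 1)))
print(has_hypercube_rank_counts((3, 2, 1)))
```

A hypercube of dimension d has exactly C(d, r) elements at rank r, so an interval whose rank sizes differ from the binomial coefficients cannot be a hypercube. Confirming that an interval really is one would additionally require checking the covering relations.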

Application in algebraic geometry

Ravi Vakil (Stanford) collaborated with DeepMind’s Gemini modules, DeepThink and FullProof, to study embeddings of flag bundles. The AI‑assisted computation quickly confirmed the conjectured rapid convergence of the associated polynomial spaces.

Risks and challenges

Several mathematicians warned of “hallucinated nonsense” generated by LLMs, which can pollute public discourse. They also raised concerns about the impact of AI‑generated mathematics on formal proof verification and on the training of future mathematicians. Daniel Litt emphasized the need to monitor the quality of AI output, while Akshay Venkatesh cautioned that reliance on powerful AI tools might erode mathematicians’ direct experiential understanding of the subject.

Source: ScienceAI
This AI transformation may yet hold many more possibilities.
Written by

Data Party THU

Official platform of Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.
