Why Compression Is the Core of Mathematics and AI, According to Michael Freedman
In an interview, Fields Medalist Michael Freedman explains how his new paper shows that mathematical reasoning is fundamentally about compression, presents a statistical analysis of the Lean mathlib library revealing that compact token representations unfold into astronomically large statements, and argues that understanding this compression is key to future human‑AI collaboration in mathematics.
Fields Medalist Michael Freedman, known for solving the four‑dimensional Poincaré conjecture, has turned his attention to the intersection of mathematics and artificial intelligence. In his recent arXiv paper (https://arxiv.org/pdf/2603.20396), he declares that “compression is all you need,” positioning it as the essence of both human mathematical practice and AI reasoning.
Freedman and his team examined the Lean formal‑verification library mathlib (≈500 k lines of code) as a concrete model of “human mathematics.” By tracing how theorems invoke lemmas and definitions, they built a tree representation of each statement. The analysis showed that a proposition that can be written in about 600 tokens expands to a size on the order of 10^104 when fully unfolded, far larger than a googol (10^100). This dramatic expansion illustrates the power of compression: a compact high‑level concept encodes an enormous amount of underlying detail.
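To make the expansion concrete, here is a minimal Python sketch of how one might compute a fully unfolded token count from a dependency graph. The graph, the token counts, and the function name are illustrative assumptions, not the paper's code or mathlib's actual structure.

```python
# Illustrative sketch: measuring how a statement "unfolds" through its
# dependency tree. The toy graph and token counts are invented.

def unfolded_size(node, deps, cache=None):
    """Total token count of `node` once every dependency is expanded.

    `deps` maps each name to (own_token_count, list_of_dependencies).
    Memoization keeps the computation linear in the graph, even though
    the unfolded tree itself can grow exponentially with depth.
    """
    if cache is None:
        cache = {}
    if node in cache:
        return cache[node]
    own_tokens, children = deps[node]
    cache[node] = own_tokens + sum(unfolded_size(c, deps, cache) for c in children)
    return cache[node]

# Each concept references earlier ones twice, so the unfolded size
# doubles at every layer of abstraction.
deps = {
    "nat":      (50, []),
    "int":      (40, ["nat", "nat"]),
    "rat":      (40, ["int", "int"]),
    "real":     (60, ["rat", "rat"]),
    "manifold": (80, ["real", "real"]),
}
print(unfolded_size("manifold", deps))  # 1480 tokens behind an 80-token surface
```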
The interview highlights several concrete examples. Freedman recalls a freshman differential‑equation lecture where a single symbol Ω represented a “sheaf of germs of sections of a vector bundle,” which implicitly contains layers of concepts such as natural numbers, integers, rational numbers, real numbers, vector spaces, and manifolds. He argues that mathematicians routinely operate several abstraction layers above the raw data, achieving compression that AI systems, which tend to enumerate possibilities, lack.
To formalize compression, the authors introduce two metrics. “Reductive compression” is the ratio of the unfolded length to the compressed length, indicating how much a statement has been abstracted. “Deductive compression” compares proof length to statement length, measuring how much mathematical work is packed into a theorem (e.g., Fermat’s Last Theorem can be stated succinctly but requires hundreds of pages of proof). These metrics can be computed locally on the proof graph and used by AI agents to navigate the “landscape” of mathematical reasoning.
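As a rough illustration of the two ratios (not code from the paper, and with made-up token counts), the metrics reduce to simple divisions once token counts are available:

```python
# Hypothetical sketch of the two metrics; all numbers are placeholders.

def reductive_compression(unfolded_tokens: int, stated_tokens: int) -> float:
    """How much abstraction the notation buys: unfolded length / stated length."""
    return unfolded_tokens / stated_tokens

def deductive_compression(proof_tokens: int, statement_tokens: int) -> float:
    """How much work hides behind a statement: proof length / statement length."""
    return proof_tokens / statement_tokens

# Fermat's Last Theorem, roughly: a one-line statement, a book-length proof.
print(f"{deductive_compression(proof_tokens=5_000_000, statement_tokens=60):,.0f}")
```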
Freedman also discusses the role of “macros” in a monoid model of mathematics. A monoid (a set with an associative binary operation and an identity, but not necessarily inverses) can be enriched with macros—high‑level abstractions such as “powers of ten” or the four‑square theorem—that dramatically increase compression. Empirically, the paper finds that monoids with polynomial growth are highly compressible, whereas those with exponential growth resist compression, leading to the conjecture that mathematical structures are fundamentally polynomial.
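To see how a single macro changes the picture, consider a toy example in the free monoid on one generator: a power macro, the analogue of place‑value notation, describes the element a^n in roughly log10(n) symbols instead of n. This is a sketch under that assumption, not an example taken from the paper.

```python
# Toy illustration: a "power" macro in the free monoid on one generator.

def plain_length(n: int) -> int:
    """Element written out as a string of generators: 'aaa...a'."""
    return n

def macro_length(n: int) -> int:
    """Same element written with a power macro: 'a^1000000'."""
    return len(f"a^{n}")

n = 10**6
print(plain_length(n), macro_length(n))    # 1000000 vs 9
print(plain_length(n) // macro_length(n))  # compression ratio, ~111,111x
```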
The interview touches on practical tools for identifying high‑centrality nodes in the proof graph, mentioning PageRank‑style algorithms and simpler indicators derived from the two compression metrics. Recognizing these nodes could enable new modes of human‑AI collaboration, allowing AI to focus on less compressible regions while humans supply the intuitive “mathematical taste” that guides the search.
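As a sketch of the PageRank‑style idea, the snippet below runs a standard power iteration over a small invented proof graph, where edges point from a result to the lemmas it invokes; neither the graph nor the scoring comes from the paper or from any mathlib tooling.

```python
# Standard PageRank via power iteration over a made-up proof graph.
# High scores flag nodes that many proofs route through.

def pagerank(edges, damping=0.85, iters=50):
    nodes = {n for edge in edges for n in edge}
    rank = {n: 1 / len(nodes) for n in nodes}
    out = {n: [b for a, b in edges if a == n] for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            targets = out[n] or list(nodes)  # dangling nodes spread evenly
            share = damping * rank[n] / len(targets)
            for t in targets:
                new[t] += share
        rank = new
    return rank

edges = [("thmA", "lemma1"), ("thmB", "lemma1"), ("thmC", "lemma2"),
         ("lemma1", "def_real"), ("lemma2", "def_real")]
print(max(pagerank(edges).items(), key=lambda kv: kv[1]))  # ('def_real', ...)
```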
Overall, Freedman’s work frames mathematics as a long history of compression—from ancient place‑value notation to modern formal libraries—and suggests that mastering this compression is essential for building AI systems that can meaningfully participate in mathematical discovery.