How LoongFlow Enables Expert‑Level AI Agents to Outperform Human Mathematicians
LoongFlow is an open‑source AI agent framework that combines a Plan‑Execute‑Summarize (PES) paradigm with a Hybrid Evolutionary Memory system. Together these let agents solve problems through directed, iterative search and reach state‑of‑the‑art results on mathematical challenges, Kaggle‑style benchmarks, and real‑world tasks, with reported efficiency gains of more than 60 % over comparable evolutionary frameworks.
Overview
LoongFlow is an open‑source framework for building AI agents that perform expert‑level reasoning. It introduces a systematic “Plan‑Execute‑Summarize” (PES) cycle and a Hybrid Evolutionary Memory to guide directed evolution of solution populations.
Key Components
PES paradigm: Each iteration consists of three stages:
Plan: Analyze the current solution pool, retrieve relevant experience from a strategic knowledge base, and generate a concrete evolution plan.
Execute: Dynamically select tools (logical verifier, code interpreter, data‑query generator, etc.) and perform fast local validation of candidate solutions.
Summarize: Compare execution outcomes with the plan, extract causal insights, and store them back into the memory.
Hybrid Evolutionary Memory: A multi‑island experience repository that archives solutions with rich metadata, supports MAP‑Elites archiving, and uses adaptive Boltzmann selection to balance exploration and exploitation.
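To make the memory's selection mechanics more concrete, here is a minimal Python sketch of Boltzmann (softmax) parent selection alongside a MAP‑Elites style archive. It is not taken from the LoongFlow codebase: the names `boltzmann_probabilities`, `select_parent`, `adapt_temperature`, and `MapElitesArchive` are hypothetical, and the temperature update rule (cool after an improvement, reheat after a stall) is only one plausible reading of "adaptive".

```python
import math
import random


def boltzmann_probabilities(scores, temperature):
    """Softmax over scores; a higher temperature flattens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    best = max(scores)  # subtract the max for numerical stability
    weights = [math.exp((s - best) / temperature) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def select_parent(solutions, scores, temperature, rng=random):
    """Sample one solution; better scores are more likely, never guaranteed."""
    probs = boltzmann_probabilities(scores, temperature)
    return rng.choices(solutions, weights=probs, k=1)[0]


def adapt_temperature(temperature, improved, cool=0.9, heat=1.1,
                      t_min=0.05, t_max=5.0):
    """Assumed rule: exploit more after progress, explore more after a stall."""
    factor = cool if improved else heat
    return min(t_max, max(t_min, temperature * factor))


class MapElitesArchive:
    """Keeps only the best solution seen in each behavior-descriptor cell."""

    def __init__(self):
        self.cells = {}  # descriptor -> (solution, score)

    def add(self, descriptor, solution, score):
        current = self.cells.get(descriptor)
        if current is None or score > current[1]:
            self.cells[descriptor] = (solution, score)
            return True  # new elite for this cell
        return False
```

With a low temperature, selection concentrates on the highest‑scoring entries (exploitation); with a high temperature it approaches uniform sampling (exploration). The archive preserves diversity by keeping one elite per cell rather than a single global best.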
Efficiency Mechanisms
The structured PES cycle turns random search into directed exploration, reducing wasted evaluations by roughly 60 % and achieving near‑certain convergence (iteration success rate ≈ 100 %).
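The difference between random and directed search can be illustrated with a toy PES loop. The sketch below is an assumption‑laden simplification rather than LoongFlow's implementation: the objective `score`, the `Memory` class, and the functions `plan`, `execute`, `summarize`, and `run_pes` are all hypothetical, and the "insight" stored in memory is just the last search direction and whether it helped.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    # Each insight is (direction, gain) from one past iteration.
    insights: list = field(default_factory=list)


def score(x):
    """Toy objective with its maximum at x = 3."""
    return -(x - 3.0) ** 2


def plan(pool, memory, step):
    """Pick the best parent and reuse the last direction only if it helped."""
    parent = max(pool, key=score)
    direction = 1.0
    if memory.insights:
        last_dir, last_gain = memory.insights[-1]
        direction = last_dir if last_gain > 0 else -last_dir
    return {"parent": parent, "delta": direction * step}


def execute(p):
    """Build the candidate and validate it locally by scoring it."""
    candidate = p["parent"] + p["delta"]
    return candidate, score(candidate)


def summarize(p, candidate_score, memory):
    """Compare the outcome with the plan and store the insight."""
    gain = candidate_score - score(p["parent"])
    direction = 1.0 if p["delta"] > 0 else -1.0
    memory.insights.append((direction, gain))
    return gain


def run_pes(start=0.0, iterations=30, step=1.0):
    pool, memory = [start], Memory()
    for _ in range(iterations):
        p = plan(pool, memory, step)
        candidate, s = execute(p)
        gain = summarize(p, s, memory)
        if gain > 0:
            pool.append(candidate)  # keep improvements in the pool
        else:
            step *= 0.5  # failed plan: search more finely next time
    return max(pool, key=score)
```

Because each plan is conditioned on what the previous iteration learned, failed steps change the next plan instead of being repeated, which is the intuition behind fewer wasted evaluations.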
Benchmark Results
Mathematical challenges: On 11 problems from the Terence Tao (Tao Zhexuan)/AlphaEvolve benchmark, LoongFlow agents surpassed the best known human results; on 7 problems they also outperformed Google AlphaEvolve, establishing a new state of the art.
MLE‑bench (Kaggle‑style): A machine‑learning agent built with LoongFlow earned 23 gold medals across tasks such as pathology cancer detection and volcanic eruption prediction.
Evolution efficiency: Compared with OpenEvolve and ShinkaEvolve, LoongFlow improved efficiency by more than 60 % while maintaining a 100 % iteration success rate.
Example Application
In the “circle‑packing” problem (arranging non‑overlapping circles to maximize coverage within a shape), LoongFlow discovered arrangements that were more compact than those found by human mathematicians after years of research and by the AlphaEvolve system.
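The "fast local validation" that such a search depends on is simple to sketch for circle packing. The Python below assumes the container is the unit square and scores a packing by covered area, following the description above; the actual benchmark may use a different container or objective. The function names `is_valid_packing` and `covered_fraction` are hypothetical.

```python
import math
from itertools import combinations


def is_valid_packing(circles, tol=1e-9):
    """circles is a list of (x, y, r) tuples.

    Returns True if every circle lies inside the unit square and no two
    circles overlap (touching is allowed, up to the tolerance).
    """
    for x, y, r in circles:
        if r <= 0:
            return False
        if x - r < -tol or x + r > 1 + tol:
            return False
        if y - r < -tol or y + r > 1 + tol:
            return False
    for (x1, y1, r1), (x2, y2, r2) in combinations(circles, 2):
        # Two circles overlap when their centers are closer than r1 + r2.
        if math.hypot(x1 - x2, y1 - y2) < r1 + r2 - tol:
            return False
    return True


def covered_fraction(circles):
    """Fraction of the unit square covered; meaningful only if valid."""
    return sum(math.pi * r * r for _, _, r in circles)


# Four equal circles in a 2 x 2 grid, each touching its neighbors.
grid = [(0.25, 0.25, 0.25), (0.75, 0.25, 0.25),
        (0.25, 0.75, 0.25), (0.75, 0.75, 0.25)]
```

A check like this costs O(n²) in the number of circles, so an agent can reject invalid candidates locally before spending an expensive evaluation on them.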
Open‑Source Release
The source code, documentation, and example agents are available at https://github.com/baidu-baige/LoongFlow. A detailed technical report describing the design can be accessed at https://arxiv.org/abs/2512.24077.
Baidu Intelligent Cloud Tech Hub
