How LoongFlow Enables Expert‑Level AI Agents to Outperform Human Mathematicians

LoongFlow is an open‑source AI agent framework that combines a Plan‑Execute‑Summarize (PES) paradigm with a Hybrid Evolutionary Memory system, allowing agents to perform directed, iterative problem solving and achieve state‑of‑the‑art results on mathematical challenges, Kaggle‑style benchmarks, and real‑world tasks with dramatically higher efficiency.

Baidu Intelligent Cloud Tech Hub

Jan 20, 2026

Overview

LoongFlow is an open‑source framework for building AI agents that perform expert‑level reasoning. It introduces a systematic “Plan‑Execute‑Summarize” (PES) cycle and a Hybrid Evolutionary Memory to guide directed evolution of solution populations.

Key Components

PES paradigm. Each iteration consists of three steps:

Plan: Analyze the current solution pool, retrieve relevant experience from a strategic knowledge base, and generate a concrete evolution plan.

Execute: Dynamically select tools (logical verifier, code interpreter, data‑query generator, etc.) and perform fast local validation of candidate solutions.

Summarize: Compare execution outcomes with the plan, extract causal insights, and store them back into the memory.
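
The three steps above can be sketched as a small loop. This is a toy illustration only, not LoongFlow's API: the names Memory, plan, execute, summarize and run_pes are hypothetical, the "solution" is a single float, and a random mutation stands in for the LLM-driven planning and tool selection the framework describes.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Memory:
    """Hypothetical stand-in for the experience store."""
    solutions: list = field(default_factory=list)  # (score, candidate) pairs
    insights: list = field(default_factory=list)   # "gain" / "no_gain" notes


def plan(memory):
    # Plan: inspect the pool and recent insights, then fix a concrete step size.
    best_score, best = max(memory.solutions)
    recent_misses = sum(1 for note in memory.insights[-5:] if note == "no_gain")
    return {"parent": best, "parent_score": best_score,
            "step": 1.0 / (1 + recent_misses)}


def execute(plan_, evaluate, rng):
    # Execute: build one candidate and validate it locally with the evaluator.
    child = plan_["parent"] + rng.uniform(-plan_["step"], plan_["step"])
    return child, evaluate(child)


def summarize(plan_, child, score, memory):
    # Summarize: compare the outcome with the plan and write the lesson back.
    improved = score > plan_["parent_score"]
    memory.insights.append("gain" if improved else "no_gain")
    if improved:
        memory.solutions.append((score, child))
    return improved


def run_pes(evaluate, start, iterations=200, seed=0):
    rng = random.Random(seed)
    memory = Memory(solutions=[(evaluate(start), start)])
    for _ in range(iterations):
        current_plan = plan(memory)
        child, score = execute(current_plan, evaluate, rng)
        summarize(current_plan, child, score, memory)
    return max(memory.solutions), memory


if __name__ == "__main__":
    (best_score, best), _ = run_pes(lambda x: -(x - 3.0) ** 2, start=0.0)
    print(best, best_score)
```

The point of the sketch is the feedback path: what Summarize records changes what Plan proposes next, which is the difference between directed and purely random search.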

Hybrid Evolutionary Memory. A multi‑island experience repository that archives solutions with rich metadata, supports MAP‑Elites archiving, and uses adaptive Boltzmann selection to balance exploration and exploitation.
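
To make the two mechanisms named above concrete, here is a minimal sketch of a MAP‑Elites‑style archive with Boltzmann (softmax) parent selection. The class and function names are hypothetical, and the temperature schedule in adapt_temperature is an assumption chosen for illustration; the article does not specify how LoongFlow adapts it or how islands exchange solutions.

```python
import math
import random


def boltzmann_pick(scored, temperature, rng):
    """Pick one (score, item) pair with probability proportional to exp(score / T)."""
    top = max(score for score, _ in scored)
    # Subtracting the max avoids overflow in exp() without changing the probabilities.
    weights = [math.exp((score - top) / temperature) for score, _ in scored]
    return rng.choices(scored, weights=weights, k=1)[0]


class EliteArchive:
    """MAP-Elites-style grid: keeps only the best solution per behavior cell."""

    def __init__(self):
        self.cells = {}  # cell descriptor -> (score, solution)

    def add(self, cell, score, solution):
        current = self.cells.get(cell)
        if current is None or score > current[0]:
            self.cells[cell] = (score, solution)
            return True
        return False

    def sample(self, temperature, rng):
        return boltzmann_pick(list(self.cells.values()), temperature, rng)


def adapt_temperature(temperature, improved, lo=0.05, hi=5.0):
    # Assumed schedule: cool after an improvement (exploit), heat after a miss (explore).
    factor = 0.9 if improved else 1.1
    return min(hi, max(lo, temperature * factor))


if __name__ == "__main__":
    rng = random.Random(0)
    islands = [EliteArchive() for _ in range(3)]  # one archive per island
    islands[0].add(("short", "fast"), 0.4, "solution-a")
    islands[0].add(("long", "fast"), 0.9, "solution-b")
    print(islands[0].sample(temperature=0.5, rng=rng))
```

A high temperature makes the pick close to uniform across cells, while a low temperature concentrates it on the top scorers, which is how a single parameter trades exploration against exploitation.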

Efficiency Mechanisms

The structured PES cycle replaces random search with directed exploration. The reported effect is roughly 60% fewer wasted evaluations and an iteration success rate close to 100%.

Benchmark Results

Mathematical challenges: On 11 problems from the Terence Tao/AlphaEvolve benchmark, LoongFlow agents surpassed the best known human results; on 7 problems they outperformed Google's AlphaEvolve, setting a new state of the art.

MLE‑bench (Kaggle‑style): A machine‑learning agent built with LoongFlow earned 23 gold medals across tasks such as pathology cancer detection and volcanic eruption prediction.

Evolution efficiency: Compared with OpenEvolve and ShinkaEvolve, LoongFlow improved efficiency by more than 60% while maintaining a 100% iteration success rate.

Example Application

In the “circle‑packing” problem (arranging non‑overlapping circles to maximize coverage within a shape), LoongFlow discovered arrangements that were more compact than those found by human mathematicians after years of research and by the AlphaEvolve system.
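
The "fast local validation" step for a problem like this can be as simple as a geometric checker. The sketch below is an illustration under stated assumptions, not the evaluator LoongFlow uses: it assumes the container is the unit square, represents each circle as (x, y, r), and reports both the sum of radii and the covered area, since the article does not say which score is optimized. The function name check_packing is hypothetical.

```python
import math


def check_packing(circles, tol=1e-9):
    """Return (is_valid, sum_of_radii, covered_area) for (x, y, r) circles in [0, 1]^2."""
    # Every circle must have a positive radius and lie inside the unit square.
    for x, y, r in circles:
        if r <= 0 or x - r < -tol or x + r > 1 + tol or y - r < -tol or y + r > 1 + tol:
            return False, 0.0, 0.0
    # No two circles may overlap; touching (distance == r1 + r2) is allowed.
    for i in range(len(circles)):
        xi, yi, ri = circles[i]
        for xj, yj, rj in circles[i + 1:]:
            if math.hypot(xi - xj, yi - yj) < ri + rj - tol:
                return False, 0.0, 0.0
    sum_of_radii = sum(r for _, _, r in circles)
    covered_area = sum(math.pi * r * r for _, _, r in circles)
    return True, sum_of_radii, covered_area


if __name__ == "__main__":
    grid = [(0.25, 0.25, 0.25), (0.75, 0.25, 0.25),
            (0.25, 0.75, 0.25), (0.75, 0.75, 0.25)]
    print(check_packing(grid))
```

A checker like this is cheap enough to run on every candidate, so invalid arrangements are rejected before any expensive evaluation is spent on them.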

Open‑Source Release

The source code, documentation, and example agents are available at https://github.com/baidu-baige/LoongFlow. A detailed technical report describing the design can be accessed at https://arxiv.org/abs/2512.24077.
