Can AI Make Code Faster? Problem‑Oriented Optimization and Anchor Verification Breakthrough
A recent ICLR 2026 study from Zhejiang University, Ant Group, and Stony Brook University introduces a problem‑oriented dataset and an anchor‑verification framework that enable large language models not only to generate correct code but also to significantly improve its execution speed, achieving up to a six‑fold speedup while maintaining high correctness.
Background
Large language models can generate code from natural‑language prompts, but the generated programs often run slowly or become incorrect after naive optimization attempts. Existing code‑optimization datasets (e.g., PIE) follow a single programmer’s successive submissions, which leads to incremental, locally focused changes and makes it hard to achieve algorithmic breakthroughs such as replacing brute‑force enumeration with dynamic programming. Moreover, faster code may introduce functional regressions, a phenomenon termed the "Optimization Tax".
Problem‑Oriented Dataset
To break the cognitive inertia of a single author, the authors collect solutions from many programmers for the same programming problem and sort them by runtime, forming a genuine optimization trajectory (e.g., A1 → B1 → C1 → A2 …). This design provides three benefits:
Multi‑person wisdom: exposing the model to diverse algorithmic ideas reduces individual bias.
Data‑scale explosion: with ten contributors, the number of optimization pairs grows from dozens to hundreds.
True optimization learning: the model observes genuine algorithmic replacements rather than minor tweaks.
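The pairing scheme described above can be sketched in a few lines. The helper below is an illustrative reconstruction, not the paper's actual pipeline: solution records, author labels, and runtimes are hypothetical, and the minimum-speedup threshold is an assumed filter.

```python
from itertools import combinations

# Hypothetical solution records for one problem: (author label, file, runtime in ms).
# Values are illustrative, not drawn from the paper's dataset.
solutions = [
    ("A1", "brute_force.py", 950.0),
    ("B1", "memoized.py", 400.0),
    ("C1", "dp_bottom_up.py", 120.0),
    ("A2", "dp_optimized.py", 45.0),
]

def build_optimization_pairs(solutions, min_speedup=1.5):
    """Pair every slower solution with every sufficiently faster one,
    regardless of author, yielding (slow, fast) training examples."""
    ranked = sorted(solutions, key=lambda s: s[2], reverse=True)  # slowest first
    pairs = []
    for slow, fast in combinations(ranked, 2):
        if slow[2] / fast[2] >= min_speedup:
            pairs.append((slow[0], fast[0]))
    return pairs

pairs = build_optimization_pairs(solutions)
print(pairs)  # cross-author pairs such as ("A1", "C1") appear alongside same-author ones
```

Because every slower/faster combination across authors qualifies, four solutions already yield six pairs here, which is why the article notes that pair counts grow from dozens to hundreds as contributors are added.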
Anchor Verification Framework
To mitigate the Optimization Tax, a "slow but correct" reference implementation is used as an anchor. The framework consists of three steps:
Generate test inputs: the model understands the purpose of the slow code and creates a diverse set of inputs covering edge cases, exceptional inputs, and core logic.
Build a trustworthy test suite: run the slow implementation on the generated inputs to obtain correct outputs, yielding a fully reliable set of input‑output pairs.
Iterative optimization: evaluate the optimized (fast) code against this test suite; if errors appear, feed the failures back to the model for further refinement until the code is both fast and correct.
Because the slow code is known to be correct, using it as a verification anchor ensures that speed gains do not come at the cost of correctness.
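The three steps above can be sketched as a small verification harness. This is a minimal illustration under assumed names: the slow reference, fast candidate, and generated inputs below are toy stand-ins, not the paper's implementation.

```python
def slow_reference(xs):
    # Known-correct but O(n^2) anchor: maximum sum of two distinct elements.
    best = float("-inf")
    for i in range(len(xs)):
        for j in range(len(xs)):
            if i != j:
                best = max(best, xs[i] + xs[j])
    return best

def fast_candidate(xs):
    # Model-proposed O(n log n) optimization whose correctness must be verified.
    a = sorted(xs)
    return a[-1] + a[-2]

def build_test_suite(reference, inputs):
    """Step 2: run the trusted slow code to obtain ground-truth outputs."""
    return [(inp, reference(inp)) for inp in inputs]

def verify(candidate, suite):
    """Step 3: check the fast code against the anchor; failing cases
    become feedback for the model's next refinement round."""
    return [(inp, expected, candidate(inp))
            for inp, expected in suite
            if candidate(inp) != expected]

# Step 1: model-generated inputs covering edge cases (illustrative).
inputs = [[1, 2], [5, 5, 5], [-3, -1, -2], [10, -10, 0, 7]]
suite = build_test_suite(slow_reference, inputs)
failures = verify(fast_candidate, suite)
print(failures)  # an empty list means the fast code matches the anchor
```

In the paper's iterative setting, a non-empty failure list would be returned to the model as feedback, and the candidate would be regenerated until it passes the suite.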
Experimental Results
Problem‑Oriented Gains
On the Qwen2.5‑Coder 32B model, the problem‑oriented approach achieves nearly a 2× improvement over the user‑oriented baseline (BEST@1). Even with only 30 % of the original data, performance remains superior, demonstrating strong data efficiency.
Anchor Verification Gains
Combining anchor verification with a DeepSeek‑V3 backbone further boosts performance, yielding a 12.99 % increase in correctness and reaching 78.43 % optimization after five iterations, surpassing direct test‑generation methods.
Conclusion
New perspective: shifting from user‑oriented to problem‑oriented data breaks cognitive inertia and enables true algorithmic optimization.
New framework: anchor verification leverages slow, correct code as a reliable test oracle, effectively mitigating the Optimization Tax.
Empirical gains: experiments show a 71.06 % optimization rate, 6.08× speedup, and 74.54 % correctness, marking a substantial step forward for AI‑driven code optimization.
While achieving 100 % correctness remains an open challenge, the proposed methods significantly advance the capability of large language models to produce faster, reliable code.
Paper link: https://arxiv.org/abs/2406.11935
