Can AI Make Code Faster? Problem‑Oriented Optimization and Anchor Verification Breakthrough

A recent ICLR 2026 study from Zhejiang University, Ant Group, and Stony Brook introduces a problem‑oriented dataset and an anchor‑verification framework that enable large language models to not only generate correct code but also significantly improve its execution speed, achieving up to six‑fold acceleration while maintaining high correctness.

PaperAgent

Background

Large language models can generate code from natural‑language prompts, but the generated programs often run slowly or become incorrect after naive optimization attempts. Existing code‑optimization datasets (e.g., PIE) follow a single programmer’s successive submissions, which leads to incremental, locally focused changes and makes it hard to achieve algorithmic breakthroughs such as replacing brute‑force enumeration with dynamic programming. Moreover, faster code may introduce functional regressions, a phenomenon termed the "Optimization Tax".
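To make that gap concrete, here is an illustrative toy example (not from the paper) of the kind of algorithmic replacement in question: brute‑force subset enumeration rewritten as dynamic programming, turning exponential work into polynomial work.

```python
# Toy illustration of "algorithmic breakthrough" optimization: count the
# subsets of `nums` (positive integers) that sum to `target`.

def count_subsets_brute(nums, target):
    """Brute-force enumeration: explores all 2^n include/exclude choices."""
    def go(i, remaining):
        if remaining == 0:
            return 1  # skipping every remaining item yields one valid subset
        if i == len(nums) or remaining < 0:
            return 0
        # Either include nums[i] or skip it.
        return go(i + 1, remaining - nums[i]) + go(i + 1, remaining)
    return go(0, target)

def count_subsets_dp(nums, target):
    """Dynamic programming: O(n * target) time instead of O(2^n)."""
    dp = [1] + [0] * target  # dp[s] = number of subsets summing to s
    for x in nums:
        for s in range(target, x - 1, -1):  # reverse: each item used once
            dp[s] += dp[s - x]
    return dp[target]
```

Both functions return identical counts; only the asymptotic cost differs, which is exactly the kind of rewrite that incremental, single-author edits rarely produce.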

Problem‑Oriented Dataset

To break the cognitive inertia of a single author, the authors collect solutions from many programmers for the same programming problem and sort them by runtime, forming a genuine optimization trajectory (e.g., A1 → B1 → C1 → A2 …). This design provides three benefits:

Multi‑person wisdom: exposing the model to diverse algorithmic ideas reduces individual bias.

Data‑scale explosion: with ten contributors, the number of optimization pairs grows from dozens to hundreds.

True optimization learning: the model observes genuine algorithmic replacements rather than minor tweaks.
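The construction above can be sketched as follows. This is a minimal illustration of the idea, not the paper's pipeline; the submission fields and the speedup threshold are assumptions.

```python
# Sketch: build problem-oriented optimization pairs by pooling many authors'
# solutions to one problem, sorting by runtime, and pairing slower code with
# sufficiently faster code. Field names ('author', 'code', 'runtime_ms') and
# the min_speedup cutoff are hypothetical.

def build_optimization_pairs(submissions, min_speedup=1.5):
    """submissions: list of dicts with 'author', 'code', 'runtime_ms' keys."""
    # Slowest first, so each later entry is a candidate "optimized" target.
    trajectory = sorted(submissions, key=lambda s: s["runtime_ms"], reverse=True)
    pairs = []
    for i, slow in enumerate(trajectory):
        for fast in trajectory[i + 1:]:
            if slow["runtime_ms"] / fast["runtime_ms"] >= min_speedup:
                pairs.append((slow["code"], fast["code"]))
    return pairs
```

Because every sufficiently faster solution can pair with every slower one, the number of pairs grows roughly quadratically in the number of contributors, which is the "data-scale explosion" noted above.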

Anchor Verification Framework

To mitigate the Optimization Tax, a "slow but correct" reference implementation is used as an anchor. The framework consists of three steps:

Generate test inputs: the model analyzes the intent of the slow code and creates a diverse set of inputs covering edge cases, exceptional inputs, and core logic.

Build a trustworthy test suite: run the slow implementation on the generated inputs to obtain correct outputs, yielding a 100 % reliable set of input‑output pairs.

Iterative optimization: evaluate the optimized (fast) code against this test suite; if errors appear, feed the failures back to the model for further refinement until the code is both fast and correct.

Because the slow code is known to be correct, using it as a verification anchor ensures that speed gains do not sacrifice correctness.
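The three steps can be sketched as a loop. This is a minimal sketch, not the paper's implementation; `llm_generate_inputs`, `llm_optimize`, and `run` are hypothetical stand-ins for the model calls and a code executor.

```python
# Sketch of the anchor-verification loop: the slow-but-correct code serves as
# the test oracle, and the model iterates until the fast code passes.

def anchor_verify(slow_code, run, llm_generate_inputs, llm_optimize, max_iters=5):
    # Step 1: the model proposes diverse test inputs from the slow reference.
    inputs = llm_generate_inputs(slow_code)
    # Step 2: the slow code's outputs become the trusted expected outputs.
    suite = [(x, run(slow_code, x)) for x in inputs]
    # Step 3: optimize, check against the suite, and feed failures back.
    feedback = None
    for _ in range(max_iters):
        fast_code = llm_optimize(slow_code, feedback)
        failures = [(x, expected, run(fast_code, x))
                    for x, expected in suite
                    if run(fast_code, x) != expected]
        if not failures:
            return fast_code  # fast and verified correct on the suite
        feedback = failures   # mismatches guide the next refinement
    return None  # could not verify within the iteration budget
```

In this sketch the model may produce a wrong optimization first; the recorded failures (input, expected, actual) drive the next attempt, mirroring the iterative refinement described above.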

Experimental Results

Problem‑Oriented Gains

On the Qwen2.5‑Coder 32B model, the problem‑oriented approach achieves nearly a 2× improvement over the user‑oriented baseline (BEST@1). Even with only 30 % of the original data, performance remains superior, demonstrating strong data efficiency.

Figure: Problem‑oriented performance comparison.

Anchor Verification Gains

Combining anchor verification with a DeepSeek‑V3 backbone further boosts performance, yielding a 12.99 % increase in correctness and reaching 78.43 % optimization after five iterations, surpassing direct test‑generation methods.

Figure: Anchor-verification performance.

Conclusion

New perspective : shifting from user‑oriented to problem‑oriented data breaks cognitive inertia and enables true algorithmic optimization.

New framework : anchor verification leverages slow, correct code as a reliable test oracle, effectively mitigating the Optimization Tax.

Empirical gains : experiments show a 71.06 % optimization rate, 6.08× speedup, and 74.54 % correctness, marking a substantial step forward for AI‑driven code optimization.

While achieving 100 % correctness remains an open challenge, the proposed methods significantly advance the capability of large language models to produce faster, reliable code.

Paper link: https://arxiv.org/abs/2406.11935
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
