MeanCache Sets New Multi‑Modal Generation Inference Speed Benchmark at ICLR 2026

MeanCache introduces an average‑velocity‑driven caching framework that uses Jacobian‑vector‑product correction and a multigraph‑based scheduling algorithm to achieve over 4× speedup on state‑of‑the‑art multimodal diffusion models while preserving image fidelity and semantic consistency.

Machine Heart

Industrial‑scale multimodal generation models such as FLUX and Qwen‑Image suffer from slow inference, and traditional feature‑caching methods often cause trajectory drift due to abrupt fluctuations in instantaneous velocity.

Building on the earlier LeMiCa work, the Unicom AI research team and Nanjing University propose MeanCache, a lightweight, training‑free flow‑matching acceleration framework. The key innovation is shifting the caching perspective from instantaneous velocity to average velocity: MeanCache captures Jacobian‑vector‑product (JVP) information from the previous timestep and uses a derived anchor identity to precisely correct the current instantaneous velocity, thereby stabilizing the generation trajectory.
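The announcement does not spell out the anchor identity, so the following is only a minimal numerical sketch of the idea, assuming a MeanFlow‑style first‑order relation between instantaneous and average velocity. The toy `velocity` field, the `jvp_time` helper, and the exact correction coefficient are illustrative, not the paper's formulation:

```python
def velocity(x, t):
    # toy instantaneous velocity field, standing in for the diffusion
    # model's velocity prediction v(x, t)
    return -x * t

def jvp_time(x, t, eps=1e-5):
    # Jacobian-vector product of v along the time direction, here taken
    # numerically via central differences; MeanCache instead reuses JVP
    # information already available from the previous timestep
    return (velocity(x, t + eps) - velocity(x, t - eps)) / (2 * eps)

def average_velocity(x, t, r):
    # illustrative first-order "anchor" correction relating instantaneous
    # to average velocity over [r, t]: u ≈ v(x, t) - (t - r)/2 * dv/dt
    return velocity(x, t) - 0.5 * (t - r) * jvp_time(x, t)
```

For a velocity field linear in time, as in this toy example, the first‑order correction recovers the exact time average of v over [r, t]; the point of reusing the previous timestep's JVP is that the correction comes essentially for free, without an extra model evaluation.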

The framework models the inference process as a multigraph where each timestep is a node and the bias between predicted average velocity and ground truth defines edge weights. A Peak‑Suppressed Shortest Path algorithm computes the optimal caching policy under a given compute budget, determining at which timesteps to recompute and which to serve from cache.
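The scheduling step can be sketched as a shortest‑path dynamic program over timesteps, where an edge i→j means "run the full model at step i and reuse its cached average velocity through step j−1". The cost matrix, budget semantics, and plain‑sum objective below are illustrative assumptions; the paper's Peak‑Suppressed variant additionally penalizes large individual edge errors rather than only their sum:

```python
def optimal_cache_schedule(bias, T, budget):
    # bias[i][j]: estimated trajectory drift if steps i..j-1 reuse the
    # cached average velocity computed at step i (illustrative cost model)
    # budget: number of full model evaluations allowed
    INF = float("inf")
    # dp[j][b]: minimal accumulated drift reaching step j with b evaluations
    dp = [[INF] * (budget + 1) for _ in range(T + 1)]
    parent = {}
    dp[0][0] = 0.0
    for i in range(T):
        for b in range(budget):
            if dp[i][b] == INF:
                continue
            for j in range(i + 1, T + 1):  # compute at i, reuse through j-1
                cost = dp[i][b] + bias[i][j]
                if cost < dp[j][b + 1]:
                    dp[j][b + 1] = cost
                    parent[(j, b + 1)] = (i, b)
    # pick the cheapest way to reach the final step, then backtrack the
    # timesteps at which the full model must actually run
    b = min(range(budget + 1), key=lambda k: dp[T][k])
    node, steps = (T, b), []
    while node in parent:
        prev = parent[node]
        steps.append(prev[0])
        node = prev
    return sorted(steps), dp[T][b]
```

On a toy instance with T = 4 steps, a budget of 2 evaluations, and a drift cost that grows quadratically with segment length, the program splits the trajectory into two equal reuse segments, matching the intuition that long cache reuse spans are what the scheduler must avoid.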

Experimental results show that MeanCache delivers up to 4× acceleration on commercial‑grade text‑to‑image models Qwen‑Image and FLUX.1 [dev] while achieving state‑of‑the‑art scores on Image Reward and perception metrics. On the video generation model HunyuanVideo, it achieves a 3.6× speedup with improved quality. Qualitative analysis indicates better content consistency as acceleration increases, and the method shows stronger semantic robustness on rare‑word prompts such as “Peristeronic”.

The paper, code, and project page are all publicly available (arXiv:2601.19961, https://github.com/UnicomAI/MeanCache). MeanCache has also been endorsed by the Z‑Image and Qwen‑Image‑2512 teams and integrated into the ComfyUI ecosystem.

In summary, MeanCache provides a novel average‑velocity caching paradigm and a stable scheduling strategy that significantly speeds up diffusion‑based multimodal generation without sacrificing fidelity, offering a practical path toward reducing compute costs for industrial applications.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

ICLR 2026 · Average Velocity Caching · Diffusion Model Acceleration · JVP Correction · MeanCache · Multigraph Scheduling
Written by

Machine Heart

Professional AI media and industry service platform
