Keeping Image Quality with Only 20 Diffusion Steps: The TC‑Padé Acceleration Method

TC‑Padé uses a Padé‑based residual prediction framework, step‑aware strategies, and a trajectory‑stability indicator to accelerate diffusion sampling to as few as 20 steps while preserving visual fidelity, achieving up to 2.88× speed‑up on image generation and 1.72× on video generation.

Machine Heart

Research Background

Diffusion models have become the dominant approach for multimodal generation, yet their inference cost—requiring dozens to hundreds of denoising steps—remains a bottleneck for industrial deployment. In practice, budgets are limited to 20–30 steps, where existing acceleration techniques (feature caching or polynomial extrapolation) often cause texture artifacts, color drift, or trajectory deviation, especially for safety‑critical data synthesis.

Proposed Method: TC‑Padé

The authors introduce TC‑Padé (Trajectory‑Consistent Padé Approximation), a training‑free, plug‑and‑play acceleration framework built on three innovations:

Residual-based Padé prediction: Instead of predicting raw features, TC-Padé predicts the residual between adjacent layer representations, a quantity that varies more smoothly over time. Padé approximation replaces the Taylor series with a rational function (a ratio of numerator and denominator polynomials) that better captures nonlinear dynamics across large time gaps. The residual is defined as

[Equation image: residual definition]

and the Padé predictor is expressed as

[Equation image: Padé predictor form]
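The paper's residual and Padé formulas appear only as images in the original, so as a hedged illustration of the general idea, the sketch below fits a [1/1] Padé approximant (a ratio of two linear polynomials) through three cached scalar residual samples and extrapolates it to a later timestep. The function name, the [1/1] order, and the use of scalars are assumptions for illustration, not the paper's actual predictor.

```python
def pade_1_1_extrapolate(ts, rs, t_new):
    """Fit r(t) ~ (a0 + a1*t) / (1 + b1*t) through three (t, r) samples
    and evaluate it at t_new.  Multiplying out, r*(1 + b1*t) = a0 + a1*t
    linearizes to a0 + a1*t_i - b1*(t_i*r_i) = r_i, a 3x3 linear system.
    NOTE: illustrative sketch only, not the TC-Pade implementation."""
    # Build the system A @ [a0, a1, b1] = rs
    A = [[1.0, t, -t * r] for t, r in zip(ts, rs)]
    b = list(rs)

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
              - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
              + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det3(A)
    coeffs = []
    for col in range(3):  # Cramer's rule, column by column
        Ac = [row[:] for row in A]
        for i in range(3):
            Ac[i][col] = b[i]
        coeffs.append(det3(Ac) / d)
    a0, a1, b1 = coeffs
    return (a0 + a1 * t_new) / (1.0 + b1 * t_new)
```

Because the predictor is rational rather than polynomial, it can stay bounded where a same-order Taylor extrapolation would diverge over a large step gap, which is the property the paper exploits.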

Step-aware prediction strategy: TC-Padé adapts its residual update rule to the denoising stage. Early steps (high noise) avoid aggressive extrapolation, middle steps exploit Padé's stability, and late steps focus on fine-grained detail, thereby matching the varying dynamics of diffusion.
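As a purely illustrative reading of this policy, the sketch below maps a step index to one of three regimes. The 20%/80% boundaries and the regime names are assumptions; the paper does not state its stage thresholds in this summary.

```python
def step_aware_rule(step, total_steps):
    """Choose an update regime from the position in the denoising schedule.
    Thresholds (0.2 / 0.8) are illustrative assumptions, not the paper's
    values."""
    frac = step / total_steps
    if frac < 0.2:
        return "full_compute"        # early, high-noise: no aggressive extrapolation
    elif frac < 0.8:
        return "pade_extrapolate"    # middle: rational extrapolation is stable here
    else:
        return "conservative_predict"  # late: protect fine-grained detail
```

In practice such a rule would sit inside the sampler loop, selecting between a full model forward pass and a cheap residual prediction at each step.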

Trajectory-Stability Indicator (TSI): A metric computed from the predicted trajectory decides whether a step can be skipped (using the prediction) or must be fully computed. When the trajectory is smooth, computation is omitted; otherwise, full denoising is performed, ensuring quality while maximizing speed. The indicator is defined as

[Equation image: TSI definition]
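The TSI itself survives only as an image, but the skip-or-compute decision it drives can be sketched as below. The relative-change metric used here is an assumption standing in for the paper's formula; the default θ=0.7 mirrors the fastest setting reported in the ablations.

```python
import math

def should_skip(pred_residual, last_residual, theta=0.7):
    """Skip the full forward pass only when the predicted residual stays
    close to the last computed one.  The relative-change metric is an
    illustrative stand-in for the paper's TSI formula."""
    num = sum((p - q) ** 2 for p, q in zip(pred_residual, last_residual))
    den = sum(q ** 2 for q in last_residual) + 1e-8  # avoid division by zero
    tsi = math.sqrt(num / den)   # relative change of the residual
    return tsi < theta           # smooth trajectory -> reuse the prediction
```

A smaller θ makes skipping more frequent only when the trajectory is very smooth, trading speed for safety; the ablation below explores this trade-off.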

Experimental Results

Image generation (FLUX.1-dev): TC-Padé (fast) achieves 2.88× inference acceleration while maintaining FID, CLIP Score, PSNR, SSIM, and LPIPS comparable to the original 20-step baseline and superior to existing cache-based methods.

Video generation (Wan2.1-1.3B): The method yields 1.72× speed-up and 1.74× FLOPs reduction, with only a minor drop in VBench-2.0 score and clear gains in PSNR, SSIM, and LPIPS over Taylor-based predictors.

Class-conditional image generation (DiT-XL/2, ImageNet 256×256): TC-Padé provides 1.46× latency reduction and 1.64× FLOPs saving, achieving lower FID and a better precision-recall balance than competing cache accelerators.

Ablation Studies

Residual cache granularity: whole‑block caching outperforms double‑stream and single‑stream variants, offering the best quality‑speed trade‑off.

Stability threshold θ: θ=0.7 gives the highest 2.88× speed‑up; θ=1.0 yields a more balanced quality‑efficiency profile.

Compatibility with quantization: combining TC‑Padé with quantization further reduces latency, demonstrating practical deployment potential.

Conclusion

TC‑Padé addresses the core challenge of “fast but unstable” low‑step diffusion inference by integrating Padé‑based residual prediction, step‑aware updates, and an adaptive stability check. It delivers substantial inference acceleration without training overhead and preserves high‑fidelity generation across image, video, and class‑conditional tasks, offering a realistic solution for latency‑sensitive industrial applications.

Tags: Inference Acceleration, Image Generation, low-step sampling, Padé approximation, TC-Padé