VMBench: Perception-Aligned Motion Benchmark & LD‑RPS Zero‑Shot Restoration
This article introduces VMBench, the first perception‑aligned benchmark for video motion generation, which defines a five‑dimensional metric suite and a meta‑guided prompt generation pipeline. It also presents LD‑RPS, a zero‑shot unified image restoration framework based on latent diffusion recurrent posterior sampling, and summarizes the experiments that validate both systems.
Conference Overview
ICCV (International Conference on Computer Vision) is a top‑tier international conference in computer vision, with a reported acceptance rate of 24%. Five papers from the Amap (Gaode) technology team were accepted.
VMBench: Perception‑Aligned Video Motion Benchmark
VMBench is the first benchmark that aligns video motion quality evaluation with human perception. It builds a five‑dimensional Perception‑Aligned Motion Metrics (PMM) system (CAS, MSS, OIS, PAS, and TCS) covering six natural motion patterns, and it provides a large‑scale Meta‑Guided Motion Prompt Generation (MMPG) framework.
Research Background
Existing evaluation methods suffer from two main issues: (1) metrics are detached from human perception, failing to capture smoothness, physical plausibility, and object integrity; (2) prompt libraries are limited, restricting the assessment of diverse dynamic scenes.
Paper Highlights
Perception‑Aligned Metric Suite (PMM): includes Common‑sense Alignment Score (CAS), Motion Smoothness Score (MSS), Object Integrity Score (OIS), Perceptible Amplitude Score (PAS), and Temporal Consistency Score (TCS).
Meta‑Guided Motion Prompt Generation (MMPG): extracts (subject, place, action) triples from multiple video datasets, optimizes prompts with large language models, and validates them through human‑LLM collaboration, yielding 1,050 high‑quality prompts.
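To make the triple-based prompt construction concrete, here is a minimal sketch of the first step only: combining subject, place, and action into candidate prompts. The vocabulary lists and the `build_prompt` helper are hypothetical and are not taken from the paper. The LLM optimization and human‑LLM validation stages described above are not modeled.

```python
from itertools import product

# Hypothetical meta-information. In MMPG these elements are extracted
# from existing video datasets; the values below are illustrative only.
subjects = ["a dog", "a cyclist"]
places = ["on a beach", "in a city street"]
actions = ["runs", "turns sharply"]


def build_prompt(subject: str, place: str, action: str) -> str:
    """Compose one candidate prompt from a (subject, place, action) triple."""
    return f"{subject.capitalize()} {action} {place}."


# Enumerate every combination as a raw candidate. A real pipeline would
# then refine and filter these candidates (LLM rewriting, human review).
candidates = [
    build_prompt(s, p, a) for s, p, a in product(subjects, places, actions)
]

for prompt in candidates:
    print(prompt)
```

Enumerating combinations is only one plausible way to seed a prompt library. The benchmark reaches its 1,050 prompts through the refinement and validation steps, not through raw enumeration.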
Experimental Results
Human Perception Alignment: Spearman correlation analysis with 1,200 expert‑rated videos shows PMM outperforms rule‑based and multimodal large‑model baselines across all dimensions.
Ablation Studies: Removing any PMM component degrades overall accuracy, with CAS removal causing the largest drop, confirming its central role.
Qualitative Analysis: Evaluation of six state‑of‑the‑art video generation models reveals distinct strengths and weaknesses per metric, guiding future model improvements.
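The human‑alignment result rests on Spearman rank correlation between automatic metric scores and human ratings. The sketch below shows how such a correlation can be computed with the standard library alone. The score and rating values are made up for illustration and are not data from the paper, and the function names are our own.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative values only: one metric score and one human rating per video.
metric_scores = [0.91, 0.40, 0.75, 0.62, 0.18]
human_ratings = [5, 2, 4, 4, 1]

rho = spearman(metric_scores, human_ratings)
print(f"Spearman correlation: {rho:.3f}")
```

A value close to 1 means the metric orders videos almost the same way human raters do, which is the sense in which a metric is "perception‑aligned" here.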
LD‑RPS: Zero‑Shot Unified Image Restoration
LD‑RPS introduces a latent diffusion recurrent posterior sampling framework that restores images without any training data. It uses multimodal large language models to generate semantic prompts from degraded inputs and employs a Feature‑Pixel Alignment Module (F‑PAM) to align intermediate diffusion states with the degraded image, enabling unsupervised, zero‑shot restoration.
Key Contributions
Zero‑shot multimodal image restoration using only the degraded image as condition.
Unsupervised Feature‑Pixel Alignment Module (F‑PAM) to bridge latent‑space and pixel‑space gaps.
Recurrent posterior sampling strategy for progressive quality improvement.
Experimental Evaluation
Low‑Light Enhancement: LD‑RPS achieves the best results among posterior‑sampling methods and matches top single‑task approaches on the LOL‑v1 and LOL‑v2 datasets.
Image Dehazing: On the RESIDE HSTS subset, LD‑RPS surpasses all zero‑shot methods in PSNR.
Image Denoising: LD‑RPS consistently outperforms baselines across all metrics.
Image Colorization: LD‑RPS produces vivid, high‑contrast colorized images, whereas baseline methods retain gray tones.
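PSNR, the metric cited for the dehazing comparison, is defined from the mean squared error between a reference image and a restored one: PSNR = 10 · log10(MAX² / MSE). The sketch below applies that standard definition to flat lists of 8‑bit pixel values. The pixel values are illustrative, and the function is not the evaluation code used in the paper.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Illustrative 8-bit pixel values, flattened to one dimension.
reference_pixels = [52, 55, 61, 59]
restored_pixels = [54, 55, 60, 57]

score = psnr(reference_pixels, restored_pixels)
print(f"PSNR: {score:.2f} dB")
```

Higher PSNR means the restored image is numerically closer to the reference. It does not capture perceptual quality on its own, which is why restoration papers typically report it alongside other metrics.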
Conclusion and Outlook
VMBench provides a standardized, perception‑aligned evaluation framework for video motion generation, while LD‑RPS demonstrates a powerful zero‑shot approach for unified image restoration. Both contributions advance the field toward more human‑aligned generation and versatile, data‑free restoration techniques.
Amap Tech
Official Amap technology account showcasing all of Amap's technical innovations.
