Tagged articles

video diffusion

9 articles

Jun 25, 2026 · Artificial Intelligence

No‑Training Camera Redirection: From One Monocular Video to Arbitrary Angles and Bullet‑Time

FreeOrbit4D achieves training‑free arbitrary camera redirection for a single monocular video by reconstructing a foreground‑complete 4D geometry, delivering stable large‑angle shots, beating baselines on VBench and user studies, and exposing an editable 4D point cloud for many downstream applications.

4D reconstruction · FreeOrbit4D · camera redirection

11 min read

Machine Heart

Jun 20, 2026 · Artificial Intelligence

CameraSquad: Precise Camera Control and Multi‑View Consistency for Spatially Intelligent Video Models

CameraSquad introduces a parallel multi‑trajectory video generation framework that delivers precise camera control and cross‑view content consistency, enabling high‑quality 3D point‑cloud reconstruction and superior performance on benchmarks such as WebVid and HumanVid compared with prior camera‑controlled video methods.

3D reconstruction · CameraSquad · camera-controlled video

14 min read

AIWalker

May 21, 2026 · Artificial Intelligence

AnyFlow: Generate High‑Quality Video in 4 Steps with Unlimited Sampling Improvement

AnyFlow introduces a flow‑map distillation framework that enables video diffusion models to produce high‑quality results in just four steps while continuously improving with additional sampling steps, supporting both causal and bidirectional architectures up to 14 B parameters and allowing downstream fine‑tuning.

AI video generation · any-step sampling · consistency distillation

13 min read

AIWalker

May 20, 2026 · Artificial Intelligence

AnyFlow: Generate High‑Quality Video in 4 Steps and Keep Improving with More Sampling

AnyFlow introduces a flow‑map distillation framework that enables video diffusion models to produce high‑quality results in just four sampling steps while still gaining quality as the number of steps increases, supporting both causal and bidirectional architectures and scaling up to 14 B parameters.

On‑Policy Distillation · bidirectional video · causal video

14 min read

Machine Learning Algorithms & Natural Language Processing

Mar 22, 2026 · Artificial Intelligence

NS-Diff: Adding a Physics Engine to Diffusion Models for Fluid and Rigid‑Body Dynamics

The CVPR 2026 paper introduces NS‑Diff, a physics‑guided video diffusion framework that combines a noise‑robust dynamics detector, a physical‑condition latent injection module, and reinforcement‑learning optimization to reduce jerk error by 43 % and fluid divergence by 33 %, achieving superior physical realism and visual quality across multiple benchmarks.

CVPR 2026 · NS‑Diff · Navier-Stokes

13 min read

Bilibili Tech

Feb 13, 2026 · Artificial Intelligence

Self-Forcing: Turning Global Video Diffusion into Causal Streaming for Long-Form Generation

This article examines the Wan2.1 video diffusion model, identifies its scalability bottlenecks for long and real‑time video generation, and introduces the Self‑Forcing causal framework together with sequence‑parallel and RoPE optimizations that achieve sub‑second latency and up to 1.5× speed‑up on modern GPUs.

GPU Optimization · causal inference · large video generation

14 min read

Amap Tech

May 8, 2025 · Artificial Intelligence

FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis

FantasyTalking generates high-fidelity, coherent talking portraits from a single static image by employing a two-stage audio-visual alignment—global segment-level motion and frame-level lip refinement—combined with face-centric cross-attention for identity preservation and a motion-intensity module that lets users control expression and body movement, achieving superior realism, synchronization, and performance over prior methods.

Deep Learning · audio-visual alignment · identity preservation

10 min read

NewBeeNLP

Mar 7, 2024 · Artificial Intelligence

How Sora is Redefining Large Vision Models: A Deep Dive into Technology, Limits, and Opportunities

This comprehensive review examines Sora, the first model capable of generating minute‑long, high‑quality videos from text, covering its historical background, core diffusion‑Transformer architecture, data preprocessing strategies, prompt engineering techniques, diverse applications, and the ethical and technical limitations that shape its future.

Multimodal AI · Prompt engineering · Sora

28 min read

Kuaishou Tech

Oct 31, 2023 · Artificial Intelligence

Kuaishou’s Nine Accepted Papers at ACM MM 2023: Summaries and Links

This article presents concise English summaries of nine Kuaishou research papers accepted at ACM MM 2023, covering topics such as no‑reference video quality assessment, adaptive video quality models, blind image super‑resolution, audio‑visual‑language transfer learning, motion‑aware video diffusion, large‑scale e‑commerce retrieval, and interactive segmentation.

AI · audio-visual language · e-commerce retrieval

18 min read
