Tag

video generation

Posts collected around this technical thread.

AntTech
Jun 15, 2025 · Artificial Intelligence

21 Ant Research Papers Shaping CVPR 2025: AI Image & Video Generation Breakthroughs

The Interactive Intelligence Lab of Ant Technology Research Institute presented 21 accepted CVPR 2025 papers covering visual generation, editing, 3D vision, digital humans and multimodal AI, highlighting tools such as MagicQuill, Lumos, Aurora, FLARE, LeviTor, MangaNinja, AniDoc, Mimir, AvatarArtist, DiffListener, MotionStone, TensorialGaussianAvatars, DualTalk, CompreCap and Uni-AD.

CVPR2025 · Multimodal Models · computer vision
20 min read

Kuaishou Large Model
Jun 11, 2025 · Artificial Intelligence

12 Kuaishou Breakthrough Papers at CVPR 2025: Video Generation, Diffusion & Multimodal AI

CVPR 2025 in Nashville will feature 12 Kuaishou papers spanning large‑scale video datasets, quality assessment, 3D/4D reconstruction, controllable generation, diffusion scaling laws, multimodal simulation, and novel benchmarks, highlighting the company's cutting‑edge contributions to video AI research.

computer vision · diffusion models · large-scale datasets
21 min read

Kuaishou Audio & Video Technology
Jun 11, 2025 · Artificial Intelligence

Kuaishou Showcases 12 Cutting-Edge CVPR 2025 Papers on Video Generation and AI

Kuaishou presented twelve peer‑reviewed papers at CVPR 2025 covering video quality assessment, large‑scale video datasets, dynamic 3D avatar reconstruction, 4D scene simulation, controllable video generation, scaling laws for diffusion transformers, multimodal foundations, and more, highlighting the company's leading research in computer vision and AI.

AI research · CVPR2025 · computer vision
21 min read

Kuaishou Tech
Jun 10, 2025 · Artificial Intelligence

Top 12 Cutting-Edge Video Generation Papers from Kuaishou at CVPR 2025

The article highlights CVPR 2025’s acceptance statistics and showcases twelve cutting‑edge video‑generation papers from Kuaishou, spanning datasets, quality assessment, style control, scaling laws, 4D simulation, interleaved image‑text data, vision‑language acceleration, high‑fidelity avatars, patch‑wise super‑resolution, narrative‑driven benchmarks, sketch‑based editing, and spatio‑temporal diffusion, each with links and abstracts.

CVPR2025 · Kuaishou · computer vision
20 min read

DataFunTalk
Jun 8, 2025 · Artificial Intelligence

Why Autoregressive Video Models Like MAGI-1 May Outperform Diffusion Approaches

The article examines the current dominance of diffusion models in commercial video generation, contrasts them with autoregressive methods, and details how the open‑source MAGI‑1 model combines both paradigms to achieve longer, more controllable video synthesis while addressing scalability and quality challenges.

AI research · MAGI-1 · autoregressive models
70 min read

DataFunTalk
Mar 3, 2025 · Artificial Intelligence

FlightVGM: FPGA-Accelerated Inference for Video Generation Models Wins Best Paper at FPGA 2025

The FlightVGM paper, awarded Best Paper at FPGA 2025, details an FPGA-based inference accelerator for video generation models that exploits spatio‑temporal activation sparsity, mixed‑precision DSP58 extensions, and adaptive scheduling to achieve up to 1.30× performance and 4.49× energy‑efficiency gains over an NVIDIA 3090 GPU while preserving model accuracy.

AI · FPGA · Mixed Precision
11 min read

DataFunTalk
Feb 26, 2025 · Artificial Intelligence

Alibaba Cloud's Wanxiang 2.1: Open‑Source Dual‑Version Visual Generation Model with Full‑Scale Capabilities

Wanxiang 2.1, an open‑source visual generation model released by Alibaba Cloud, ships in a 14‑billion‑parameter professional version and a 1.3‑billion‑parameter consumer‑grade version, delivering SOTA performance across multiple benchmarks, supporting diverse video generation tasks, and employing a DiT‑based architecture, a 3D VAE, and efficient distributed training strategies.

AI model · Open-source · deep learning
11 min read

DaTaobao Tech
Feb 24, 2025 · Artificial Intelligence

AIGC Video Generation Techniques for E‑commerce: Lip‑Sync, Head/Body Driving, and Business Applications

The article surveys recent AIGC video generation advances for Taobao e‑commerce, detailing lip‑sync models like Wav2Lip and MuseTalk, head‑driven systems such as Hallo and EchoMimic, body‑driven pipelines including AnimateAnyone and Tango, and a four‑stage production workflow that boosts click‑through rates and enables virtual try‑on.

AIGC · deep learning · e-commerce
21 min read

Python Programming Learning Circle
Jan 24, 2025 · Fundamentals

Creating a Cherry Blossom Timelapse with Python: Image Processing and Video Generation

This article demonstrates how to use Python, OpenCV, and Pillow to programmatically generate frames that depict the gradual opening of cherry blossoms, assemble them into a video, and share the result as a timelapse celebrating Wuhan University's spring scenery.

Image Processing · Tutorial · cherry-blossom
5 min read

ZhongAn Tech Team
Jan 19, 2025 · Artificial Intelligence

Weekly AI Digest Issue 11: Recommendation Algorithms, Video Generation Advances, and AGI Research

This issue of the weekly AI digest explores Xiaohongshu’s NoteLLM recommendation system, compares Chinese text rendering in video generation models across major platforms, highlights Alibaba’s Tongyi Wanxiang breakthroughs, discusses Keras founder François Chollet’s new AGI‑focused lab, and reviews Google’s Veo 2 and Imagen 3 advancements.

AGI · AI · Recommendation systems
11 min read

php中文网 Courses
Dec 13, 2024 · Artificial Intelligence

OpenAI Releases Sora Video Generation Model: Three Key Implications and Core Features

OpenAI's new Sora model introduces AI-powered video generation, empowering creators, expanding interaction beyond text, and marking a pivotal step toward AGI by enabling machines to understand and produce visual content, with a suite of tools such as Explore, StoryBoard, Remix, Loop, and Blend.

OpenAI · Sora · artificial intelligence
4 min read

ZhongAn Tech Team
Nov 16, 2024 · Artificial Intelligence

Weekly AI Digest Issue 2: Video Generation, Large Models, AGI, and LoRA Fine‑Tuning

This weekly AI roundup discusses emerging video generation tools such as PixelDance and Vidu 1.5, debates over the scaling limits of large models, geopolitical considerations around AGI, and an MIT study comparing LoRA with full fine‑tuning for domain adaptation.

AGI · AI · Fine-tuning
8 min read

DataFunSummit
Oct 10, 2024 · Artificial Intelligence

AIGC‑Assisted Marketing Material Generation at Shujia Technology

This article describes Shujia Technology's use of artificial intelligence to generate marketing images and videos, outlining the background, challenges of high-volume content production, detailed solutions for image and video assets—including layout models, diffusion models, and digital human synthesis—and future research directions.

AIGC · Digital Human · Large Models
12 min read

360 Tech Engineering
Aug 29, 2024 · Artificial Intelligence

FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance

FancyVideo is an open‑source UNet‑based video generation model that supports arbitrary resolutions, aspect ratios, styles, and motion dynamics by introducing a Cross‑frame Textual Guidance Module (CTGM) with temporal injectors, refiners, and boosters, achieving state‑of‑the‑art results on multiple benchmarks and enabling versatile applications such as video extension, backtracking, and frame interpolation.

AI research · UNet · cross-frame guidance
6 min read

Tencent Advertising Technology
Jul 31, 2024 · Artificial Intelligence

MimicMotion: A Controllable Video Generation Framework for High-Quality Human Motion Synthesis

MimicMotion is a controllable video generation framework that produces smooth, high-quality human motion videos by leveraging skeletal action guidance, addressing challenges in video generation such as limited length, weak controllability, and lack of dynamic detail.

AI · Advertising Technology · MimicMotion
13 min read

Qunar Tech Salon
Jul 25, 2024 · Artificial Intelligence

AI-Generated Video Practices for International Hotels

At the WOT2024 conference, Qunar Travel’s CTO Zheng Jimin presented a comprehensive overview of AI-generated video production for international hotels, detailing challenges, AI-driven workflow automation, practical implementation steps, multilingual translation enhancements, and performance results, offering valuable insights for scaling high‑quality hotel video content.

AI · AIGC · Automation
11 min read

Baidu Geek Talk
Jul 24, 2024 · Artificial Intelligence

AI-Driven Fusion of Peking Opera Characters with Ink-Wash Painting Style Using PaddleGAN

Li Yilin’s AI project blends Peking Opera characters with traditional ink‑wash painting by using PaddleHub for style transfer and PaddleGAN’s First‑Order Motion model for facial motion, then adds music and Wav2Lip lip‑sync, producing videos that modernize Chinese heritage and gauge public cultural awareness.

AI · PaddleGAN · Peking Opera
9 min read

Kuaishou Large Model
Jun 27, 2024 · Artificial Intelligence

How I2V-Adapter Turns Images into Videos with Minimal Training

The article introduces I2V‑Adapter, a lightweight plug‑in for Stable Diffusion‑based video diffusion models that converts a single static image into a coherent video without altering the original T2V architecture, and details its design, frame‑similarity prior, experimental results, and real‑world applications.

AI · I2V-Adapter · Stable Diffusion
9 min read

Kuaishou Tech
Jun 26, 2024 · Artificial Intelligence

I2V-Adapter: A Lightweight Image‑to‑Video Adapter for Stable Diffusion Video Diffusion Models

The I2V-Adapter paper introduces a plug‑and‑play lightweight module that enables static images to be converted into dynamic videos using Stable Diffusion‑based text‑to‑video diffusion models without altering the original architecture or pretrained parameters, achieving competitive quality with far less training cost.

AI · I2V-Adapter · Stable Diffusion
8 min read

DataFunTalk
May 3, 2024 · Artificial Intelligence

Advances, Challenges, and Industrial Practices in Text‑to‑Video Generation – From Diffusion Models to Sora

This article reviews the rapid progress of text‑to‑video generation, explains diffusion‑based video synthesis, outlines key technical challenges such as motion modeling, semantic alignment and quality, and presents Tencent’s solutions and real‑world applications, while also discussing future directions and the impact of OpenAI’s Sora model.

AI · Sora · Tencent
23 min read