Tag

text-to-video

Kuaishou Tech
May 26, 2025 · Artificial Intelligence

CineMaster: A 3D‑Aware and Controllable Framework for Cinematic Text‑to‑Video Generation

CineMaster, presented in a SIGGRAPH 2025 paper, is a 3D‑aware, controllable text‑to‑video generation framework. Users specify target objects and camera motion through an interactive workflow to produce cinematic, user‑directed videos.

3D-aware · AI video · CineMaster
0 likes · 6 min read
Bilibili Tech
Mar 4, 2025 · Artificial Intelligence

Engineering Practices and Optimizations for Text‑to‑Video Generation Models (OpenSora, CogVideoX) by the Bilibili TTV Team

The Bilibili TTV team optimized the OpenSora and CogVideoX text‑to‑video models by redesigning data storage with Alluxio, parallelizing VAE encoding, applying dynamic sequence parallelism and DeepSpeed‑Ulysses attention, and adapting GPU code for NPU execution. Profiling‑driven kernel fusion, FlashAttention, and expandable memory further raised training efficiency and frame throughput. The article closes with plans for pipeline‑parallel and ZeRO‑3 scaling.

Diffusion Transformer · FlashAttention · NPU
0 likes · 26 min read
Refining Core Development Skills
Aug 8, 2024 · Artificial Intelligence

Getting Started with CodeVideoX API for Text‑to‑Video Generation Using Diffusion Transformers

This guide introduces CodeVideoX, a diffusion‑transformer based video generation model, explains its training and inference pipelines, and provides step‑by‑step instructions with API endpoints, required parameters, and example cURL commands for creating short AI‑generated videos.

AIGC · API · CodeVideoX
0 likes · 8 min read
DataFunTalk
May 3, 2024 · Artificial Intelligence

Advances, Challenges, and Industrial Practices in Text‑to‑Video Generation – From Diffusion Models to Sora

This article reviews the rapid progress of text‑to‑video generation, explains diffusion‑based video synthesis, outlines key technical challenges such as motion modeling, semantic alignment and quality, and presents Tencent’s solutions and real‑world applications, while also discussing future directions and the impact of OpenAI’s Sora model.

AI · Sora · Tencent
0 likes · 23 min read
DevOps
Mar 26, 2024 · Artificial Intelligence

OpenAI’s Sora: A One‑Minute Text‑to‑Video Diffusion Transformer Model

OpenAI’s newly released Sora model demonstrates one‑minute text‑to‑video generation using a diffusion‑based transformer architecture that operates on spatiotemporal patches, compresses visual data into latent codes, and builds on a wide range of prior video generation research, while the article also advertises a DevOps certification program.

AI · OpenAI · Sora
0 likes · 8 min read
DaTaobao Tech
Mar 25, 2024 · Artificial Intelligence

Survey of AIGC Video Generation Algorithms

Since 2023, AI‑generated video research has expanded across six algorithmic categories—text‑to‑video, image‑to‑video, editing, style transfer, human motion, and long‑video generation—highlighting works such as CogVideo, Imagen Video, MagicVideo, ControlVideo, DCTNet, NUWA‑XL and OpenAI’s Sora, while analysis shows short‑clip diffusion models excel, editing remains costly, style transfer is efficient, and truly long, temporally consistent videos remain an open challenge.

AI · AIGC · Video Editing
0 likes · 13 min read
Architect
Feb 22, 2024 · Artificial Intelligence

Sora: OpenAI’s Text‑to‑Video Model – Principles, Impact, and Outlook

The article provides a comprehensive technical overview of OpenAI’s Sora text‑to‑video model, explaining its background, underlying diffusion‑Transformer architecture, key breakthroughs, potential industry impacts, success factors, limitations, and future prospects for AI‑generated video content.

AI · OpenAI · Sora
0 likes · 15 min read
High Availability Architecture
Feb 22, 2024 · Artificial Intelligence

Understanding OpenAI’s Sora: A Breakthrough Text-to-Video Model

OpenAI’s newly released Sora text‑to‑video model demonstrates unprecedented high‑resolution, long‑duration video generation by encoding videos into latent space, applying diffusion with a transformer conditioned on text, and decoding back to pixels, marking a major leap in AI video synthesis and its potential applications.

AI video generation · Sora · Transformer
0 likes · 14 min read
Architects' Tech Alliance
Feb 22, 2024 · Artificial Intelligence

OpenAI’s Sora: A Breakthrough Text‑to‑Video Generation Model – Capabilities, Architecture, and Research Insights

OpenAI’s Sora model demonstrates unprecedented text‑to‑video generation with up to 60‑second high‑fidelity clips, consistent multi‑character scenes, multi‑camera motion, and world‑simulation abilities, backed by a diffusion‑transformer trained on compressed latent video patches and detailed technical analysis from its accompanying research paper.

AI video generation · Artificial Intelligence · OpenAI
0 likes · 11 min read
Tencent Cloud Developer
Feb 21, 2024 · Artificial Intelligence

OpenAI Sora: Technical Principles and Industry Impact Analysis

OpenAI’s Sora, a text‑to‑video model released during Chinese New Year, generates videos from prompts by combining a VAE encoder, latent diffusion with a DiT transformer, and a VAE decoder. It supports flexible durations and resolutions, understands natural‑language prompts, and has uses in creation, editing, and entertainment, though it struggles with physical consistency and long‑term coherence. Its debut is reshaping the short‑form video, digital‑human, gaming, and graphics industries.

AI video generation · Diffusion Transformer · OpenAI
0 likes · 14 min read
DevOps
Feb 18, 2024 · Artificial Intelligence

OpenAI's Sora: In‑Depth Analysis of the First Text‑to‑Video Model and Its Technical Foundations

OpenAI's Sora, the company's first text‑to‑video model, demonstrates unprecedented video quality and length by leveraging massive high‑quality training data, novel video‑patch representations, a diffusion‑based transformer architecture, and precise caption generation, reshaping both AI research and media production.

OpenAI · Sora · diffusion model
0 likes · 9 min read
Tencent Cloud Developer
Nov 1, 2022 · Artificial Intelligence

The Rise of AI-Generated Content: Technologies, Applications, and Risks

The article surveys the evolution of AI‑generated content from early art programs to modern diffusion‑based text‑to‑image and text‑to‑video models, outlines key milestones such as Stable Diffusion and DALL‑E 2, explores gaming applications, and highlights limitations, ethical concerns, and copyright risks of open‑source generative AI.

AI generation · Creative AI · diffusion models
0 likes · 22 min read