Tagged articles
28 articles
Machine Heart
Apr 8, 2026 · Artificial Intelligence

HappyHorse 1.0 Tops AI Video Ranking, Leaving Seedance 2.0 74 Points Behind

A mysterious model dubbed HappyHorse‑1.0 surged to the top of the Artificial Analysis video‑AI leaderboard with an Elo score of 1347, outpacing the previously dominant Seedance 2.0 by 74 points, sparking intense community debate over its origin and scoring methodology.

AI-leaderboard, HappyHorse, Seedance
0 likes · 7 min read
HyperAI Super Neural
Nov 25, 2025 · Artificial Intelligence

LongCat‑Video: Meituan’s Model for Text‑to‑Video, Image‑to‑Video & Continuation

LongCat‑Video, an open‑source video generation model from Meituan, adopts a unified multi‑task architecture to handle text‑to‑video, image‑to‑video and video‑continuation, delivers minute‑long high‑quality clips with coarse‑to‑fine inference, achieves benchmark scores comparable to leading models like Wan2.2, and provides a one‑click deployment tutorial on HyperAI.

Benchmark, LongCat-Video, Meituan
0 likes · 6 min read
Alibaba Cloud Developer
Apr 18, 2025 · Artificial Intelligence

How the New 14B End‑to‑End Video Model Generates Custom 720p Clips from Two Images

The open‑sourced 14‑billion‑parameter Tongyi Wanxiang video model can create high‑quality 720p videos that seamlessly connect user‑provided start and end images, offering controllable, personalized video generation with prompt‑driven camera motions and easy access via its website, GitHub, Hugging Face, and ModelScope.

AI model, Computer Vision, Deep Learning
0 likes · 5 min read
AIWalker
Apr 6, 2025 · Artificial Intelligence

NOVA: Redefining Autoregressive Visual Modeling Without Vector Quantization

NOVA introduces a highly efficient autoregressive video generation framework that eliminates vector quantization, combines frame‑by‑frame causal prediction with set‑by‑set spatial attention, and achieves state‑of‑the‑art quality on VBench and GenEval while offering strong zero‑shot generalization across text‑to‑image and text‑to‑video tasks.

Benchmark results, NOVA, autoregressive video generation
0 likes · 14 min read
Bilibili Tech
Mar 4, 2025 · Artificial Intelligence

Engineering Practices and Optimizations for Text‑to‑Video Generation Models (OpenSora, CogVideoX) by the Bilibili TTV Team

The Bilibili TTV team optimized training of the OpenSora and CogVideoX text‑to‑video models by redesigning data storage with Alluxio, parallelizing VAE encoding, applying dynamic sequence parallelism and DeepSpeed‑Ulysses attention, and adapting GPU code for NPU execution. Profiling‑driven kernel fusion, FlashAttention, and expandable memory further increased training efficiency and frame throughput, and the article outlines future plans for pipeline parallelism and ZeRO‑3 scaling.

Diffusion Transformer, FlashAttention, Model Parallelism
0 likes · 26 min read
Alibaba Cloud Big Data AI Platform
Feb 18, 2025 · Artificial Intelligence

One-Click Deployment of Cutting-Edge Text-to-Video and Voice Interaction Models

This article introduces the state‑of‑the‑art Step‑Video‑T2V text‑to‑video model and the Step‑Audio‑Chat voice interaction model, outlines their technical specifications and benchmark results, and provides a detailed step‑by‑step guide for deploying both models with a single click using Alibaba Cloud's PAI Model Gallery.

AI Model Deployment, PAI Model Gallery, state-of-the-art
0 likes · 9 min read
Baobao Algorithm Notes
Oct 17, 2024 · Artificial Intelligence

How Meta’s Movie Gen Pushes Text‑to‑Video Generation to New Heights

Meta’s newly released 92‑page Movie Gen paper introduces a multimodal LLM that unifies text‑to‑image, text‑to‑video, personalized video, precise video editing, and audio generation, detailing its dual‑model architecture, training pipeline, temporal auto‑encoder design, scaling strategies, evaluation benchmark, and ablation studies.

Deep Learning, Model Scaling, Video Generation
0 likes · 34 min read
DataFunTalk
May 3, 2024 · Artificial Intelligence

Advances, Challenges, and Industrial Practices in Text‑to‑Video Generation – From Diffusion Models to Sora

This article reviews the rapid progress of text‑to‑video generation, explains diffusion‑based video synthesis, outlines key technical challenges such as motion modeling, semantic alignment and quality, and presents Tencent’s solutions and real‑world applications, while also discussing future directions and the impact of OpenAI’s Sora model.

AI, Sora, Tencent
0 likes · 23 min read
21CTO
Apr 17, 2024 · Artificial Intelligence

How Sora Generates High‑Quality Text‑to‑Video: A Deep Dive into Its Architecture

This article breaks down OpenAI's Sora text‑to‑video model, exploring its overall structure, visual encoder‑decoder, Spacetime Latent Patch, transformer‑based diffusion, long‑time consistency strategies, training techniques, and the technical choices that enable variable resolution, aspect ratios, and up to 60‑second video generation.

AI video generation, Latent Diffusion, Sora
0 likes · 50 min read
Open Source Linux
Apr 16, 2024 · Artificial Intelligence

How Sora’s Text-to-Video Model Is Redefining AI‑Generated Video

Sora, a new text‑to‑video AI model, can create one‑minute videos from textual prompts or static images, delivering industry‑leading fidelity, resolution, and coherent motion by using spatial‑temporal patches inspired by ViViT, and shows emergent capabilities that hint at universal physical simulation.

Multimodal AI, Sora model, ViViT
0 likes · 4 min read
Architects' Tech Alliance
Apr 7, 2024 · Artificial Intelligence

How Sora Is Redefining Text‑to‑Video Generation: Inside the New AI Model

Sora, the newly announced text‑to‑video large model, can generate one‑minute high‑fidelity videos from textual prompts or static images, handling complex scenes, expressive characters, and sophisticated camera motions while also supporting video extension and frame‑filling, positioning it at the forefront of multimodal AI research.

AI model, Sora, Video Generation
0 likes · 6 min read
Alipay Experience Technology
Mar 28, 2024 · Artificial Intelligence

How OpenAI’s Sora Revolutionizes Text‑to‑Video Generation: Capabilities & Comparisons

This article introduces OpenAI’s Sora video‑generation model, compares it with other leading solutions, explains its underlying diffusion‑based architecture, showcases sample outputs, outlines its diverse generation abilities, and discusses current limitations and future implications for AI‑driven video creation.

AI video generation, OpenAI, Sora
0 likes · 13 min read
DevOps
Mar 26, 2024 · Artificial Intelligence

OpenAI’s Sora: A One‑Minute Text‑to‑Video Diffusion Transformer Model

OpenAI’s newly released Sora model demonstrates one‑minute text‑to‑video generation using a diffusion‑based transformer architecture that operates on spatiotemporal patches, compresses visual data into latent codes, and builds on a wide range of prior video generation research, while the article also advertises a DevOps certification program.

AI, OpenAI, Sora
0 likes · 8 min read
DaTaobao Tech
Mar 25, 2024 · Artificial Intelligence

Survey of AIGC Video Generation Algorithms

Since 2023, AI‑generated video research has expanded across six algorithmic categories—text‑to‑video, image‑to‑video, editing, style transfer, human motion, and long‑video generation—highlighting works such as CogVideo, Imagen Video, MagicVideo, ControlVideo, DCTNet, NUWA‑XL and OpenAI’s Sora, while analysis shows short‑clip diffusion models excel, editing remains costly, style transfer is efficient, and truly long, temporally consistent videos remain an open challenge.

AI, AIGC, Video Editing
0 likes · 13 min read
NewBeeNLP
Mar 22, 2024 · Artificial Intelligence

Unraveling Sora: How OpenAI Might Build Its Text‑to‑Video Engine

This article provides a step‑by‑step technical analysis of OpenAI’s Sora model, examining its possible overall architecture, video encoder‑decoder design, Spacetime Latent Patch mechanism, transformer‑based diffusion process, training strategies, and long‑term consistency techniques, while grounding each speculation in publicly available reports and related research.

AI analysis, Sora, Transformer
0 likes · 50 min read
Baobao Algorithm Notes
Mar 22, 2024 · Artificial Intelligence

Unveiling Sora: How OpenAI Might Build Its Groundbreaking Text‑to‑Video Model

This article provides a detailed, step‑by‑step technical analysis of OpenAI's Sora text‑to‑video system, exploring its overall architecture, visual encoder‑decoder choices, Spacetime Latent Patch design, transformer‑based diffusion model, training strategies, and long‑time consistency mechanisms while referencing relevant research papers and open‑source techniques.

AI, Sora, diffusion
0 likes · 50 min read
58UXD
Feb 27, 2024 · Artificial Intelligence

How OpenAI’s Sora Is Redefining AI‑Generated Video Creation

OpenAI’s newly released Sora model, built on the DALL‑E 3 foundation, can generate up to 60‑second high‑quality videos from text prompts, offering features such as multi‑character scenes, seamless video synthesis, image‑to‑video animation, physical world simulation, and prompting guidance for designers, while raising ethical and creative challenges.

AI video generation, Design, OpenAI
0 likes · 9 min read
Architect
Feb 22, 2024 · Artificial Intelligence

Sora: OpenAI’s Text‑to‑Video Model – Principles, Impact, and Outlook

The article provides a comprehensive technical overview of OpenAI’s Sora text‑to‑video model, explaining its background, underlying diffusion‑Transformer architecture, key breakthroughs, potential industry impacts, success factors, limitations, and future prospects for AI‑generated video content.

AI, OpenAI, Sora
0 likes · 15 min read
High Availability Architecture
Feb 22, 2024 · Artificial Intelligence

Understanding OpenAI’s Sora: A Breakthrough Text-to-Video Model

OpenAI’s newly released Sora text‑to‑video model demonstrates unprecedented high‑resolution, long‑duration video generation by encoding videos into latent space, applying diffusion with a transformer conditioned on text, and decoding back to pixels, marking a major leap in AI video synthesis and its potential applications.

AI video generation, Latent Diffusion, Sora
0 likes · 14 min read
Architects' Tech Alliance
Feb 22, 2024 · Artificial Intelligence

OpenAI’s Sora: A Breakthrough Text‑to‑Video Generation Model – Capabilities, Architecture, and Research Insights

OpenAI’s Sora model demonstrates unprecedented text‑to‑video generation with up to 60‑second high‑fidelity clips, consistent multi‑character scenes, multi‑camera motion, and world‑simulation abilities, backed by a diffusion‑transformer trained on compressed latent video patches and detailed technical analysis from its accompanying research paper.

AI video generation, OpenAI, Sora
0 likes · 11 min read
Tencent Cloud Developer
Feb 21, 2024 · Artificial Intelligence

OpenAI Sora: Technical Principles and Industry Impact Analysis

OpenAI’s Sora, a text‑to‑video model released during Chinese New Year, combines a VAE encoder, latent diffusion with a DiT transformer, and a VAE decoder to generate videos from prompts, supporting flexible durations and resolutions, language understanding, and uses in creation, editing, and entertainment, though it struggles with physical consistency and long‑term coherence, and its debut is reshaping short‑form video, digital‑human, gaming, and graphics industries.

AI video generation, Latent Diffusion, OpenAI
0 likes · 14 min read
DevOps
Feb 18, 2024 · Artificial Intelligence

OpenAI's Sora: In‑Depth Analysis of the First Text‑to‑Video Model and Its Technical Foundations

OpenAI's Sora, the first text‑to‑video model, demonstrates unprecedented video quality and length by leveraging massive high‑quality training data, novel video‑patch representations, diffusion‑based transformer architecture, and precise subtitle generation, reshaping both AI research and media production.

OpenAI, Sora, diffusion model
0 likes · 9 min read
21CTO
Feb 18, 2024 · Artificial Intelligence

How OpenAI’s Sora Turns Text into Realistic 60‑Second Videos

OpenAI’s newly unveiled Sora system can generate 60‑second, high‑quality videos from plain text prompts, leveraging a data‑driven physical engine trained on synthetic data from Unreal Engine 5, with contributions from researchers like Tim Brooks and Bill Peebles, marking a major AI video‑generation breakthrough.

Deep Learning, OpenAI, generative AI
0 likes · 6 min read
Architect
Feb 16, 2024 · Artificial Intelligence

Can OpenAI’s Sora Redefine Text‑to‑Video Generation? An In‑Depth Technical Review

OpenAI’s newly unveiled Sora model transforms short text prompts into up‑to‑one‑minute high‑definition videos, showcasing advanced diffusion‑Transformer architecture, improved occlusion handling, and detailed visual fidelity, while the article examines its technical breakthroughs, compares it to earlier models, and discusses emerging safety and misuse concerns.

AI Safety, OpenAI, Sora
0 likes · 12 min read
Tencent Cloud Developer
Nov 1, 2022 · Artificial Intelligence

The Rise of AI-Generated Content: Technologies, Applications, and Risks

The article surveys the evolution of AI‑generated content from early art programs to modern diffusion‑based text‑to‑image and text‑to‑video models, outlines key milestones such as Stable Diffusion and DALL‑E 2, explores gaming applications, and highlights limitations, ethical concerns, and copyright risks of open‑source generative AI.

AI Generation, creative AI, text-to-image
0 likes · 22 min read