How OpenAI’s Sora Turns Text into Realistic 60‑Second Videos

OpenAI’s newly unveiled Sora system can generate high‑quality videos of up to 60 seconds from plain text prompts. Nvidia researcher Jim Fan describes it as a data‑driven physics engine that was likely trained in part on synthetic data from Unreal Engine 5. The research team includes Tim Brooks and Bill Peebles, and the release marks a major step forward in AI video generation.

21CTO

Feb 18, 2024

Sora Debuts

OpenAI has announced a new AI product called “Sora” (Japanese for “sky”), which can generate a video of up to 60 seconds from a simple text prompt while preserving visual quality.

Sora has a deep understanding of language and follows prompts accurately. It can create convincing characters, complex scenes with multiple characters, specific actions, precise subjects, and detailed backgrounds. It can also produce multiple shots within a single video while keeping characters and visual style consistent.

OpenAI shared sample videos to demonstrate the feasibility of text‑to‑video generation, prompting some observers to predict that Hollywood could be disrupted within three years.

According to Nvidia senior research scientist Jim Fan, Sora is best understood as a data‑driven physics engine. He speculates that it was trained on large amounts of synthetic data, possibly rendered with Unreal Engine 5, and that it learns through denoising and gradient‑based optimization.
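To make "denoising and gradient‑based optimization" concrete, here is a minimal toy sketch in Python. It is not Sora's method or architecture, which OpenAI has not published in this form. It assumes a single scalar "clean" value, a fixed noise level alpha_bar, and a linear noise predictor; the function names add_noise, train_noise_predictor, and recover_clean are hypothetical and chosen only for illustration.

```python
import math
import random


def add_noise(x0, alpha_bar, eps):
    """Forward step: blend a clean value with Gaussian noise eps."""
    return math.sqrt(alpha_bar) * x0 + math.sqrt(1.0 - alpha_bar) * eps


def train_noise_predictor(data, alpha_bar, steps=5000, lr=0.02, seed=0):
    """Fit eps_hat = w * x_t + b with plain stochastic gradient descent."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        x0 = rng.choice(data)
        eps = rng.gauss(0.0, 1.0)
        x_t = add_noise(x0, alpha_bar, eps)
        err = (w * x_t + b) - eps      # prediction error on the noise
        w -= lr * 2.0 * err * x_t      # gradient of err**2 with respect to w
        b -= lr * 2.0 * err            # gradient of err**2 with respect to b
    return w, b


def recover_clean(x_t, alpha_bar, eps_hat):
    """Invert the forward step using the predicted noise."""
    return (x_t - math.sqrt(1.0 - alpha_bar) * eps_hat) / math.sqrt(alpha_bar)


if __name__ == "__main__":
    # Toy assumption: every clean sample is the constant 2.0.
    w, b = train_noise_predictor([2.0], alpha_bar=0.5)
    x_t = add_noise(2.0, 0.5, eps=0.3)
    print(recover_clean(x_t, 0.5, w * x_t + b))  # should be close to 2.0
```

The model never sees the clean value directly during prediction. It only learns, by gradient descent, to guess the noise that was mixed in, and subtracting that guess recovers the signal. Real diffusion models apply the same idea with neural networks over images or video rather than a two‑parameter line over one number.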

Key Researchers

Tim Brooks – Co‑author of DALL‑E 3 and creator of InstructPix2Pix, who previously worked on video generation at Nvidia. He holds a B.S. in Logic & Computation from Carnegie Mellon, has worked at Facebook, on Google's Pixel AI, and at the Berkeley AI lab, and joined OpenAI after completing his Ph.D. in 2023.

Bill Peebles – MIT graduate who has studied GANs and text‑to‑video generation and interned with Nvidia’s deep‑learning and autonomous‑driving teams. He co‑authored DiT (Diffusion Transformer), the work that underlies Sora, and one of his papers was a candidate for the CVPR 2022 Best Paper award.

The team also includes several undergraduate researchers, forming a young and dynamic R&D group.

Industry Reaction

Tim Brooks’s résumé highlights his diverse talents, from National Geographic photography to Broadway performances and beatboxing awards, challenging the stereotype of the narrowly focused researcher.

Social media users have posted mixed reactions, sharing screenshots of comments and memes about the technology.

Editor: Da Xiong Compiled by: Quantum Bit
