What’s New in Generative AI? VASA‑1, Llama‑3, Stable Diffusion 3 & More

This article reviews recent generative-AI releases: Microsoft's VASA‑1 video-synthesis model, Meta's open‑source Llama‑3 large language model, Stability AI's Stable Diffusion 3 API, Adobe's plan to integrate third‑party AI video tools into Premiere Pro, and Freepik's free image‑style‑recreation service. For each, it highlights the technical details and potential applications.

Xiaohe Frontend Team

Highlights

Microsoft Research Asia released VASA‑1, a model that generates lifelike talking-head videos from a single portrait image and an audio clip. The demo produces 512×512 video at 45 fps in offline batch mode, and up to 40 fps in online streaming mode with roughly 170 ms latency on a single NVIDIA RTX 4090 GPU. The paper is available at https://arxiv.org/abs/2404.10667.

Earlier this year Alibaba's EMO model demonstrated similar capabilities, and subsequent projects such as EMAGE, AniPortrait, and Google's VLOGGER have followed suit, indicating strong interest in, and a broad application space for, synthetic human video.

Meta officially released Llama‑3 on April 19 in two parameter sizes (8 B and 70 B). The tokenizer uses a 128 K‑token vocabulary, and the models were trained on over 15 T tokens, roughly seven times the data used for Llama‑2. Llama‑3 improves reasoning, mathematics, code generation, and instruction following, and adopts grouped‑query attention (GQA) to reduce inference cost, along with attention masking that keeps self‑attention from crossing document boundaries during training.
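The idea behind grouped-query attention is that several query heads share a single key/value head, shrinking the KV cache that dominates inference memory. A minimal NumPy sketch (illustrative only, not Meta's implementation; shapes and names are our own):

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """Toy grouped-query attention.

    q: (n_q_heads, T, d) query heads
    k, v: (n_kv_heads, T, d) shared key/value heads, where
          n_q_heads is a multiple of n_kv_heads.
    Each group of n_q_heads / n_kv_heads query heads reuses one
    K/V head, shrinking the KV cache by the same factor.
    """
    n_q_heads, T, d = q.shape
    n_kv_heads = k.shape[0]
    group = n_q_heads // n_kv_heads
    causal = np.tril(np.ones((T, T), dtype=bool))
    outs = []
    for h in range(n_q_heads):
        kv = h // group  # which shared K/V head this query head uses
        scores = q[h] @ k[kv].T / np.sqrt(d)
        scores = np.where(causal, scores, -np.inf)  # causal mask
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)           # softmax per row
        outs.append(w @ v[kv])
    return np.stack(outs)  # (n_q_heads, T, d)
```

With 8 query heads and 2 KV heads, the KV cache is a quarter of the multi-head-attention size while each query head still computes full attention.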

Stability AI made Stable Diffusion 3 (SD3) and SD3 Turbo available via API, featuring higher image quality, better text‑in‑image handling, and multimodal capabilities through MM‑DiT and Flow Matching. MM‑DiT gives text and image tokens separate transformer weight streams but lets them interact through a joint attention operation, while Flow Matching trains a rectified‑flow model that maps noise to images along near‑straight paths, enabling efficient sampling.
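The core of the rectified-flow formulation is simple: draw a point on the straight line between a noise sample and a data sample, and train the model to predict the constant velocity along that line. The sketch below is a simplified reading of the objective (function names are our own, and SD3 itself uses a more elaborate timestep-sampling scheme than the uniform one here):

```python
import numpy as np

def rectified_flow_sample(x0, x1, t):
    """Rectified-flow training pair.

    x0: Gaussian noise, x1: data, t in [0, 1].
    The intermediate point lies on the straight line between
    noise and data; the regression target is the constant
    velocity x1 - x0 along that line.
    """
    x_t = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return x_t, v_target

def flow_matching_loss(model_v, x0, x1, rng):
    """Mean-squared error between the model's predicted velocity
    and the straight-line velocity, at a uniformly sampled t."""
    t = rng.uniform(size=(x0.shape[0], 1))
    x_t, v_target = rectified_flow_sample(x0, x1, t)
    return float(np.mean((model_v(x_t, t) - v_target) ** 2))
```

Because the target paths are straight, a well-trained model can traverse noise-to-image in far fewer integration steps than a curved diffusion trajectory requires, which is what makes variants like SD3 Turbo fast.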

Adobe announced that Premiere Pro will integrate third‑party generative video models—including OpenAI’s Sora, Runway, and Pika—alongside its own Firefly suite, bringing AI‑driven video creation and audio editing to the platform.

Freepik launched a free “Reimagine” service that lets users upload an image and instantly apply styles such as 3D, cartoon, or cyber‑punk, supporting unlimited iterative refinements.

These developments illustrate rapid progress in AI‑generated media, expanding both creative possibilities and practical tools for developers and content creators.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

AI tools, large language models, Diffusion Models, Generative AI, video synthesis