Tagged articles
15 articles
Weekly Large Model Application
Mar 30, 2026 · Artificial Intelligence

Inside Kimi-Audio: A Unified Large Audio Model Covering ASR, AQA, TTS and More

Kimi-Audio, a general‑purpose audio foundation model from Moonshot AI, integrates ASR, audio QA, automatic audio captioning, emotion classification and end‑to‑end speech dialogue within a single framework. This article details its mixed‑audio input, MiMo‑Transformer core, efficient synthesis pipeline, architectural strengths, limitations, and suitable application scenarios.

ASR · Audio LLM · BigVGAN
0 likes · 9 min read
Weekly Large Model Application
Mar 20, 2026 · Artificial Intelligence

Inside GLM-4-Voice: An End-to-End Chinese-English Speech Dialogue Model

GLM-4-Voice is an end-to-end Chinese-English speech dialogue model that aligns discrete speech tokens with GLM-4-9B, uses VQ-based tokenization at 12.5 tokens/s, supports emotion, tone, speed and dialect control, and offers low-latency streaming inference. The article details its architecture, advantages, limitations and suitable use cases.

GLM-4-Voice · Multimodal AI · flow matching
0 likes · 10 min read
SuanNi
Mar 3, 2026 · Artificial Intelligence

How OmniXtreme Breaks the High‑Dynamic Control Barrier for Humanoid Robots

The OmniXtreme architecture introduces a two‑stage flow‑matching and actuation‑aware post‑training framework that enables humanoid robots to reliably execute high‑dynamic, extreme motions in the real world by overcoming simulation scalability limits and physical hardware constraints.

OmniXtreme · flow matching · high-dynamic control
0 likes · 16 min read
Data Party THU
Jan 26, 2026 · Artificial Intelligence

How PropMolFlow Boosts Property‑Guided Molecule Generation by Tenfold

PropMolFlow, a new flow‑matching model introduced by researchers from the University of Florida and NYU, generates property‑guided molecules up to ten times faster than prior SOTA methods while preserving chemical validity and achieving superior performance on benchmarks such as QM9.

AI drug discovery · PropMolFlow · computational chemistry
0 likes · 7 min read
Kuaishou Tech
Nov 25, 2025 · Artificial Intelligence

How Flow‑GRPO Boosts Image Generation Accuracy to 95% with Online Reinforcement Learning

Flow‑GRPO introduces online reinforcement learning into flow‑matching models by converting deterministic ODE sampling to stochastic SDE sampling and reducing denoising steps. It raises SD‑3.5‑Medium's GenEval accuracy from 63% to 95%, surpassing GPT‑4o, and demonstrates strong gains in complex composition, text rendering, and human‑preference alignment across multiple generative tasks.

AI research · Deep Learning · Image Generation
0 likes · 8 min read
Kuaishou Tech
Nov 14, 2025 · Artificial Intelligence

How GRPO‑Guard Stops Over‑Optimization in Flow‑Based Visual Generators

This article explains the over‑optimization problem in GRPO‑based flow models, analyzes why importance‑ratio clipping fails, and introduces GRPO‑Guard with RatioNorm and cross‑step gradient balancing, showing through extensive experiments that it stabilizes training and improves image quality across multiple diffusion backbones and tasks.

GRPO-Guard · Image Generation · Reinforcement Learning
0 likes · 9 min read
AI Algorithm Path
Oct 20, 2025 · Artificial Intelligence

Building a Flow Matching Model from Scratch: Complete Code Walkthrough

This article walks through the full implementation of a flow‑matching generative model in PyTorch, covering dataset creation, a small MLP that learns a time‑dependent velocity field, the flow‑matching loss, training loop, ODE‑based sampling, visualisation of the learned vector field, and a discussion of the method's limitations and possible extensions.
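
The pieces named in this summary (velocity field, flow‑matching loss, ODE‑based sampling) can be illustrated with a minimal sketch. It is not the article's code: it assumes the common straight‑line path between a noise sample and a data sample, uses plain Python instead of PyTorch, and replaces the MLP with a closed‑form field for a toy dataset that sits at a single point. All function names and `DATA_POINT` are hypothetical.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Velocity of the straight path; it does not depend on t."""
    return [b - a for a, b in zip(x0, x1)]

def flow_matching_loss(velocity_fn, pairs, rng, t_max=0.99):
    """Monte Carlo estimate of E || v(x_t, t) - (x1 - x0) ||^2.

    t is drawn from [0, t_max) to stay away from the t=1 endpoint.
    """
    total = 0.0
    for x0, x1 in pairs:
        t = t_max * rng.random()
        x_t = interpolate(x0, x1, t)
        pred = velocity_fn(x_t, t)
        target = target_velocity(x0, x1)
        total += sum((p - u) ** 2 for p, u in zip(pred, target))
    return total / len(pairs)

def euler_sample(velocity_fn, x0, n_steps=50):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

# Toy assumption: every data sample equals DATA_POINT, so the exact
# velocity field is (c - x) / (1 - t). A trained network would stand
# in for this function in a real model.
DATA_POINT = [2.0, -1.0]

def exact_velocity(x, t):
    return [(c - xi) / (1.0 - t) for c, xi in zip(DATA_POINT, x)]

rng = random.Random(0)
noise = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(256)]
pairs = [(x0, DATA_POINT) for x0 in noise]

loss = flow_matching_loss(exact_velocity, pairs, rng)   # near zero for the exact field
sample = euler_sample(exact_velocity, noise[0])         # ends close to DATA_POINT
```

In a trained model, `exact_velocity` would be the network's forward pass and `flow_matching_loss` would be minimized by gradient descent; the sampling loop stays the same.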

Generative Models · MLP · PyTorch
0 likes · 13 min read
AI Algorithm Path
Oct 15, 2025 · Artificial Intelligence

Building a Flow Matching Model from Scratch: Theory Explained

This article walks through the theory behind flow‑matching generative models, contrasting them with diffusion models, detailing the velocity‑field formulation, training objective, and sampling procedure, and includes visual illustrations of the core concepts.

Diffusion Models · Generative Models · ODE
0 likes · 8 min read
AI Algorithm Path
Oct 12, 2025 · Artificial Intelligence

Flow Matching vs Diffusion Models: Key Differences and Connections

This technical article provides a comprehensive comparison of diffusion models and flow matching, covering their intuitive explanations, underlying mathematics, training objectives, sampling efficiency, theoretical guarantees, practical examples, and code implementations to illustrate how each generative approach works.

Diffusion Models · flow matching · generative AI
0 likes · 12 min read
HyperAI Super Neural
Oct 11, 2025 · Artificial Intelligence

Apple’s Flow‑Matching SimpleFold Slashes Compute Cost While Matching AlphaFold2 Accuracy

Apple’s newly released SimpleFold model leverages flow‑matching and a pure Transformer architecture to eliminate costly MSA and triangular updates, achieving performance comparable to AlphaFold2 and RoseTTAFold2 on CAMEO22 and CASP14 benchmarks while dramatically reducing computational requirements. A step‑by‑step tutorial shows how to run it on HyperAI’s platform.

AI model · HyperAI · SimpleFold
0 likes · 4 min read
AI Frontier Lectures
May 27, 2025 · Artificial Intelligence

Can One-Step Generative Modeling Beat Multi-Step Diffusion? Inside MeanFlow

The article presents MeanFlow, a novel one‑step generative modeling framework that replaces instantaneous velocity with an average‑velocity field, achieving a record‑low FID of 3.43 on ImageNet 256×256 with a single function evaluation and outperforming both prior single‑step and multi‑step diffusion models.
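
As a hedged illustration of what "average velocity" means here, the relation can be written as follows. The symbols are not in the summary above; this assumes the usual convention in which $t = 1$ is noise and $t = 0$ is data, with $v$ the instantaneous velocity along the trajectory $z_\tau$ and $u$ the average velocity over the interval $[r, t]$.

```latex
% Average velocity over [r, t] along a trajectory z_\tau
u(z_t, r, t) \;=\; \frac{1}{t - r} \int_{r}^{t} v(z_\tau, \tau)\, d\tau

% Moving from time t back to time r with the average velocity
z_r \;=\; z_t - (t - r)\, u(z_t, r, t)

% One-step generation: a single evaluation of u from noise z_1 to data z_0
z_0 \;=\; z_1 - u(z_1, 0, 1)
```

Because the displacement over the whole interval equals the interval length times the average velocity, a network that predicts $u$ directly needs only one function evaluation, whereas integrating $v$ takes many small steps.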

AI research · FID · ImageNet
0 likes · 7 min read
DaTaobao Tech
Apr 7, 2025 · Artificial Intelligence

Flow Matching for Generative Modeling

Flow Matching reformulates generative modeling by learning a time‑dependent vector field that deterministically transports Gaussian noise to data, using a neural network trained with an analytically derived L2 loss, yielding simpler training, faster convergence, and deterministic sampling that matches or exceeds diffusion model quality.

AI · Diffusion Models · Generative Modeling
0 likes · 13 min read
AI Frontier Lectures
Mar 11, 2025 · Artificial Intelligence

How Stochastic Differential Equations Power Modern Generative AI Models

This article explains how recent MIT research uses stochastic differential equations to model diffusion and flow processes, defines training objectives, explores conditional guidance, compares U‑Net and diffusion transformers, addresses memory challenges with latent diffusion, and surveys applications ranging from robotics to protein design.

Diffusion Models · Latent Diffusion · Robotics
0 likes · 26 min read
AIWalker
Feb 13, 2025 · Artificial Intelligence

How FlashVideo Turns Low‑Res Clips into 4K Video with Minimal Compute

FlashVideo introduces a two‑stage framework that first generates low‑resolution videos with strong prompt fidelity and then uses flow‑matching ODE trajectories to upscale to 4K quality in just four function evaluations, achieving top VBench‑Long scores while cutting generation time by up to five‑fold.

AI · FlashVideo · Video Generation
0 likes · 26 min read