Tagged articles
20 articles
DeepHub IMBA
Mar 1, 2026 · Artificial Intelligence

Demystifying VAE: From Probabilistic Encoding to Latent Space Regularization

This article walks through the fundamentals of variational autoencoders, explaining why they are needed and detailing their three core components, loss formulation, PyTorch implementation, training loop, and multiple inference modes: anomaly detection, data generation, conditional generation, latent space manipulation, and data imputation.

Conditional VAE · Generative Models · Latent Space
0 likes · 15 min read
Tencent Cloud Developer
Dec 17, 2025 · Artificial Intelligence

How Tencent’s TNC Neural Codec Won 2025 Image & Video Compression Challenges

At the end of 2025, TNC, the neural codec from Tencent’s Shannon Lab, achieved top rankings in both the VCIP low‑complexity end‑to‑end image compression contest and the PCS high‑compression intelligent image compression challenge, demonstrating superior PSNR gains, low decoding complexity, and an innovative VAE‑INR architecture across image and video tracks.

AI · Compression Challenge · INR
0 likes · 17 min read
HyperAI Super Neural
Oct 30, 2025 · Artificial Intelligence

OmniCast Achieves 20× Speed Boost and Eliminates Autoregressive Error Accumulation in S2S Weather Forecasting

OmniCast, a novel latent diffusion model from UCLA and Argonne Lab, combines VAE and Transformer to generate high‑precision probabilistic sub‑seasonal to seasonal forecasts, dramatically reducing error accumulation of autoregressive methods and delivering 10‑20× faster inference while surpassing state‑of‑the‑art baselines across accuracy, physical consistency, and probabilistic metrics.

Deep Learning · Latent Diffusion · OmniCast
0 likes · 15 min read
AI Frontier Lectures
May 11, 2025 · Artificial Intelligence

How VA‑VAE Boosts Diffusion Model Generation: SOTA Results & LightningDiT Insights

This article analyzes the VA‑VAE approach that aligns visual tokenizers with vision foundation models to resolve the reconstruction‑generation trade‑off in latent diffusion models, detailing the VF loss design, adaptive weighting, LightningDiT enhancements, experimental setup, and state‑of‑the‑art ImageNet performance.

LightningDiT · VAE · loss function
0 likes · 16 min read
Alibaba Cloud Big Data AI Platform
Jun 4, 2024 · Artificial Intelligence

EasyAnimate: High‑Resolution Video Generation via Diffusion Transformers

EasyAnimate, an open‑source DiT‑based video generation framework from Alibaba Cloud AI Platform PAI, offers a complete pipeline—including data preprocessing, VAE and DiT training, LoRA fine‑tuning, motion‑module integration, and scalable inference up to 768×768 resolution and 144 frames—leveraging Diffusion Transformers to produce longer, higher‑quality videos.

AI video · Diffusion Transformer · LoRA
0 likes · 14 min read
DataFunSummit
May 6, 2024 · Artificial Intelligence

Advances, Model Types, and Open Challenges of AI‑Generated Content (AIGC) with XiaoBu’s Image Generation Progress

This article reviews the definition, key metrics, and major model families of AI‑generated content, details XiaoBu’s recent breakthroughs in image generation, and discusses open research problems such as evaluation gaps, transformer limitations, and the need for richer multimodal intelligence representations.

AIGC · GAN · Generative Models
0 likes · 14 min read
Tencent Cloud Developer
Feb 21, 2024 · Artificial Intelligence

OpenAI Sora: Technical Principles and Industry Impact Analysis

OpenAI’s Sora, a text‑to‑video model released during Chinese New Year, combines a VAE encoder, latent diffusion with a DiT transformer, and a VAE decoder to generate videos from prompts. It supports flexible durations and resolutions, language understanding, and uses in creation, editing, and entertainment, but struggles with physical consistency and long‑term coherence. Its debut is reshaping the short‑form video, digital‑human, gaming, and graphics industries.

AI video generation · Latent Diffusion · OpenAI
0 likes · 14 min read
DaTaobao Tech
Dec 4, 2023 · Artificial Intelligence

AIGC Poster Generation Project: Methods and Optimizations

The AIGC Poster Generation Project employs Stable Diffusion enhanced with VAE, ControlNet, LoRA, and other extensions to create product posters in four visual styles. It explores outpainting, inpainting, reference‑based diffusion, and DreamBooth prototypes, and optimizes detail preservation, super‑resolution text, and masking to achieve over 90% detail fidelity, a 95% success rate, and 3–5 second inference per image.

AIGC · ControlNet · Stable Diffusion
0 likes · 7 min read
DaTaobao Tech
Oct 13, 2023 · Artificial Intelligence

Understanding Stable Diffusion: Core Principles and Technical Architecture

The article demystifies Stable Diffusion by explaining its low‑cost latent‑space design and conditioning mechanisms, comparing it to autoregressive, VAE, flow‑based and GAN models, detailing the iterative noise‑to‑image process, token‑based text‑to‑image control, version differences, common generation issues, and providing implementation code examples.

AI image generation · Computer Vision · Cross-Attention
0 likes · 15 min read
php Courses
Aug 26, 2023 · Artificial Intelligence

Understanding Generative AI: Concepts, Common Models, and Development Guide

Generative AI, a branch of artificial intelligence that creates novel content such as text, images, and music, works by learning patterns from training data. Common models include GANs, VAEs, and autoregressive and Transformer-based architectures; development involves task definition, data preparation, model design, training, evaluation, and ethical considerations.

GAN · Model Development · Transformer
0 likes · 8 min read
DaTaobao Tech
Aug 11, 2023 · Artificial Intelligence

Practical Guide to Stable Diffusion WebUI: Prompt Engineering, LoRA, VAE, and ControlNet

This practical guide walks users through installing Stable Diffusion WebUI, explains the differences between base, LoRA, VAE, and ControlNet models, shows how to derive prompts with CLIP or DeepBooru, and provides detailed text‑to‑image and image‑to‑image examples for effective prompt engineering.

AI image generation · ControlNet · LoRA
0 likes · 12 min read
Code DAO
Dec 10, 2021 · Artificial Intelligence

Understanding Variational Autoencoders: From Dimensionality Reduction to Generative Modeling

This article explains the principles of variational autoencoders, starting with dimensionality reduction techniques such as PCA and standard autoencoders, highlighting their limitations for data generation, and then detailing VAE's regularized latent space, variational inference, re‑parameterization, and loss formulation.

Deep Learning · Generative Models · KL divergence
0 likes · 18 min read
DataFunTalk
Oct 13, 2021 · Artificial Intelligence

Intelligent Recruitment: Deep Semantic Matching, Interview Assistance, and Text Representation

This article explores how AI techniques such as deep semantic matching, attention mechanisms, variational autoencoders, and neural topic models can transform traditional recruitment by improving person‑job matching, interview assistance, and text representation, supported by experiments on real‑world hiring data.

AI Recruitment · VAE · interview assistance
0 likes · 18 min read
DataFunSummit
Oct 13, 2021 · Artificial Intelligence

Intelligent Recruitment: Deep Semantic Matching, Interview Assistance, and Text Representation with VAE and Neural Topic Models

This article presents a comprehensive overview of applying AI techniques—semantic matching models, attention mechanisms, VAE‑based text representation, and neural topic models—to improve talent acquisition, candidate‑job matching, interview assistance, and recruitment text analysis, supported by experiments on real‑world hiring data.

AI in HR · Intelligent Recruitment · Neural Topic Model
0 likes · 19 min read
DataFunTalk
Jul 24, 2021 · Artificial Intelligence

Instant Interest Reinforcement and Extension for Taobao Detail Page Distribution

This article presents the mechanisms of Taobao’s detail‑page full‑network distribution. It covers the background and scenario, then a series of algorithmic explorations, including the CIDM, DTIN, and Tri‑tower models, that use the main product (the trigger) to reinforce users’ instant interests, improve recall, coarse‑ranking, and fine‑ranking performance, and achieve notable online metric gains.

CTR · Deep Learning · Modeling
0 likes · 17 min read
DataFunTalk
Jan 16, 2020 · Artificial Intelligence

Voice Conversion: Fundamentals, Methods, and iQIYI Applications

This article provides a comprehensive overview of voice conversion technology, covering its definition, parallel and non‑parallel data approaches, classic and deep‑learning methods such as DTW, GMM, seq2seq, PPG, VAE, Flow, GAN, and practical applications and challenges in iQIYI’s products.

ASR · Deep Learning · GAN
0 likes · 8 min read
iQIYI Technical Product Team
Jan 9, 2020 · Artificial Intelligence

Voice Conversion (VC): Fundamentals, Progress, and Applications

Voice conversion (VC) technology changes a speaker’s timbre and style while keeping the spoken text unchanged, supporting one‑to‑one, many‑to‑one, and many‑to‑many scenarios for medical assistance and entertainment. It works with parallel or non‑parallel data through methods such as DTW‑aligned frame mapping, attention‑based neural networks, PPG‑LSTM pipelines, VAEs, normalizing‑flow models, and GANs. iQIYI focuses on non‑parallel data, prosody preservation, and noise‑robust augmentation.

Audio Processing · Deep Learning · GAN
0 likes · 12 min read