Tagged articles

VAE

21 articles

Mar 1, 2026 · Artificial Intelligence

Demystifying VAE: From Probabilistic Encoding to Latent Space Regularization

This article walks through the fundamentals of variational autoencoders: why they are needed, their three core components, the loss formulation, a PyTorch implementation, the training loop, and multiple inference modes such as anomaly detection, data generation, conditional generation, latent space manipulation, and data imputation.

Anomaly Detection · Conditional VAE · PyTorch

0 likes · 15 min read

xkx's Tech General Store

Jan 19, 2026 · Artificial Intelligence

Beginner’s Guide to VAE: Theory, Training, and Full Implementation

This article walks readers through the fundamentals of Variational Autoencoders, compares five major generative model paradigms, explains VAE architecture, training and inference steps, provides PyTorch code, and analyzes experimental results on MNIST and Flowers datasets.

MNIST · PyTorch · VAE

0 likes · 16 min read

Tencent Cloud Developer

Dec 17, 2025 · Artificial Intelligence

How Tencent’s TNC Neural Codec Won 2025 Image & Video Compression Challenges

At the end of 2025, TNC, the neural codec from Tencent’s Shannon Lab, took top rankings in both the VCIP low‑complexity end‑to‑end image compression contest and the PCS high‑compression intelligent image compression challenge, demonstrating superior PSNR gains, low decoding complexity, and an innovative VAE‑INR architecture across image and video tracks.

AI · Compression Challenge · INR

0 likes · 17 min read

HyperAI Super Neural

Oct 30, 2025 · Artificial Intelligence

OmniCast Achieves 20× Speed Boost and Eliminates Autoregressive Error Accumulation in S2S Weather Forecasting

OmniCast, a novel latent diffusion model from UCLA and Argonne Lab, combines VAE and Transformer to generate high‑precision probabilistic sub‑seasonal to seasonal forecasts, dramatically reducing error accumulation of autoregressive methods and delivering 10‑20× faster inference while surpassing state‑of‑the‑art baselines across accuracy, physical consistency, and probabilistic metrics.

OmniCast · Transformer · VAE

0 likes · 15 min read

Amap Tech

Jul 11, 2025 · Artificial Intelligence

Unified Self‑Supervised Pretraining Accelerates Image Generation and Improves Understanding

The USP framework introduces masked latent modeling within a VAE space to pre‑train ViT encoders, enabling seamless weight transfer to image classification, segmentation, and diffusion‑based generation tasks, dramatically speeding up DiT and SiT models while preserving strong visual representations.

Diffusion Models · VAE · ViT

0 likes · 13 min read

AI Frontier Lectures

May 11, 2025 · Artificial Intelligence

How VA‑VAE Boosts Diffusion Model Generation: SOTA Results & LightningDiT Insights

This article analyzes the VA‑VAE approach that aligns visual tokenizers with vision foundation models to resolve the reconstruction‑generation trade‑off in latent diffusion models, detailing the VF loss design, adaptive weighting, LightningDiT enhancements, experimental setup, and state‑of‑the‑art ImageNet performance.

LightningDiT · VAE · Vision Foundation Model

0 likes · 16 min read

Alibaba Cloud Big Data AI Platform

Jun 4, 2024 · Artificial Intelligence

EasyAnimate: High‑Resolution Video Generation via Diffusion Transformers

EasyAnimate, an open‑source DiT‑based video generation framework from Alibaba Cloud AI Platform PAI, offers a complete pipeline—including data preprocessing, VAE and DiT training, LoRA fine‑tuning, motion‑module integration, and scalable inference up to 768×768 resolution and 144 frames—leveraging Diffusion Transformers to produce longer, higher‑quality videos.

AI video · LoRA · VAE

0 likes · 14 min read

DataFunSummit

May 6, 2024 · Artificial Intelligence

Advances, Model Types, and Open Challenges of AI‑Generated Content (AIGC) with XiaoBu’s Image Generation Progress

This article reviews the definition, key metrics, and major model families of AI‑generated content, details XiaoBu’s recent breakthroughs in image generation, and discusses open research problems such as evaluation gaps, transformer limitations, and the need for richer multimodal intelligence representations.

AIGC · GAN · Prompt engineering

0 likes · 14 min read

Tencent Cloud Developer

Feb 21, 2024 · Artificial Intelligence

OpenAI Sora: Technical Principles and Industry Impact Analysis

OpenAI’s Sora, a text‑to‑video model released during Chinese New Year, combines a VAE encoder, latent diffusion with a DiT transformer, and a VAE decoder to generate videos from prompts. It supports flexible durations and resolutions as well as language understanding, with uses in creation, editing, and entertainment, though it struggles with physical consistency and long‑term coherence. Its debut is reshaping the short‑form video, digital‑human, gaming, and graphics industries.

AI video generation · OpenAI · Sora

0 likes · 14 min read

Architecture and Beyond

Feb 8, 2024 · Artificial Intelligence

Mastering AIGC: 15 Essential AI Terms and Key Technologies Explained

This article provides a comprehensive overview of core AI concepts, from basic definitions of AI, AGI, and AIGC to detailed explanations of GPUs, major generative models, leading AI products, and influential companies, helping readers quickly grasp the landscape of AI-generated content.

AI · AIGC · CLIP

0 likes · 24 min read

DaTaobao Tech

Dec 4, 2023 · Artificial Intelligence

AIGC Poster Generation Project: Methods and Optimizations

The AIGC Poster Generation Project employs Stable Diffusion enhanced with VAE, ControlNet, LoRA and other extensions to create product posters in four visual styles, exploring outpainting, inpainting, reference‑based diffusion and DreamBooth prototypes, and optimizes detail preservation, super‑resolution text, and masking to achieve over 90% detail fidelity, 95% success rate, and 3–5 second inference per image.

AIGC · ControlNet · Diffusion Models

0 likes · 7 min read

DaTaobao Tech

Oct 13, 2023 · Artificial Intelligence

Understanding Stable Diffusion: Core Principles and Technical Architecture

The article demystifies Stable Diffusion by explaining its low‑cost latent‑space design and conditioning mechanisms, comparing it to autoregressive, VAE, flow‑based and GAN models, detailing the iterative noise‑to‑image process, token‑based text‑to‑image control, version differences, common generation issues, and providing implementation code examples.

AI image generation · Cross-Attention · Stable Diffusion

0 likes · 15 min read

php Courses

Aug 26, 2023 · Artificial Intelligence

Understanding Generative AI: Concepts, Common Models, and Development Guide

Generative AI is a branch of artificial intelligence that creates novel content such as text, images, and music by learning patterns from training data. Common models include GANs, VAEs, autoregressive models, and Transformer-based architectures. Development involves task definition, data preparation, model design, training, evaluation, and ethical considerations.

GAN · Generative AI · Model Development

0 likes · 8 min read

DaTaobao Tech

Aug 11, 2023 · Artificial Intelligence

Practical Guide to Stable Diffusion WebUI: Prompt Engineering, LoRA, VAE, and ControlNet

This practical guide walks users through installing Stable Diffusion WebUI, explains the differences between base, LoRA, VAE, and ControlNet models, shows how to derive prompts with CLIP or DeepBooru, and provides detailed text‑to‑image and image‑to‑image examples for effective prompt engineering.

AI image generation · ControlNet · LoRA

0 likes · 12 min read

Code DAO

Dec 10, 2021 · Artificial Intelligence

Understanding Variational Autoencoders: From Dimensionality Reduction to Generative Modeling

This article explains the principles of variational autoencoders, starting with dimensionality reduction techniques such as PCA and standard autoencoders, highlighting their limitations for data generation, and then detailing VAE's regularized latent space, variational inference, re‑parameterization, and loss formulation.

KL divergence · VAE · Variational Autoencoder

0 likes · 18 min read

DataFunTalk

Oct 13, 2021 · Artificial Intelligence

Intelligent Recruitment: Deep Semantic Matching, Interview Assistance, and Text Representation

This article explores how AI techniques such as deep semantic matching, attention mechanisms, variational autoencoders, and neural topic models can transform traditional recruitment by improving person‑job matching, interview assistance, and text representation, supported by experiments on real‑world hiring data.

AI recruitment · VAE · interview assistance

0 likes · 18 min read

DataFunSummit

Oct 13, 2021 · Artificial Intelligence

Intelligent Recruitment: Deep Semantic Matching, Interview Assistance, and Text Representation with VAE and Neural Topic Models

This article presents a comprehensive overview of applying AI techniques—semantic matching models, attention mechanisms, VAE‑based text representation, and neural topic models—to improve talent acquisition, candidate‑job matching, interview assistance, and recruitment text analysis, supported by experiments on real‑world hiring data.

AI in HR · Intelligent Recruitment · Neural Topic Model

0 likes · 19 min read

DataFunTalk

Jul 24, 2021 · Artificial Intelligence

Instant Interest Reinforcement and Extension for Taobao Detail Page Distribution

This article presents the mechanisms behind Taobao’s detail‑page full‑network distribution. It covers the background and scenario, then a series of algorithmic explorations, including the CIDM, DTIN, and Tri‑tower models, that use the main product (the trigger) to reinforce users’ instant interests, improve recall, coarse‑ranking, and fine‑ranking performance, and achieve notable online metric gains.

CTR · Taobao · VAE

0 likes · 17 min read

DataFunTalk

Jan 16, 2020 · Artificial Intelligence

Voice Conversion: Fundamentals, Methods, and iQIYI Applications

This article provides a comprehensive overview of voice conversion technology, covering its definition, parallel and non‑parallel data approaches, classic and deep‑learning methods such as DTW, GMM, seq2seq, PPG, VAE, Flow, GAN, and practical applications and challenges in iQIYI’s products.

ASR · GAN · Speech synthesis

0 likes · 8 min read

iQIYI Technical Product Team

Jan 9, 2020 · Artificial Intelligence

Voice Conversion (VC): Fundamentals, Progress, and Applications

Voice conversion (VC) changes a speaker’s timbre and style while keeping the spoken text unchanged. It supports one‑to‑one, many‑to‑one, and many‑to‑many scenarios for medical assistance and entertainment, and works with parallel or non‑parallel data through methods such as DTW‑aligned frame mapping, attention‑based neural networks, PPG‑LSTM pipelines, VAEs, normalizing‑flow models, and GANs. iQIYI focuses on non‑parallel data, prosody preservation, and noise‑robust augmentation.

Audio Processing · GAN · VAE

0 likes · 12 min read

Hulu Beijing

Mar 19, 2019 · Artificial Intelligence

Understanding Variational Autoencoders: Core Concepts and Training Explained

This article introduces Variational Autoencoders (VAEs), compares them with GANs, explains the underlying variational inference principle, and details how VAEs are trained using the evidence lower bound, complemented by visual diagrams and key equations.

VAE · Variational Autoencoder · deep learning

0 likes · 4 min read
