Tag

VAE

1 view collected around this technical thread.

DataFunSummit
May 6, 2024 · Artificial Intelligence

Advances, Model Types, and Open Challenges of AI‑Generated Content (AIGC) with XiaoBu’s Image Generation Progress

This article reviews the definition, key metrics, and major model families of AI‑generated content, details XiaoBu’s recent breakthroughs in image generation, and discusses open research problems such as evaluation gaps, transformer limitations, and the need for richer multimodal intelligence representations.

AI research · AIGC · GAN
0 likes · 14 min read
Tencent Cloud Developer
Feb 21, 2024 · Artificial Intelligence

OpenAI Sora: Technical Principles and Industry Impact Analysis

OpenAI’s Sora, a text‑to‑video model released during Chinese New Year, generates videos from prompts by combining a VAE encoder, latent diffusion with a DiT transformer, and a VAE decoder. It supports flexible durations and resolutions, shows language understanding, and has uses in creation, editing, and entertainment, though it struggles with physical consistency and long‑term coherence. Its debut is reshaping the short‑form video, digital‑human, gaming, and graphics industries.

AI video generation · Diffusion Transformer · OpenAI
0 likes · 14 min read
DaTaobao Tech
Dec 4, 2023 · Artificial Intelligence

AIGC Poster Generation Project: Methods and Optimizations

The AIGC Poster Generation Project uses Stable Diffusion, extended with VAE, ControlNet, LoRA, and other add‑ons, to create product posters in four visual styles. It explores outpainting, inpainting, reference‑based diffusion, and DreamBooth prototypes, and optimizes detail preservation, super‑resolution text, and masking to reach over 90% detail fidelity, a 95% success rate, and 3–5 second inference per image.

AIGC · ControlNet · Poster Design
0 likes · 7 min read
DaTaobao Tech
Oct 13, 2023 · Artificial Intelligence

Understanding Stable Diffusion: Core Principles and Technical Architecture

The article demystifies Stable Diffusion by explaining its low‑cost latent‑space design and conditioning mechanisms, comparing it to autoregressive, VAE, flow‑based and GAN models, detailing the iterative noise‑to‑image process, token‑based text‑to‑image control, version differences, common generation issues, and providing implementation code examples.

AI image generation · Cross-Attention · Stable Diffusion
0 likes · 15 min read
php中文网 Courses
Aug 26, 2023 · Artificial Intelligence

Understanding Generative AI: Concepts, Common Models, and Development Guide

Generative AI, a branch of artificial intelligence that creates novel content such as text, images, and music, works by learning patterns from training data, with common models including GANs, VAEs, autoregressive and Transformer-based architectures, and its development involves task definition, data preparation, model design, training, evaluation, and ethical considerations.

GAN · Generative AI · Model Development
0 likes · 8 min read
DaTaobao Tech
Aug 11, 2023 · Artificial Intelligence

Practical Guide to Stable Diffusion WebUI: Prompt Engineering, LoRA, VAE, and ControlNet

This practical guide walks users through installing Stable Diffusion WebUI, explains the differences between base, LoRA, VAE, and ControlNet models, shows how to derive prompts with CLIP or DeepBooru, and provides detailed text‑to‑image and image‑to‑image examples for effective prompt engineering.

AI image generation · ControlNet · LoRA
0 likes · 12 min read
DataFunTalk
Oct 13, 2021 · Artificial Intelligence

Intelligent Recruitment: Deep Semantic Matching, Interview Assistance, and Text Representation

This article explores how AI techniques such as deep semantic matching, attention mechanisms, variational autoencoders, and neural topic models can transform traditional recruitment by improving person‑job matching, interview assistance, and text representation, supported by experiments on real‑world hiring data.

AI recruitment · Topic Modeling · VAE
0 likes · 18 min read
DataFunSummit
Oct 13, 2021 · Artificial Intelligence

Intelligent Recruitment: Deep Semantic Matching, Interview Assistance, and Text Representation with VAE and Neural Topic Models

This article gives an overview of applying AI techniques (semantic matching models, attention mechanisms, VAE‑based text representation, and neural topic models) to improve talent acquisition, candidate‑job matching, interview assistance, and recruitment text analysis, supported by experiments on real‑world hiring data.

AI in HR · Intelligent Recruitment · Neural Topic Model
0 likes · 19 min read
DataFunTalk
Jul 24, 2021 · Artificial Intelligence

Instant Interest Reinforcement and Extension for Taobao Detail Page Distribution

This article describes how Taobao’s detail‑page full‑network distribution works. It covers the background, the scenario, and a series of algorithmic explorations, including the CIDM, DTIN, and Tri‑tower models, that use the main product (the trigger) to reinforce users’ instant interests, improve recall, coarse‑ranking, and fine‑ranking performance, and achieve notable online metric gains.

Deep Learning · VAE · ctr
0 likes · 17 min read
DataFunTalk
Jan 16, 2020 · Artificial Intelligence

Voice Conversion: Fundamentals, Methods, and iQIYI Applications

This article gives an overview of voice conversion technology, covering its definition, parallel and non‑parallel data approaches, classic and deep‑learning methods such as DTW, GMM, seq2seq, PPG, VAE, Flow, and GAN, and practical applications and challenges in iQIYI’s products.

ASR · Deep Learning · GAN
0 likes · 8 min read
iQIYI Technical Product Team
Jan 9, 2020 · Artificial Intelligence

Voice Conversion (VC): Fundamentals, Progress, and Applications

Voice conversion (VC) changes a speaker’s timbre and style while keeping the spoken text unchanged. It supports one‑to‑one, many‑to‑one, and many‑to‑many scenarios, with applications in medical assistance and entertainment. Methods work on parallel or non‑parallel data and include DTW‑aligned frame mapping, attention‑based neural networks, PPG‑LSTM pipelines, VAEs, normalizing‑flow models, and GANs. iQIYI focuses on non‑parallel data, prosody preservation, and noise‑robust augmentation.

Audio Processing · Deep Learning · GAN
0 likes · 12 min read