How Baidu’s AI Publisher Transforms Holiday Images with Offline and Online Style Transfer
This article describes the AI Publisher in the Baidu APP: the research behind its offline and online stylization modes, the complete generation pipelines, the core AI technologies (template creation, face fusion, large‑model style transfer, and custom model training), and the resulting festive visual effects.
1. AI Imaging Research – Offline & Online Stylization
Generative AI video and image content has become popular on social platforms, especially during festivals and trending events. Baidu’s UGC product introduced an AI Publisher to democratize AI creation, launching five AI-driven themes for the Chinese New Year, such as "AI迎财神" and "AI龙潮儿".
2. Offline Stylization Workflow
Offline stylization follows three stages: template generation, face fusion, and AI video synthesis. Templates are designed per theme and per demographic (adult male, adult female, boy, girl). Once a static template has been generated, the user’s face is merged into it using Baidu’s proprietary VIS face‑fusion technology, which preserves facial similarity and skin tone and handles occlusions such as bangs or glasses.
Finally, the fused static image is combined with video effects; the Publisher offers more than 15 video styles covering various AI portrait scenarios.
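The three offline stages can be sketched as a small Python pipeline. This is only an illustration under stated assumptions: the theme name `demo_theme`, the template paths, the style identifier, and the functions `select_template`, `fuse_face`, and `synthesize_video` are all hypothetical. The article does not describe the interface of the VIS face‑fusion technology, so the fusion and video steps are placeholders that return descriptive strings.

```python
from dataclasses import dataclass
from enum import Enum


class Demographic(Enum):
    # The four template demographics named in the article.
    ADULT_MALE = "adult_male"
    ADULT_FEMALE = "adult_female"
    BOY = "boy"
    GIRL = "girl"


@dataclass(frozen=True)
class Template:
    theme: str
    demographic: Demographic
    image_path: str  # hypothetical location of the pre-generated static template


# Hypothetical catalogue: one static template per (theme, demographic) pair.
TEMPLATES = {
    (theme, demo): Template(theme, demo, f"templates/{theme}/{demo.value}.png")
    for theme in ("demo_theme",)
    for demo in Demographic
}


def select_template(theme: str, demographic: Demographic) -> Template:
    """Stage 1: pick the template designed for this theme and demographic."""
    return TEMPLATES[(theme, demographic)]


def fuse_face(template: Template, user_photo: str) -> str:
    """Stage 2: placeholder for face fusion (the real step is proprietary)."""
    return f"fused({template.image_path}, {user_photo})"


def synthesize_video(fused_image: str, video_style: str) -> str:
    """Stage 3: placeholder for combining the fused image with a video effect."""
    return f"video[{video_style}]({fused_image})"


def run_offline_pipeline(theme: str, demographic: Demographic,
                         user_photo: str, video_style: str) -> str:
    template = select_template(theme, demographic)
    fused = fuse_face(template, user_photo)
    return synthesize_video(fused, video_style)


result = run_offline_pipeline("demo_theme", Demographic.GIRL, "user.jpg", "style_01")
print(result)
```

Because templates are generated ahead of time, the only per‑user work in this sketch is the fusion and video step, which is the property that makes the mode "offline".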
3. Online Stylization Workflow
Online stylization provides a more customized experience. The pipeline consists of: user photo upload → face detection → facial region generation → model‑based image generation (style model + facial control model) → secondary generation → final output.
The style model determines visual style (realistic, anime, illustration, etc.), while the facial control model extracts user features to maintain identity. Different themes use either open‑source large models (e.g., for AI烟花) or custom‑trained models (e.g., AI召唤神龙).
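As a rough sketch, the online pipeline can be modeled as an ordered list of stages, with each theme selecting its own style model and facial control model. Everything named here is an assumption made for illustration: the model identifiers (`open_source_base`, `custom_ink_dragon`, `face_ctrl_basic`, `face_ctrl_loose`), the `Job` record, and the stage functions are hypothetical, and each stage only records that it ran instead of processing an image.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ThemeConfig:
    style_model: str         # decides the visual style (realistic, anime, ...)
    face_control_model: str  # extracts user features to preserve identity
    custom_trained: bool     # False means an open-source large model is used


# Hypothetical model identifiers; only the open-source vs custom split
# for these two themes comes from the article.
THEMES = {
    "AI烟花": ThemeConfig("open_source_base", "face_ctrl_basic", custom_trained=False),
    "AI召唤神龙": ThemeConfig("custom_ink_dragon", "face_ctrl_loose", custom_trained=True),
}


@dataclass
class Job:
    photo: str
    theme: str
    trace: list = field(default_factory=list)  # records the stages that ran


def detect_face(job: Job) -> Job:
    job.trace.append("face_detection")
    return job


def generate_face_region(job: Job) -> Job:
    job.trace.append("face_region")
    return job


def generate_image(job: Job) -> Job:
    cfg = THEMES[job.theme]
    job.trace.append(f"generation[{cfg.style_model}+{cfg.face_control_model}]")
    return job


def refine_image(job: Job) -> Job:
    job.trace.append("secondary_generation")
    return job


# Stage order follows the pipeline described in the article.
PIPELINE = [detect_face, generate_face_region, generate_image, refine_image]


def run_online_pipeline(photo: str, theme: str) -> Job:
    if theme not in THEMES:
        raise ValueError(f"unknown theme: {theme}")
    job = Job(photo=photo, theme=theme)
    for stage in PIPELINE:
        job = stage(job)
    return job


job = run_online_pipeline("user.jpg", "AI烟花")
print(job.trace)
```

Keeping the theme configuration separate from the stage list means a new theme only needs a new `ThemeConfig` entry, whether it is backed by an open‑source model or a custom‑trained one.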
4. Custom Model Training
For themes with no suitable existing base model, Baidu trained custom models. The AI汉服 model was trained on more than 5,000 clothing samples and covers 12 outfits for men, women, and children, each labeled with tags that are used during generation. The composition model for AI龙潮儿 controls where dragons are placed relative to the user; it supports more than ten layout variations and keeps the dragons in a Chinese aesthetic style.
5. Online Face‑Fusion Research
Online stylization also relies on advanced face fusion. Five facial control models provide ten control strategies, chosen to match each theme’s style requirements. For example, the realistic AI烟花 theme uses basic facial control, the ink‑wash AI召唤神龙 theme uses reduced facial precision, and AI寻找龙潮儿 and AI汉服 use balanced control.
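One way to picture the per‑theme choice of control strategy is a simple lookup table. The sketch below is an assumption, not Baidu's implementation: the `FaceControlStrategy` type, the `identity_weight` knob, and its numeric values are invented placeholders, and only the theme‑to‑strategy pairing is taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceControlStrategy:
    name: str
    identity_weight: float  # hypothetical 0..1 knob; values are placeholders


BASIC = FaceControlStrategy("basic", 0.8)
REDUCED = FaceControlStrategy("reduced", 0.4)
BALANCED = FaceControlStrategy("balanced", 0.6)

# Theme-to-strategy pairing as described in the article.
THEME_STRATEGY = {
    "AI烟花": BASIC,         # realistic style
    "AI召唤神龙": REDUCED,    # ink-wash style tolerates less facial precision
    "AI寻找龙潮儿": BALANCED,
    "AI汉服": BALANCED,
}


def strategy_for(theme: str,
                 default: FaceControlStrategy = BALANCED) -> FaceControlStrategy:
    """Return the strategy for a theme, falling back to an assumed default."""
    return THEME_STRATEGY.get(theme, default)


print(strategy_for("AI召唤神龙").name)
```

The fallback to `BALANCED` for unlisted themes is a design choice of this sketch; the article does not say how new themes are assigned a strategy.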
6. Online Effect Showcase
The AI Publisher has been refined over the course of a quarter and now supports more than ten festive activities, including New Year, Women’s Day, Spring Flowers, and Children’s Day. Future plans are to deepen the AI research and launch additional custom activities, broadening what AI‑generated imagery can offer.
Baidu MEUX
MEUX is the Baidu Mobile Ecosystem UX Design Center, responsible for end-to-end experience design for user and commercial products in Baidu's mobile ecosystem.
