How to Train a Multi‑Stage AI Model for a Brand Mascot – Tmall Case Study
This article explores the challenges of using AI image generators for brand IP, compares Midjourney and Stable Diffusion results, and presents a step‑by‑step multi‑layer model training workflow, covering dataset creation, training optimization, and practical tips, for achieving a more expressive and consistent Tmall mascot.
Background
AI image‑generation models such as Midjourney and Stable Diffusion are increasingly used for brand‑IP creation. The core technical problem is to preserve the exact visual identity of a brand (high fidelity) while allowing creative variation (high imagination).
Midjourney Experiment
Prompt used: "Make an IP character about black colour cat." Even after many iterations, the output was a generic black cat that retained none of the distinctive features of the Tmall mascot, which shows Midjourney’s limited controllability for precise brand customization.
Stable Diffusion Experiment
Stable Diffusion produced more consistent silhouettes but lacked imaginative poses and dynamic expression. The results were still far from a satisfactory Tmall mascot, indicating the need for further fine‑tuning and custom training.
Multi‑Layer Model Training Workflow
Define IP Model Objectives – Decide whether the goal is a distinctive digital avatar (fixed head, variable body) or richer scene expression. For the Tmall mascot the head must remain constant while the body can adopt varied poses.
Dataset Construction & Optimization – Collect a diverse set of assets covering multiple colors, angles, and styles. Annotate each image with detailed tags (e.g., national‑style text) to guide style learning.
Model Training & Optimization – Train iteratively and compare outputs between runs so that subtle changes are noticed early. Adjust hyper‑parameters such as learning rate (e.g., 1e‑4 → 5e‑5), batch size (e.g., 8 → 16), and number of epochs. Use ControlNet (e.g., with Canny edge detection) to generate additional edge‑based training material.
Stage‑wise Evaluation – Review outputs after each training phase:
Stage 1: Basic shape, many errors.
Stage 2: Direction improves but still lacks surprise.
Stage 3: Higher pose variety, but the rendering takes on an "oily", over‑smoothed look.
Stage 4: Creative composition improves, yet background remains empty.
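The hyper-parameter adjustments described in the training step can be made explicit by recording one configuration per stage. The following is a minimal sketch in plain Python, not the team's actual tooling: the `StageConfig` class, the `next_stage` helper, and the epoch count are hypothetical, and only the learning-rate (1e-4 → 5e-5) and batch-size (8 → 16) values come from the examples above. Which stage each value belongs to is also an assumption.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StageConfig:
    """Hypothetical record of the settings used for one training stage."""
    name: str
    learning_rate: float
    batch_size: int
    epochs: int  # assumed value; the article gives no epoch counts


def next_stage(prev: StageConfig, name: str,
               lr_scale: float = 1.0, batch_scale: int = 1) -> StageConfig:
    """Derive the next stage's settings from the previous stage."""
    return replace(
        prev,
        name=name,
        learning_rate=prev.learning_rate * lr_scale,
        batch_size=prev.batch_size * batch_scale,
    )


# Start with the larger learning rate and smaller batch size.
stage1 = StageConfig(name="stage-1", learning_rate=1e-4, batch_size=8, epochs=10)

# Halve the learning rate and double the batch size for the next stage.
stage2 = next_stage(stage1, "stage-2", lr_scale=0.5, batch_scale=2)

for stage in (stage1, stage2):
    print(f"{stage.name}: lr={stage.learning_rate:g}, "
          f"batch={stage.batch_size}, epochs={stage.epochs}")
```

Keeping the configurations immutable and deriving each one from its predecessor makes it easier to tie a regression seen in a stage review back to the exact setting that changed.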
Consumer‑Grade AI Creation Tools
Comparative tests show that prompt engineering (e.g., adding style descriptors, specifying pose) and parameter tuning (sampling steps, CFG scale) dramatically affect output quality. A baseline Stable Diffusion run produces bland results, while a tuned configuration yields sharper, more expressive images.
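To make the two tuning levers concrete, the sketch below shows how a prompt can be assembled from a subject, style descriptors, and a pose, and how the CFG scale enters generation. Classifier-free guidance combines an unconditional and a prompt-conditioned prediction as uncond + scale × (cond − uncond). The function names, the example prompt text, and all numeric settings are illustrative assumptions, not values from the tests described above.

```python
from typing import List, Sequence


def build_prompt(subject: str, style_tags: Sequence[str], pose: str) -> str:
    """Join the subject, style descriptors, and pose into one prompt string."""
    parts = [subject, *style_tags, pose]
    return ", ".join(p.strip() for p in parts if p and p.strip())


def apply_cfg(uncond: Sequence[float], cond: Sequence[float],
              scale: float) -> List[float]:
    """Classifier-free guidance: uncond + scale * (cond - uncond), element-wise."""
    if len(uncond) != len(cond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Hypothetical baseline vs. tuned settings (all values are assumptions).
baseline = {
    "prompt": build_prompt("black cat mascot", [], ""),
    "steps": 20,
    "cfg_scale": 7.0,
}
tuned = {
    "prompt": build_prompt("black cat mascot",
                           ["flat illustration", "bold outlines"],
                           "jumping"),
    "steps": 30,
    "cfg_scale": 9.0,
}

print(baseline["prompt"])
print(tuned["prompt"])

# Toy two-element "predictions" to show the effect of the scale.
print(apply_cfg([0.0, 1.0], [1.0, 3.0], scale=2.0))
```

A scale of 1 returns the conditioned prediction unchanged, and larger values push the result further in the direction of the prompt, which is why this parameter trades prompt adherence against variety.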
AIPK Project Insights (Double‑11 Campaign)
The Double‑11 “AI Creation Wins Red Packet” project required sub‑second image generation for millions of users. Technical measures included:
Compressing training and inference data to reduce I/O latency.
Switching to faster sampling methods (e.g., DDIM with reduced steps).
Deploying a smaller checkpoint (e.g., 1.5 B parameters instead of 2.7 B) while preserving core style features.
Streamlining the generation pipeline to eliminate unnecessary preprocessing.
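The gain from "DDIM with reduced steps" comes from visiting only a subsequence of the timesteps the model was trained on, so each image needs fewer denoising passes. The sketch below shows one common way to pick that subsequence, using evenly spaced timesteps. The 1000 training timesteps and the 20 inference steps are assumptions chosen for illustration; the project's actual settings are not stated.

```python
from typing import List


def ddim_timesteps(num_train_steps: int, num_inference_steps: int) -> List[int]:
    """Return evenly spaced timesteps in descending order (noisiest first)."""
    if not 0 < num_inference_steps <= num_train_steps:
        raise ValueError("need 0 < num_inference_steps <= num_train_steps")
    stride = num_train_steps // num_inference_steps
    ascending = [i * stride for i in range(num_inference_steps)]
    return ascending[::-1]


# Assumed values: 1000 training timesteps, reduced to 20 sampling steps.
steps = ddim_timesteps(num_train_steps=1000, num_inference_steps=20)
print(len(steps), steps[:3], steps[-1])
```

Because each sampling step costs one forward pass of the denoising network, the step count is close to a linear multiplier on generation time, which makes it one of the first things to reduce when latency is the constraint.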
Practical Tips from the Project
Gather training images with varied colors and angles (e.g., historic Double‑11 brand posters) to improve model generalization.
Provide detailed annotations (e.g., consistent national‑style text) to guide style learning.
Experiment with different learning rates and batch sizes to balance global structure learning and fine‑grained detail.
Leverage ControlNet (Canny edge maps) to synthesize additional training pairs and enrich pose diversity.
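The annotation tip depends on tags being written the same way across the whole dataset. As a minimal sketch of one way to enforce that, the helper below builds a caption with a fixed ordering (trigger word, then style tags, then content tags), normalizes case and spacing, and drops duplicates. The trigger word `tmall_mascot`, the tag vocabulary, and the function itself are hypothetical and not taken from the project.

```python
from typing import Iterable


def build_caption(trigger: str, style_tags: Iterable[str],
                  content_tags: Iterable[str]) -> str:
    """Build a caption: trigger word first, then style tags, then content tags.

    Tags are lower-cased, whitespace-normalized, and de-duplicated while
    preserving the order in which they first appear.
    """
    seen = set()
    ordered = []
    for tag in [trigger, *style_tags, *content_tags]:
        norm = " ".join(tag.lower().split())
        if norm and norm not in seen:
            seen.add(norm)
            ordered.append(norm)
    return ", ".join(ordered)


caption = build_caption(
    "tmall_mascot",                                      # hypothetical trigger word
    ["national style"],                                  # style tag kept identical across images
    ["front view", "red background", "Front  View"],     # duplicate with messy casing
)
print(caption)
```

Generating captions through a single function like this keeps a style label from drifting into several spellings, which would otherwise split one style across multiple tags during training.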
By integrating these techniques, designers can accelerate the creation of brand‑consistent AI‑generated imagery while retaining creative flexibility.
Taobao Design
Taobao Design is a design team serving the experience of billions of global consumers: leading UX, creating designs that move people, and making business beautiful and simple.
