7-Step Architecture Framework for AI Product Management: A Hands‑On Case Study
This article walks through a real‑world AI image‑generation system for cross‑border e‑commerce product photography, covering business pain points, stakeholder analysis, technical selection, MVP scoping, architecture decisions, metric funnels, gray‑release strategy, and continuous evolution, ultimately cutting per‑image cost to under ¥0.5 and delivery time to one minute.
Business Baseline and Pain Point Definition
Baseline measurements of the cross‑border e‑commerce visual‑asset pipeline:
Cost per SKU main image: ¥65 (including logistics, studio scheduling, scene setup, photography and post‑processing).
Cycle time: 3–5 days average, up to 9 days during peak sales.
Capacity: ~10,000 images/month, costing >¥600,000, with slow cadence wasting high‑frequency testing opportunities.
Project goal: reduce per‑image cost to ≤¥1, compress delivery to minute‑level, and keep image realism at or above the commercial conversion baseline.
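As a sanity check on the baseline figures above (my own back-of-envelope arithmetic, not data from the project), the monthly spend and the targeted reduction work out as follows:

```python
# Back-of-envelope check of the baseline figures (illustrative only).
COST_PER_IMAGE = 65        # ¥ per SKU main image, traditional pipeline
MONTHLY_VOLUME = 10_000    # images per month
TARGET_COST = 1            # ¥ per image, project goal

baseline_monthly = COST_PER_IMAGE * MONTHLY_VOLUME   # ¥650,000 (consistent with the ">¥600,000" figure)
target_monthly = TARGET_COST * MONTHLY_VOLUME        # ¥10,000
reduction = 1 - target_monthly / baseline_monthly    # fractional cost reduction at the ¥1 target

print(baseline_monthly, target_monthly, round(reduction, 3))
```

At the stated volume, hitting the ¥1 target removes roughly 98.5% of the visual-asset budget, which is why the project tolerates heavy upfront R&D later in the article.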
Requirement Decomposition & Stakeholder Analysis
Root‑cause analysis (5 Why)
Why is it slow and expensive? Each image requires on‑site shooting and manual retouch.
Why on‑site shooting? Different SKUs need non‑standard, high‑quality scene tones.
Why can’t digital tools replace the scene? Traditional cut‑out + paste cannot reproduce global illumination, contact shadows, and correct perspective.
Core conclusion: the missing capability is a low‑cost way to reconstruct realistic physical lighting. Generating 1,000 images in one minute is useless if the physics are wrong, because conversion rates collapse.
Stakeholder demand matrix (internal users) guided later decisions.
Architecture decision: strip out B‑class user requirements and make the MVP fully compatible with the efficient pipeline of A‑class operations, because large models are probabilistic and conflict with deterministic industrial needs.
Technical Selection & Feasibility Assessment
SaaS/API evaluation (e.g., Photoroom, Midjourney API)
General models suffer severe feature‑generalization issues on high‑reflectivity or complex hollow‑structure products, yielding a blind‑test usable rate of roughly 30%.
Token‑based SaaS pricing scales linearly with call volume; beyond 100k monthly calls the bill becomes prohibitive, and uploading raw product images to a third party creates compliance risk.
Open‑source native stack (Stable Diffusion WebUI)
Compute cost is low, but front‑end pipelines like ComfyUI are far too arcane for business users; requiring operators to master parameters such as CFG Scale or Denoising Strength would drive adoption to zero.
Breakthrough path: develop a private vertical fine‑tuned LoRA model and encapsulate interaction.
Technical route: deploy a private large‑model base, use high‑conversion reference images, and train LoRA weights for home‑goods and pet categories.
Expected benefit: initial hardware investment ≈ ¥300k; after model convergence, inference availability >70%; per‑image compute cost ¥0.3–¥0.5.
Decision basis: accept heavy upfront R&D to gain absolute data safety, high availability, and long‑term marginal cost decline.
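The "long‑term marginal cost decline" claim can be made concrete with a simple amortization sketch (my own illustrative math, using the per‑image figures stated above and the upper end of the compute estimate):

```python
# Illustrative break-even on the private-deployment hardware (not project data).
import math

HARDWARE = 300_000         # ¥ upfront hardware investment
OLD_COST = 65              # ¥ per image via the traditional pipeline
NEW_COST = 0.5             # ¥ per image, upper end of the compute estimate
MONTHLY_VOLUME = 10_000    # images per month

saving_per_image = OLD_COST - NEW_COST                    # ¥64.5 saved per generated image
breakeven_images = math.ceil(HARDWARE / saving_per_image) # images needed to pay off the hardware
months_to_breakeven = breakeven_images / MONTHLY_VOLUME

print(breakeven_images, round(months_to_breakeven, 2))
```

Under these assumptions the hardware pays for itself in under half a month of normal volume, which is what makes the "heavy upfront R&D" trade acceptable.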
MVP Boundary Definition & Category Control
Risk of “trying to do everything” – training a universal model would cause severe feature dilution.
Core category isolation:
In‑Scope (do): Best‑selling home and pet items – visual features are highly uniform and tolerant to lighting variations, enabling rapid early convergence and building business trust.
Out‑Scope (postpone): Apparel and 3C – clothing involves complex skeletal constraints and fabric folds that trigger the uncanny valley; 3C metal/plastic textures easily pollute the latent space. Both are deferred to a next‑generation dedicated model.
Do: Automated background removal, preset high‑frequency commercial scene library, asynchronous high‑concurrency generation queue.
Don’t: Custom prompt input UI, local retouch masks, AI face swap (would scramble compute allocation and break the minimalist focus).
Product Architecture & Interaction Abstraction
Large‑model logic is probabilistic, but B‑side products require certainty. The core task is to completely mask backend randomness at the front‑end interaction layer.
When a user clicks a thumbnail (e.g., “Nordic Morning”), the backend silently executes a complex schedule: concatenate dozens of positive prompt keywords, attach a curated negative‑prompt library, inject the category‑specific lighting LoRA weight, adjust the step count, and finally return an image. This encapsulation delivers a “one‑click, instant‑image” industrial experience.
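A minimal sketch of what such a preset layer might look like. The scene names, prompt fragments, LoRA names, and parameter values below are hypothetical stand‑ins; the real system's schedule is presumably far more elaborate:

```python
# Hypothetical preset table: one user-facing thumbnail maps to a full generation schedule.
SCENE_PRESETS = {
    "nordic_morning": {
        "positive": ["scandinavian interior", "soft morning light", "oak table",
                     "global illumination", "contact shadows", "photorealistic"],
        "negative": ["cartoon", "deformed", "extra limbs", "watermark", "oversaturated"],
        "lora": {"name": "home_goods_light_v2", "weight": 0.8},  # category lighting LoRA (name invented)
        "steps": 28,
        "cfg_scale": 6.5,
    },
}

def build_request(scene_key: str, product_image: str) -> dict:
    """Expand a one-click scene choice into a backend generation payload."""
    p = SCENE_PRESETS[scene_key]
    return {
        "prompt": ", ".join(p["positive"]),
        "negative_prompt": ", ".join(p["negative"]),
        "loras": [p["lora"]],
        "steps": p["steps"],
        "cfg_scale": p["cfg_scale"],
        "init_image": product_image,  # cut-out product composited by the pipeline
        "n_images": 4,                # four candidates per request (see metric funnel)
    }

req = build_request("nordic_morning", "sku_12345.png")
print(req["prompt"].split(", ")[0])  # "scandinavian interior"
```

The point of the design is that the operator only ever sees the thumbnail; every field in the payload is owned by the product team, not the user.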
Metric Funnel & Risk‑Control System
Before writing code, a three‑layer data funnel was embedded to monitor model health and calculate commercial ROI.
Why a 72% usable‑rate target? It derives from UX experience and compute‑ROI math. Generating four images per request at a >70% success rate yields an average of 2.8 qualified images, guaranteeing the “no‑re‑draw, instant‑pick” baseline. Pushing the target to 90% would cause labeling cost and compute consumption to rise steeply, destroying economic viability.
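The arithmetic behind this target is easy to verify (my own calculation from the stated numbers, assuming each of the four images succeeds independently):

```python
# Illustrative math behind the 72% usable-rate target (independence is assumed).
p = 0.72   # per-image usable probability at the target
n = 4      # images generated per request

expected_usable = n * p            # average qualified images per request
p_at_least_one = 1 - (1 - p) ** n  # chance a request yields at least one usable image

print(round(expected_usable, 2), round(p_at_least_one, 4))
```

At the 72% target a request almost never comes back empty (>99% chance of at least one usable image), which is what the “no‑re‑draw, instant‑pick” experience actually depends on.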
Data‑Driven Gray‑Release Strategy
Adopted a “3‑3‑1 gray‑release rule” with data convergence as an absolute gate.
Gate 0 (internal seed, 20 users): Target low‑requirement bulk‑upload operators to probe model resilience on extreme long‑tail data and collect the first batch of structural bad cases for baseline tuning.
Gate 1 (expanded, 80 users): Introduce high‑ticket boutique teams. If L2 usable‑rate falls below 72 %, immediately throttle volume and supplement training data for the problematic material.
Gate 2 (full rollout): Open to all business lines once L2 stays above the threshold for five consecutive workdays, then release a standardized SOP manual.
Hard stop‑loss line: if the core category’s usable‑rate remains below 60% after ten weeks, the project is terminated and the workflow reverts to traditional outsourcing.
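The gate and throttle rules above can be sketched as a small check. The threshold comes from the article; the function names and window handling are illustrative:

```python
# Sketch of the gray-release promotion and throttle rules (names are illustrative).
THRESHOLD = 0.72      # L2 usable-rate bar from the metric funnel
REQUIRED_STREAK = 5   # consecutive workdays required for Gate 2

def ready_for_full_rollout(daily_usable_rates: list[float]) -> bool:
    """Gate-2 rule: the last five workday readings must all clear the threshold."""
    if len(daily_usable_rates) < REQUIRED_STREAK:
        return False
    return all(r >= THRESHOLD for r in daily_usable_rates[-REQUIRED_STREAK:])

def should_throttle(rate: float) -> bool:
    """Gate-1 rule: throttle volume when the usable-rate dips below the bar."""
    return rate < THRESHOLD

print(ready_for_full_rollout([0.70, 0.73, 0.74, 0.75, 0.73, 0.76]))  # True
print(should_throttle(0.69))                                          # True
```

Encoding the gates as explicit predicates keeps the rollout decision mechanical rather than a judgment call, which is the whole point of "data convergence as an absolute gate."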
Continuous Architecture Evolution & Artifact Archiving
Post‑launch, a mapping route from user‑facing complaints to engineering, inference, and training layers drives ongoing improvement:
Performance layer: “I want to upload 50 items in batch” → triggers an engineering iteration that adds an asynchronous bulk‑upload module.
Probability layer: “Some edges look too harsh” → drives an inference iteration that introduces a low‑cost ControlNet depth‑map constraint.
Feature layer: “A new cup material renders completely wrong” → feeds the training layer; the bad case is added to a pool, and once a threshold is reached, a LoRA weight fine‑tune is launched.
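The three-layer routing above can be approximated with a trivial keyword router. This is a deliberately naive sketch; the keyword lists are invented stand-ins for whatever classifier the real team would use:

```python
# Naive sketch of complaint-to-layer routing (keyword lists are invented).
ROUTES = {
    "engineering": ["batch", "upload", "slow", "queue"],        # performance layer
    "inference":   ["edge", "harsh", "shadow", "blend"],        # probability layer
    "training":    ["material", "texture", "renders wrong"],    # feature layer
}

def route_complaint(text: str) -> str:
    """Return which layer should own a complaint, defaulting to manual triage."""
    lowered = text.lower()
    for layer, keywords in ROUTES.items():
        if any(k in lowered for k in keywords):
            return layer
    return "triage"

print(route_complaint("I want to upload 50 items in batch"))           # engineering
print(route_complaint("A new cup material renders completely wrong"))  # training
```

Even a rough router like this makes the evolution loop operational: each complaint lands in a backlog owned by exactly one layer instead of a shared inbox.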
Final outcome: per‑image total cost dropped to ¥0.5, delivery time compressed from three days to one minute, business penetration exceeded 65%, and 30 designers were freed from manual shooting to focus on brand‑creative design.
Appendix: Core Deliverables of a Senior AI PM
Stakeholder‑pain analysis matrix – anchors the “physical reconstruction” first‑principle.
Technical selection and ROI accounting model – tallies compute cost, hardware amortization, and labor substitution.
MVP category & feature control table (In/Out matrix) – defines data the model cannot consume and rejects non‑standard pseudo‑requirements.
Data funnel & monitoring definition document – establishes L1‑L3 metric definitions, anchoring a 72 % usable‑rate north star.
Gray‑release & circuit‑breaker plan – specifies three‑stage scaling triggers and a hard stop‑loss.
Architecture evolution roadmap – maps user complaints to engineering, inference, and training optimizations.
