
AI-Driven Content Risk Control: System Evolution and Optimization at Alibaba

Alimama's AI-driven content risk platform has evolved from simple rule matching to a data-centric, serverless architecture that integrates large-model acceleration, decision-tree compilation, high-throughput vector retrieval, and elastic word matching, delivering sub-100 ms text and sub-1 s image moderation while remaining stable during peak promotional traffic.

Alimama Tech

Business Background and Challenges

Content is a key marketing carrier, and risky ads harm the platform's reputation. Alimama aims to control real-time ad changes and to quickly locate and clean billions of stored ads.

The AI wave has brought new AI-generated creative tools, introducing new risk characteristics and raising the bar for response speed, compute resources, and cost.

Challenges

Variety, volatility and fast spread of risky content.

Higher demands on detection capability, latency, resources and cost.

The challenges span four areas: ability (handling peak load after promotions), efficiency (real-time interception of AI-generated content), cost (heavy compute for images, video, and live streams), and quality (large-scale evaluation).

AI‑Driven Risk Engine Evolution

Since 2013, Alimama's content risk system has evolved through three stages:

Stage 1 – Rule‑based

Simple keyword matching, blacklists/whitelists, and basic attribute rules; limited effectiveness against variant risks.

Stage 2 – Model‑assisted

Algorithmic models are introduced, but the system remains rule-driven and manual threshold tuning becomes costly.

Stage 3 – Data + Algorithm

Data‑driven risk control with models guided by domain experts; supports custom business needs and emergency handling.

System Construction Layers

Stage 1 – Simple pipeline

Word matching, rules, and blacklists/whitelists invoked via synchronous calls.
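A Stage-1 pipeline of this kind can be sketched as a few synchronous checks; all names and word lists below are illustrative, not Alimama's actual rules:

```python
# Minimal sketch of a Stage-1 rule pipeline (hypothetical names):
# synchronous checks run in order; the first hit decides the verdict.

BLACKLIST = {"seller_123"}            # blocked accounts (illustrative)
WHITELIST = {"seller_456"}            # trusted accounts (illustrative)
BANNED_WORDS = {"fakebrand", "scam"}  # illustrative sensitive words

def moderate(ad: dict) -> str:
    """Return 'pass' or 'block' for a single ad."""
    if ad["seller"] in WHITELIST:
        return "pass"
    if ad["seller"] in BLACKLIST:
        return "block"
    text = ad["title"].lower()
    if any(word in text for word in BANNED_WORDS):
        return "block"
    return "pass"

print(moderate({"seller": "seller_789", "title": "Great scam deal"}))  # block
```

The ordering encodes the rule priority: whitelist overrides blacklist, which overrides word matching. Such fixed precedence is exactly what limits this stage against variant risks.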

Stage 2 – DAG‑based async

Model and retrieval services are added with asynchronous calls via MetaQ; a DAG exceeding 1,000 nodes becomes the performance bottleneck.

Stage 3 – Serverless split

DataFlow (sample building) and ControlFlow (sample consumption) are separated behind a gateway, with DAG-based concurrent scheduling of downstream services.
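The DAG-based concurrent scheduling can be sketched with asyncio: each node waits on its prerequisites, then independent branches (e.g. text and image models) run concurrently. The four-node graph and node names below are hypothetical:

```python
import asyncio

# Hypothetical DAG: node -> set of prerequisite nodes.
DAG = {
    "fetch": set(),
    "text_model": {"fetch"},
    "image_model": {"fetch"},
    "decision": {"text_model", "image_model"},
}

order = []  # completion order, recorded for demonstration

async def run_node(name, done_events):
    # Wait for all prerequisites before "executing" this node.
    for dep in DAG[name]:
        await done_events[dep].wait()
    await asyncio.sleep(0)  # stand-in for calling a downstream service
    order.append(name)
    done_events[name].set()

async def schedule():
    events = {n: asyncio.Event() for n in DAG}
    # Launch every node at once; dependencies gate the actual execution.
    await asyncio.gather(*(run_node(n, events) for n in DAG))

asyncio.run(schedule())
print(order)
```

Whatever order the middle two nodes finish in, `fetch` always completes first and `decision` last, which is the correctness property a DAG scheduler must guarantee.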

Large‑Model Acceleration

The model service was re-engineered around BLIP, CLIP, and XGBoost for risk filtering, with the CUDA-Graph-based Kangaroo-Engine adopted for inference acceleration.

Kangaroo‑Engine Features

Multiple captures for dynamic shapes.

GPU memory reuse across graphs.

Shape‑bucket padding to limit graph count.

Together these achieve roughly a 2× response-time (RT) reduction for BLIP on A10 GPUs.
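The shape-bucket idea is independent of the engine: inputs are padded up to the nearest of a small set of bucket shapes, so only one CUDA graph per bucket needs capturing instead of one per dynamic shape. A minimal sketch of the bucketing logic (bucket sizes are illustrative):

```python
import bisect

# Hypothetical bucket set: one CUDA graph would be captured per bucket shape.
SEQ_BUCKETS = [32, 64, 128, 256]  # illustrative sequence-length buckets

def pick_bucket(seq_len: int) -> int:
    """Smallest bucket >= seq_len; keeps the number of captured graphs small."""
    i = bisect.bisect_left(SEQ_BUCKETS, seq_len)
    if i == len(SEQ_BUCKETS):
        raise ValueError(f"sequence length {seq_len} exceeds largest bucket")
    return SEQ_BUCKETS[i]

def pad_to_bucket(tokens: list, pad_id: int = 0) -> list:
    """Pad a token sequence up to its bucket length."""
    bucket = pick_bucket(len(tokens))
    return tokens + [pad_id] * (bucket - len(tokens))

print(pick_bucket(100))            # 128
print(len(pad_to_bucket([1] * 50)))  # 64
```

The trade-off is wasted compute on the padding versus graph count: more buckets means less padding but more captured graphs and more GPU memory, which is why memory reuse across graphs matters.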

Device‑Specific Optimizations

On P100, parallel CUDA streams raise QPS by roughly 25%; on T4, TensorRT 8.6 cuts ViT block latency to about one fifth.

Traditional ML Model Acceleration

Treelite is adopted for decision-tree inference: it compiles trees into a .so shared library whose code paths suit CPU branch prediction, delivering an order-of-magnitude speedup over native XGBoost inference.
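The core idea of compiling trees into branchy code can be shown in miniature by emitting if/else source from a tree and compiling it at runtime. The tree below is a made-up example and this is not Treelite's actual code generator, which emits C and builds a shared library:

```python
# Illustrative: turn a decision tree (dict form) into straight-line
# if/else source code, analogous to how Treelite turns trees into a .so.

TREE = {  # hypothetical tiny tree: split feature, threshold, children/leaf
    "feat": 0, "thresh": 0.5,
    "left": {"leaf": 0.1},
    "right": {"feat": 1, "thresh": 2.0,
              "left": {"leaf": 0.7}, "right": {"leaf": 0.9}},
}

def emit(node, indent="    "):
    """Recursively emit the if/else body for one tree node."""
    if "leaf" in node:
        return f"{indent}return {node['leaf']}\n"
    src = f"{indent}if x[{node['feat']}] < {node['thresh']}:\n"
    src += emit(node["left"], indent + "    ")
    src += f"{indent}else:\n"
    src += emit(node["right"], indent + "    ")
    return src

source = "def predict(x):\n" + emit(TREE)
namespace = {}
exec(compile(source, "<tree>", "exec"), namespace)
predict = namespace["predict"]
print(predict([0.8, 3.0]))  # 0.9
```

Compiled branches replace the pointer-chasing of a generic tree-walk loop, which is where the branch-prediction win comes from.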

Hundred‑Billion‑Scale Retrieval Service

A unified online/offline engine built on Dolphin VectorDB supports real-time updates, high QPS, and consistent results.
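Behind any vector retrieval engine sits the same primitive: score a query against indexed vectors and return the top-k by similarity. A brute-force cosine-similarity sketch of that primitive (not Dolphin's actual implementation, which uses approximate indexing to reach this scale):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query, index, k=2):
    """index: {doc_id: vector}; return the k doc ids most similar to query."""
    scored = sorted(index, key=lambda d: cosine(query, index[d]), reverse=True)
    return scored[:k]

# Tiny illustrative index of three 2-d vectors.
index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
print(top_k([1.0, 0.1], index, k=2))  # ['a', 'c']
```

Exhaustive scoring is O(N) per query; at hundred-billion scale the engine must replace it with an approximate index while keeping online and offline results consistent, which is the point of unifying the two paths.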

Full‑Elastic Sensitive‑Word Matching Service

The service switched to the Wu–Manber algorithm for lower memory use and incremental updates, and was integrated with vector retrieval under a unified index.
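A simplified Wu–Manber matcher (block size B = 2) shows why it is memory-light: a small shift table over character blocks lets the scan skip ahead, and full patterns are only verified when the shift is zero. This is an illustrative sketch, not the production service:

```python
# Simplified Wu–Manber multi-pattern matching (block size B = 2).
# Patterns are truncated to the minimum pattern length m when building
# the shift table; full patterns are verified on a candidate hit.

def build(patterns, B=2):
    m = min(len(p) for p in patterns)
    default = m - B + 1            # safe shift for an unseen block
    shift, bucket = {}, {}
    for p in patterns:
        prefix = p[:m]
        for j in range(B - 1, m):  # block ending at index j of the prefix
            block = prefix[j - B + 1 : j + 1]
            shift[block] = min(shift.get(block, default), m - 1 - j)
        # Patterns indexed by the last block of their m-char prefix.
        bucket.setdefault(prefix[m - B : m], []).append(p)
    return m, default, shift, bucket

def search(text, patterns, B=2):
    m, default, shift, bucket = build(patterns, B)
    hits, i = [], m - 1
    while i < len(text):
        block = text[i - B + 1 : i + 1]
        s = shift.get(block, default)
        if s:                      # block cannot end any pattern prefix: skip
            i += s
            continue
        start = i - m + 1          # zero shift: verify candidate patterns
        for p in bucket.get(block, []):
            if text.startswith(p, start):
                hits.append((start, p))
        i += 1
    return hits

print(search("buy fakebrand now scam alert", ["fakebrand", "scam"]))
```

Because the index is just two small hash tables, adding a pattern only touches a handful of entries, which is what makes incremental updates cheap compared with rebuilding a monolithic automaton.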

Cloud‑Native DevOps Management

A unified operations platform manages the model, retrieval, and word-matching services, with plugin-style releases, logical resource pools, and automated scaling.

Business Support and Future Outlook

The platform supports moderation of AI-generated content with sub-100 ms text and sub-1 s image latency, and stays stable during major promotions; future plans include further performance gains, greater elasticity, and unified online/near-line/offline services.

Tags: AI, DevOps, model acceleration, content moderation, large-scale retrieval, risk control
Written by Alimama Tech, the official Alimama tech channel showcasing Alimama's technical innovations.
