AI-Driven Content Risk Control: System Evolution and Optimization at Alibaba
Alimama's AI‑driven content risk platform has evolved from simple rule‑matching to a data‑centric, serverless architecture that integrates large‑model acceleration, decision‑tree compilation, high‑throughput vector retrieval and elastic word‑matching, delivering sub‑100 ms text and sub‑1 s image moderation while remaining stable during peak promotional traffic.
Business Background and Challenges
Content is a key marketing carrier; risky ads can harm platform reputation. Alimama aims to intercept risky real‑time ad changes and to quickly locate and clean up billions of stored ads.
The AI wave has introduced new AI‑generated creative tools, bringing new risk characteristics and raising the requirements on response speed, compute resources and cost.
Challenges
Variety, volatility and fast spread of risky content.
Higher demands on detection capability, latency, resources and cost.
Capability challenges (peak load after major promotions), efficiency challenges (real‑time interception of AI‑generated content), cost challenges (heavy compute for image, video and live‑stream moderation), and quality challenges (large‑scale evaluation).
AI‑Driven Risk Engine Evolution
Since 2013, Alimama's content risk system has evolved through three stages:
Stage 1 – Rule‑based
Simple keyword, blacklist/whitelist, and basic attribute rules; limited against variant risks.
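Stage‑1 style checks of this kind can be sketched in a few lines. The names below (`check_ad`, the keyword and account lists) are illustrative, not Alimama's actual API:

```python
# Minimal sketch of Stage-1 rule matching: blacklist/whitelist lookups
# plus naive keyword hits, combined into a single verdict.
BLACKLIST = {"seller_9527"}                   # hypothetical banned accounts
WHITELIST = {"seller_0001"}                   # trusted accounts skip keywords
RISK_KEYWORDS = ["free money", "guaranteed cure"]

def check_ad(seller_id: str, text: str) -> str:
    """Return 'block', 'pass', or 'review' for a single ad."""
    if seller_id in BLACKLIST:
        return "block"
    if seller_id in WHITELIST:
        return "pass"
    lowered = text.lower()
    if any(kw in lowered for kw in RISK_KEYWORDS):
        return "review"
    return "pass"
```

The weakness the article notes is visible here: a trivially obfuscated variant ("fr.ee mon.ey") slips past the substring match entirely.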
Stage 2 – Model‑assisted
Introduce algorithmic models; still rule‑driven, but manual threshold tuning becomes costly.
Stage 3 – Data + Algorithm
Data‑driven risk control with models guided by domain experts; supports custom business needs and emergency handling.
System Construction Layers
Stage 1 – Simple pipeline
Word matching, rules, blacklist/whitelist via synchronous calls.
Stage 2 – DAG‑based async
Model and retrieval services added; asynchronous calls via the MetaQ message queue; a DAG with more than 1,000 nodes becomes a performance bottleneck.
Stage 3 – Serverless split
Separate DataFlow (sample building) and ControlFlow (sample consumption) with a gateway; DAG‑based concurrent scheduling of downstream services.
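The DAG‑based concurrent scheduling described for ControlFlow can be sketched with `asyncio`: each downstream service runs as soon as all of its dependencies finish. The graph and node names here are hypothetical stand‑ins for real services:

```python
import asyncio

async def run_dag(graph, tasks):
    """graph: node -> set of dependency nodes; tasks: node -> async fn.

    Each node's task receives a dict of its dependencies' results and
    starts the moment the last dependency completes, so independent
    branches of the DAG run concurrently.
    """
    results = {}
    events = {n: asyncio.Event() for n in graph}

    async def run_node(node):
        for dep in graph[node]:
            await events[dep].wait()          # block until each dep is done
        results[node] = await tasks[node]({d: results[d] for d in graph[node]})
        events[node].set()                    # unblock downstream nodes

    await asyncio.gather(*(run_node(n) for n in graph))
    return results
```

For example, hypothetical `ocr` and `nlp` nodes with no dependencies run in parallel, and a `fuse` node that depends on both runs only after they complete.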
Large‑Model Acceleration
Re‑engineered model service; adopted BLIP, CLIP, and XGB for risk filtering. Chose CUDA‑Graph based Kangaroo‑Engine for inference acceleration.
Kangaroo‑Engine Features
Multiple captures for dynamic shapes.
GPU memory reuse across graphs.
Shape‑bucket padding to limit graph count.
Achieves ~2× RT reduction for BLIP on A10.
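The shape‑bucket idea above can be sketched simply: dynamic sequence lengths are rounded up to a small fixed set of buckets, so the engine only needs one captured CUDA graph per bucket rather than one per distinct input shape. The bucket sizes below are illustrative:

```python
# Sketch of shape-bucket padding: bound the number of captured graphs
# by padding every input up to the nearest of a few fixed lengths.
BUCKETS = [16, 32, 64, 128]                   # assumed bucket sizes

def bucket_for(seq_len: int) -> int:
    """Smallest bucket that fits seq_len; the largest bucket otherwise."""
    for b in BUCKETS:
        if seq_len <= b:
            return b
    return BUCKETS[-1]

def pad_to_bucket(tokens: list, pad_id: int = 0) -> list:
    """Pad a token sequence up to its bucket length with pad_id."""
    target = bucket_for(len(tokens))
    return tokens + [pad_id] * (target - len(tokens))
```

The trade‑off is wasted compute on padding versus graph count: fewer buckets mean fewer captures but more padded tokens per request.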
Device‑Specific Optimizations
On the P100, parallel CUDA streams raise QPS by ~25%; on the T4, TensorRT 8.6 delivers a 5× latency reduction for ViT blocks.
Traditional ML Model Acceleration
Adopt Treelite for decision‑tree inference; compiles trees to .so for branch prediction, delivering order‑of‑magnitude speedup over XGBoost.
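The intuition behind compiling trees (as Treelite does when emitting a native .so) can be shown with a toy Python analogue: instead of a generic loop chasing node records, the tree is unrolled into nested if/else branches that the compiler and branch predictor can optimize. The tree below is a stand‑in, not a real XGBoost model:

```python
# Toy decision tree as nested dicts: internal nodes split on a feature,
# leaves carry the prediction.
TREE = {"feat": 0, "thr": 0.5,
        "left": {"leaf": -1.0},
        "right": {"feat": 1, "thr": 2.0,
                  "left": {"leaf": 0.3}, "right": {"leaf": 1.2}}}

def predict_interpreted(node, x):
    """Generic traversal: follow pointers until a leaf is reached."""
    while "leaf" not in node:
        node = node["left"] if x[node["feat"]] < node["thr"] else node["right"]
    return node["leaf"]

def compile_tree(node, indent="    "):
    """Emit the tree as straight-line if/else source code."""
    if "leaf" in node:
        return f"{indent}return {node['leaf']}\n"
    return (f"{indent}if x[{node['feat']}] < {node['thr']}:\n"
            + compile_tree(node["left"], indent + "    ")
            + f"{indent}else:\n"
            + compile_tree(node["right"], indent + "    "))

src = "def predict_compiled(x):\n" + compile_tree(TREE)
exec(src, globals())                # defines predict_compiled
```

Treelite performs the same transformation but emits C compiled to a shared library, which is where the order‑of‑magnitude speedup over generic XGBoost traversal comes from.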
Hundred‑Billion‑Scale Retrieval Service
Unified online/offline engine based on Dolphin VectorDB; supports real‑time updates, high QPS, and consistent results.
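As a toy illustration of the primitive such an engine serves, here is exact top‑k retrieval by cosine similarity over an in‑memory index; a production engine like the Dolphin VectorDB described here would use approximate indexes and real‑time update paths instead:

```python
import heapq
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(index, query, k=2):
    """index: id -> embedding. Returns [(id, score)] best-first."""
    return heapq.nlargest(k, ((i, cosine(query, v)) for i, v in index.items()),
                          key=lambda t: t[1])
```

Because the index is a plain dict, real‑time updates are just key writes; the hard part the article's unified engine solves is doing this consistently at hundred‑billion scale and high QPS.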
Full‑Elastic Sensitive‑Word Matching Service
Switch to Wu‑Manber algorithm for lower memory and incremental updates; integrate with vector retrieval for unified indexing.
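A simplified Wu‑Manber matcher (block size B=2, considering only each pattern's first m characters, where m is the shortest pattern length) shows why it suits this service: one shift table covers the whole sensitive‑word list, and adding a word only touches a few table entries. This is a didactic sketch, not the production implementation:

```python
B = 2  # block size

def build_tables(patterns):
    """SHIFT table plus a bucket of patterns keyed by their m-th-position block."""
    m = min(len(p) for p in patterns)         # scan window = shortest pattern
    shift, bucket = {}, {}
    for p in patterns:
        for i in range(m - B + 1):
            block = p[i:i + B]
            # A block ending at position i+B allows a shift of m-(i+B).
            shift[block] = min(shift.get(block, m - B + 1), m - B - i)
        bucket.setdefault(p[m - B:m], []).append(p)
    return m, shift, bucket

def search(text, patterns):
    """Return [(position, pattern)] for every match of any pattern."""
    m, shift, bucket = build_tables(patterns)
    hits, pos = [], 0
    while pos + m <= len(text):
        block = text[pos + m - B:pos + m]     # block at the window's end
        s = shift.get(block, m - B + 1)       # unseen block: maximal shift
        if s == 0:                            # candidate window: verify
            for p in bucket.get(block, []):
                if text.startswith(p, pos):
                    hits.append((pos, p))
            s = 1
        pos += s
    return hits
```

Unlike an Aho–Corasick automaton, which must be rebuilt or carefully patched when the word list changes, the shift and bucket tables here support cheap incremental updates and stay compact in memory, matching the motivations given above.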
Cloud‑Native DevOps Management
Unified operation platform for model, retrieval, and word services; plugin‑style release, logical resource pools, and automated scaling.
Business Support and Future Outlook
Supports AI‑generated content moderation with sub‑100 ms text and sub‑1 s image latency; stable during major promotions; plans for further performance, elasticity, and unified online‑near‑offline services.
Alimama Tech
Official Alimama tech channel, showcasing all of Alimama's technical innovations.