How Alibaba Detects ‘Disgusting’ Images on Taobao with AI

This article describes Alibaba's AI system for automatically filtering nauseating product images on Taobao, covering challenges such as cold‑start, class imbalance, and diverse visual features, and detailing solutions like semi‑supervised learning, active learning, OHEM‑cascade, attention mechanisms, and the resulting business impact.

Alibaba Cloud Developer

Dec 3, 2019

Abstract

Disgusting images (those that cause nausea) fall into three categories: animal-related, human-related, and object-related. Detecting them among billions of product images poses three technical challenges: very few initial samples (cold-start), severe class imbalance, and a highly diverse feature distribution.

Cold‑Start Solution

We expanded the training set by retrieving hundreds of disgusting images from Taobao’s content pool with a small-sample retrieval platform, then applied semi-supervised learning with a lightweight MobileNet-V2 backbone. The network output is modeled as Y = f_W(X), where W denotes the network parameters and X the input image. In the mean-teacher framework, the student network is updated by gradient descent, while the teacher network's parameters track an exponential moving average of the student's parameters.
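The teacher update described above can be sketched in a few lines. This is a minimal illustration, not the production code: the flat weight list stands in for W, and the decay value, the function names, and the mean-squared consistency term are assumptions chosen for the example.

```python
def ema_update(teacher, student, decay=0.99):
    """Move each teacher weight toward the matching student weight (EMA)."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]

def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between student and teacher predictions.

    On unlabeled images this term pushes the student to agree with the
    teacher; using MSE here is an assumption of this sketch.
    """
    diffs = [(s - t) ** 2 for s, t in zip(student_probs, teacher_probs)]
    return sum(diffs) / len(diffs)

# Toy parameters standing in for W in Y = f_W(X).
student_w = [1.0, -2.0, 0.5]
teacher_w = [0.0, 0.0, 0.0]

# One step: the student would first be updated by gradient descent,
# then the teacher follows it with an exponential moving average.
teacher_w = ema_update(teacher_w, student_w, decay=0.9)
loss = consistency_loss([0.8, 0.2], [0.6, 0.4])
```

Because the teacher averages many student states, its predictions tend to be more stable than the student's, which is what makes it usable as a target on unlabeled images.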

Algorithm Iteration Process

Online data exhibits an extreme positive-negative imbalance: positives make up far less than 0.1% of images. To improve the model, we combined active learning, noise-sample identification, and online hard example mining (OHEM) with cascade training.

Active learning selects samples whose predicted confidence falls between two thresholds, that is, the samples the model is least sure about, and prioritises these hard examples for annotation.
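A minimal sketch of this selection step follows. The threshold values, the annotation budget, and the rule of ranking by distance from 0.5 are assumptions made for illustration; the article does not state the values or the ordering actually used.

```python
def select_for_annotation(scores, low=0.3, high=0.7, budget=2):
    """Return ids whose score lies in [low, high], most uncertain first."""
    uncertain = [
        (abs(p - 0.5), sample_id)
        for sample_id, p in scores.items()
        if low <= p <= high
    ]
    uncertain.sort()  # smallest distance from 0.5 = least confident
    return [sample_id for _, sample_id in uncertain[:budget]]

# Hypothetical model confidences that an image is "disgusting".
scores = {
    "img_a": 0.02,
    "img_b": 0.55,
    "img_c": 0.97,
    "img_d": 0.35,
    "img_e": 0.48,
}
picked = select_for_annotation(scores)
```

Confident predictions such as `img_a` and `img_c` never reach annotators, so labelling effort goes to the images most likely to change the decision boundary.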

The loss-prediction (LP) module is attached to the target-prediction (TP) module and learns, for each pair of samples, which one has the larger loss. Samples with a high predicted loss are treated as difficult and selected first.
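One common way to train a module on relative loss is a pairwise margin ranking objective. The sketch below assumes that formulation; the article does not give the exact loss, and the function names and margin value are hypothetical.

```python
def pairwise_ranking_loss(pred_i, pred_j, true_i, true_j, margin=1.0):
    """Penalise the LP module when it orders a pair against the true losses."""
    sign = 1.0 if true_i > true_j else -1.0
    return max(0.0, -sign * (pred_i - pred_j) + margin)

def rank_by_predicted_loss(predicted):
    """Sort sample ids from highest to lowest predicted loss."""
    return sorted(predicted, key=predicted.get, reverse=True)

# Pair ordered correctly with a gap larger than the margin: no penalty.
ok = pairwise_ranking_loss(pred_i=2.0, pred_j=0.5, true_i=1.2, true_j=0.3)

# Same pair with the predictions swapped: the ordering is wrong.
bad = pairwise_ranking_loss(pred_i=0.5, pred_j=2.0, true_i=1.2, true_j=0.3)

hardest_first = rank_by_predicted_loss({"a": 0.1, "b": 0.9, "c": 0.4})
```

Only the ordering of the predictions matters under this objective, so the LP module does not need to match the scale of the true loss, which changes as training progresses.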

The OHEM-cascade scheme replaces MobileNet-V2 with DenseNet-161 as the first-stage classifier and feeds that stage's highest-loss samples to a second-stage ResNet-50 classifier; at inference, the predictions of both stages are combined.
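The two ideas in this step, mining the highest-loss samples and combining two stages, can be sketched as follows. The keep ratio, the binary cross-entropy loss, and the weighted average used to merge the stage scores are all assumptions; the article says only that both predictions are combined.

```python
import math

def binary_cross_entropy(p, y, eps=1e-7):
    """Per-sample loss for predicted probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def mine_hard_examples(probs, labels, keep_ratio=0.5):
    """Return indices of the highest-loss samples from the first stage."""
    losses = [binary_cross_entropy(p, y) for p, y in zip(probs, labels)]
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    keep = max(1, int(len(order) * keep_ratio))
    return order[:keep]

def cascade_predict(p_stage1, p_stage2, weight=0.5):
    """Combine the stage scores; weighted averaging is an assumption."""
    return weight * p_stage1 + (1.0 - weight) * p_stage2

# Hypothetical first-stage outputs and ground-truth labels.
stage1_probs = [0.9, 0.2, 0.6, 0.1]
labels = [1, 1, 0, 0]

hard = mine_hard_examples(stage1_probs, labels)  # train stage two on these
final_score = cascade_predict(0.8, 0.4)
```

Training the second stage only on samples the first stage gets wrong or nearly wrong concentrates its capacity on the rare, confusing cases that dominate the error in a heavily imbalanced stream.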

Attention Mechanism

Since disgusting cues are often localized, we embed a CBAM‑style attention block (channel + spatial) into the backbone. Grad‑CAM visualisations confirm that the network focuses on the nauseating regions.
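A channel-plus-spatial attention block can be sketched in plain Python on nested lists shaped channels x height x width. This is a simplified illustration: the weights are hypothetical, and the spatial gate mixes the two pooled maps point by point instead of applying the 7x7 convolution used in the CBAM design.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mlp(vec, w1, w2):
    """Shared two-layer MLP with ReLU, applied to a pooled channel vector."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

def channel_attention(fmap, w1, w2):
    """Scale each channel using its average- and max-pooled descriptors."""
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]
    mx = [max(map(max, ch)) for ch in fmap]
    gates = [sigmoid(a + m) for a, m in zip(mlp(avg, w1, w2), mlp(mx, w1, w2))]
    return [[[g * v for v in row] for row in ch] for ch, g in zip(fmap, gates)]

def spatial_attention(fmap, w_avg, w_max):
    """Scale each location using channel-wise average and max values.

    Simplification: a pointwise mix replaces CBAM's 7x7 convolution.
    """
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    out = [[[0.0] * w for _ in range(h)] for _ in range(c)]
    for i in range(h):
        for j in range(w):
            column = [fmap[k][i][j] for k in range(c)]
            gate = sigmoid(w_avg * sum(column) / c + w_max * max(column))
            for k in range(c):
                out[k][i][j] = gate * fmap[k][i][j]
    return out

# Toy feature map: 2 channels, 2x2 spatial grid, hypothetical weights.
fmap = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 2.0]]]
w1 = [[1.0, 1.0]]    # reduces 2 channels to 1 hidden unit
w2 = [[1.0], [1.0]]  # expands back to 2 channels
refined = spatial_attention(channel_attention(fmap, w1, w2), 1.0, 1.0)
```

Both gates lie strictly between 0 and 1, so the block can only rescale features; locations and channels with strong activations keep more of their signal than weak ones.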

Business Impact

Deployed in Taobao’s “Guess You Like” (首猜) recommendation feed, the model scans the entire product pool and achieves 95% precision and 94% recall. It filtered millions of low-quality images during the Double-11 shopping festival, reduced manual review workload by about 70%, and supported a multi-task platform for broader image-quality detection.
