How Alibaba’s DAMO Lab Revolutionizes Image Cutout with AI‑Powered Matting

Alibaba's DAMO Academy details its AI‑driven image cutout system, describing why automated matting is needed, the four‑module pipeline (filtering, classification, detection, segmentation), architectural innovations such as dual decoders and fusion networks, and how these advances enable product‑level batch background removal.

Alibaba Cloud Developer

Since Alibaba announced its DAMO Academy, the institute has attracted attention for its high-end, seemingly mysterious research. This article explores how the team turned that "mysterious" expertise into practical image cutout technology.

Why start research on cutout? Alibaba Intelligent Design Lab created the design product Luban to automate the generation of banners, posters, and venue images and so improve design efficiency. Manual fine-grained cutout of a single person can take over two hours, which makes it a labor-intensive task that AI can take over.

In recent years, image matting algorithms have shown significant commercial value across entertainment, e-commerce, online education, and more. Existing methods, however, struggle with hair-level detail and generalize poorly to diverse e-commerce scenes.

Challenges and solutions

The team built a system covering four modules: filtering, classification, detection, and segmentation.

Filtering removes low‑quality images (dark, over‑exposed, blurry, occluded) using classification models and basic image algorithms.

Classification identifies the product category (e.g., cosmetics vs. toys) and scene type (human head, animal) of each image, so that each can be handled by a segmentation model designed for it.

Detection adds a detection‑and‑crop step before segmentation to handle multi‑object, multi‑category images with redundant elements such as text, logos, or decorative graphics.

Segmentation first produces a coarse mask and then refines it into a fine mask to achieve precise, hair-level segmentation, improving both speed and accuracy.
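The filtering and detect-and-crop stages described above can be sketched with basic image operations. This is a minimal illustration only: the function names, the thresholds, and the Laplacian-variance blur measure are assumptions chosen for the example, not details from the DAMO system.

```python
import numpy as np

def quality_flags(gray, dark_thresh=40.0, bright_thresh=215.0, blur_thresh=50.0):
    """Flag a grayscale image (2-D array, values 0..255) as dark, over-exposed, or blurry.

    All thresholds are illustrative assumptions, not values from the article.
    """
    gray = np.asarray(gray, dtype=np.float64)
    mean = gray.mean()
    # 4-neighbour Laplacian on interior pixels; low variance suggests a blurry image.
    lap = (gray[:-2, 1:-1] + gray[2:, 1:-1] + gray[1:-1, :-2] + gray[1:-1, 2:]
           - 4.0 * gray[1:-1, 1:-1])
    return {
        "dark": bool(mean < dark_thresh),
        "over_exposed": bool(mean > bright_thresh),
        "blurry": bool(lap.var() < blur_thresh),
    }

def crop_to_box(image, box, pad=0.1):
    """Crop image (H, W[, C]) to a detected box (x0, y0, x1, y1), enlarged by relative padding."""
    h, w = image.shape[:2]
    x0, y0, x1, y1 = box
    dx, dy = pad * (x1 - x0), pad * (y1 - y0)
    # Expand the box, then clip it to the image bounds.
    x0 = max(0, int(np.floor(x0 - dx)))
    y0 = max(0, int(np.floor(y0 - dy)))
    x1 = min(w, int(np.ceil(x1 + dx)))
    y1 = min(h, int(np.ceil(y1 + dy)))
    return image[y0:y1, x0:x1]
```

In a pipeline of this shape, images flagged by `quality_flags` would be dropped first, and the crop from `crop_to_box` would be passed to the segmentation model so that text, logos, and decorative graphics outside the detected object are excluded.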

How to make the effect more precise? Segmentation is the weakest link, so the team focused on three improvements:

Classification model: an automated tool with AutoML‑style search optimizes parameters and models under limited GPU resources, reducing manual effort.

Evaluation model: treats matting as a regression problem and uses traditional algorithms for over‑exposure and darkness detection.

Detection model: adopts a Feature Pyramid Network (FPN) architecture, fusing features across pyramid levels and designing scale‑aware anchors to boost small‑object recall.
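The article does not specify how the scale-aware anchors are designed. As a rough sketch of the conventional FPN-style approach, each pyramid level can be given anchors whose size grows with that level's stride, so the finest level carries the smallest anchors and can match small objects. The strides, the `scale` multiplier, the aspect ratios, and the function names below are all assumptions for illustration.

```python
import numpy as np

def level_anchors(feat_h, feat_w, stride, base_size, ratios=(0.5, 1.0, 2.0)):
    """Return (feat_h * feat_w * len(ratios), 4) anchors as (x0, y0, x1, y1)."""
    ratios = np.asarray(ratios, dtype=np.float64)
    # Keep area = base_size**2 while varying the aspect ratio r = height / width.
    ws = base_size / np.sqrt(ratios)
    hs = base_size * np.sqrt(ratios)
    # Anchor centres sit at the centre of each feature-map cell, in image coordinates.
    cx = (np.arange(feat_w) + 0.5) * stride
    cy = (np.arange(feat_h) + 0.5) * stride
    cx, cy = np.meshgrid(cx, cy)  # both have shape (feat_h, feat_w)
    cx = cx.reshape(-1, 1)
    cy = cy.reshape(-1, 1)
    boxes = np.stack([cx - ws / 2, cy - hs / 2, cx + ws / 2, cy + hs / 2], axis=-1)
    return boxes.reshape(-1, 4)

def pyramid_anchors(image_size, strides=(8, 16, 32), scale=4):
    """Build anchors for every pyramid level; anchor size scales with the level's stride."""
    h, w = image_size
    return {s: level_anchors(h // s, w // s, s, base_size=scale * s) for s in strides}
```

With these assumed settings, the stride-8 level produces 32-pixel anchors on a dense grid, while the stride-32 level produces 128-pixel anchors on a sparse one.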

[Figure: overall architecture]

Matting predicts a foreground probability $\bar{F}_p$ and a background probability $\bar{B}_p$ for each pixel $p$, which are combined to compute pixel-wise transparency.

The segmentation network uses an encoder-decoder backbone with two decoders, one predicting the foreground probability and the other the background probability. For solid regions (alpha = 0 or 1) the network predicts the exact alpha; for semi-transparent regions it predicts upper and lower bounds on alpha, and a weighted cross-entropy loss is used to tighten the interval.

The fusion network, composed of consecutive convolutional layers, predicts the mixing weights, focusing training on semi‑transparent regions where gradients are informative.
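One way to write the late-fusion step, in the spirit of the cited paper, is alpha = beta * F + (1 - beta) * (1 - B), where F and B are the per-pixel foreground and background probabilities from the two decoders and beta is the mixing weight from the fusion network. The sketch below assumes that form; the function and argument names are hypothetical, and NumPy arrays stand in for the network outputs.

```python
import numpy as np

def fuse_alpha(fg_prob, bg_prob, beta):
    """Blend per-pixel foreground/background probabilities into an alpha matte.

    Assumed form: alpha = beta * fg_prob + (1 - beta) * (1 - bg_prob)
    """
    fg_prob = np.asarray(fg_prob, dtype=np.float64)
    bg_prob = np.asarray(bg_prob, dtype=np.float64)
    beta = np.asarray(beta, dtype=np.float64)
    alpha = beta * fg_prob + (1.0 - beta) * (1.0 - bg_prob)
    # Guard against small numerical overshoot outside the valid alpha range.
    return np.clip(alpha, 0.0, 1.0)
```

Under this form, wherever the two decoders agree (F equals 1 - B, as in solid foreground or background), alpha does not depend on beta at all. The weight only matters where they disagree, which is consistent with training the fusion network mainly on semi-transparent regions.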

Productization

The research has been integrated into Alibaba's commercial products, enabling batch white-background generation for e-commerce items, portrait background replacement, and other use cases.

Reference: A Late Fusion CNN for Digital Matting (CVPR 2019)
