How Alibaba’s DAMO Lab Revolutionizes Image Cutout with AI‑Powered Matting

Alibaba's DAMO Academy presents an AI‑driven image cutout system that combines filtering, classification, detection, and advanced segmentation to automate high‑precision matting, improve design efficiency, and unlock new commercial opportunities across e‑commerce and media industries.

Alibaba Cloud Developer

Since the establishment of Alibaba’s DAMO Academy, the team has been exploring high‑end AI research, leading to a dedicated image cutout (matting) solution for the Alibaba Intelligent Design Lab’s “Luban” product.

Why develop a cutout algorithm?

Designers spend over two hours on a precise cutout of a single portrait, a labor‑intensive task that hampers efficiency. Automating this step with AI speeds up banner, poster, and venue image production, unifies visual style, and boosts conversion rates.

Industry demand and challenges

Image matting is valuable across e‑commerce, entertainment, education, and other verticals. Existing methods struggle with fine hair details and general‑scene robustness, prompting the need for a more generalized and high‑precision approach.

System architecture

The solution consists of four modules: filtering, classification, detection, and segmentation.

Filtering: Removes low‑quality images (dark, overexposed, blurry) using classification models and basic image algorithms.

Classification: Tailors models to product categories (e.g., cosmetics, 3C, toys) and scene types (human, animal) to improve segmentation.

Detection: Crops redundant elements such as logos or text before segmentation for higher accuracy.

Segmentation: Predicts a coarse mask first, then refines it into a fine mask, to reach hair‑level precision while keeping inference fast.
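The "basic image algorithms" part of the filtering module can be illustrated with a minimal sketch. It assumes a grayscale image with values in [0, 255], flags dark and overexposed images by mean brightness, and flags blur by the variance of a Laplacian response. The function names and all three thresholds are hypothetical and are not taken from the DAMO system, which also uses classification models for this step.

```python
import numpy as np

def quality_flags(gray, dark_thresh=40.0, bright_thresh=215.0, blur_thresh=50.0):
    """Return quality flags for a grayscale image with values in [0, 255].

    All thresholds are hypothetical defaults and would need tuning on real data.
    """
    gray = np.asarray(gray, dtype=np.float64)
    mean = gray.mean()
    # 4-neighbour Laplacian on interior pixels; a low variance suggests blur.
    lap = (gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:]
           - 4.0 * gray[1:-1, 1:-1])
    sharpness = lap.var()
    return {
        "dark": bool(mean < dark_thresh),
        "overexposed": bool(mean > bright_thresh),
        "blurry": bool(sharpness < blur_thresh),
    }

def passes_filter(gray, **thresholds):
    """True when none of the quality flags is raised."""
    return not any(quality_flags(gray, **thresholds).values())

# Usage: a high-contrast checkerboard passes, a flat dark image does not.
checker = (np.indices((8, 8)).sum(axis=0) % 2) * 255
print(passes_filter(checker))
print(quality_flags(np.full((8, 8), 10)))
```

Images that fail such a check would be dropped before classification, detection, and segmentation run.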

Precision improvements

Segmentation remains the weakest link. The team improved it with two changes: a dual‑decoder network that predicts foreground and background probabilities, and a regression‑based loss for semi‑transparent regions.

The transparency (alpha) of each pixel is computed from two predicted quantities: its foreground probability and its background probability.
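One form that is consistent with the per‑pixel mixing weight described under Network design is given here as an assumption, not as the article's original equation. The symbols are introduced for illustration: \alpha_p is the alpha of pixel p, F_p and B_p are the predicted foreground and background probabilities, and \beta_p is the mixing weight.

```latex
\alpha_p = \beta_p \, F_p + (1 - \beta_p)\,(1 - B_p),
\qquad
\frac{\partial \alpha_p}{\partial \beta_p} = F_p + B_p - 1
```

Under this form, the gradient with respect to the mixing weight vanishes wherever the two probabilities sum to one, which is the case for confidently opaque or confidently transparent pixels, and is non‑zero only where the two predictions disagree.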

Network design

The segmentation network uses an encoder‑decoder backbone with two decoders that output foreground and background probabilities. For fully opaque or transparent pixels, the network predicts the exact alpha value; for semi‑transparent pixels it predicts upper and lower bounds, guided by a weighted cross‑entropy loss.
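A weighted cross‑entropy of this kind can be sketched as follows. The labeling is an assumption made for illustration: the foreground decoder is trained to treat semi‑transparent pixels as foreground, so its output acts as an upper bound on alpha, and the background decoder is trained to treat them as background, so one minus its output acts as a lower bound. The function names and the `semi_weight` value are hypothetical; the article does not give the actual weights.

```python
import numpy as np

def decoder_targets(alpha, semi_weight=0.5):
    """Build binary targets and per-pixel weights from a ground-truth alpha map.

    Assumed labeling (not confirmed by the article): semi-transparent pixels
    count as foreground for the foreground decoder and as background for the
    background decoder. semi_weight is a hypothetical down-weighting factor.
    """
    alpha = np.asarray(alpha, dtype=np.float64)
    semi = (alpha > 0.0) & (alpha < 1.0)
    fg_target = (alpha > 0.0).astype(np.float64)
    bg_target = (alpha < 1.0).astype(np.float64)
    weight = np.where(semi, semi_weight, 1.0)
    return fg_target, bg_target, weight

def weighted_bce(prob, target, weight, eps=1e-7):
    """Per-pixel binary cross-entropy, averaged with the given weights."""
    prob = np.clip(np.asarray(prob, dtype=np.float64), eps, 1.0 - eps)
    ce = -(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))
    return float((weight * ce).sum() / weight.sum())

# Usage: one transparent, one semi-transparent, and one opaque pixel.
alpha = np.array([0.0, 0.3, 1.0])
fg_t, bg_t, w = decoder_targets(alpha)
loss_fg = weighted_bce(np.array([0.1, 0.9, 0.95]), fg_t, w)
loss_bg = weighted_bce(np.array([0.9, 0.8, 0.05]), bg_t, w)
print(loss_fg, loss_bg)
```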

The fusion network, composed of several consecutive convolutional layers, predicts the mixing weight for each pixel, focusing training on semi‑transparent regions where gradients are non‑zero.
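The fusion step can be sketched numerically under one assumption: that the mixing weight blends the foreground probability with one minus the background probability. The function names and the symbol `beta` are hypothetical and stand in for the fusion network's per‑pixel output.

```python
import numpy as np

def fuse_alpha(fg_prob, bg_prob, beta):
    """Blend the two decoder outputs into alpha with a per-pixel weight beta.

    Assumed form: alpha = beta * F + (1 - beta) * (1 - B).
    """
    fg_prob = np.asarray(fg_prob, dtype=np.float64)
    bg_prob = np.asarray(bg_prob, dtype=np.float64)
    beta = np.asarray(beta, dtype=np.float64)
    return beta * fg_prob + (1.0 - beta) * (1.0 - bg_prob)

def grad_wrt_beta(fg_prob, bg_prob):
    """Derivative of the fused alpha with respect to beta: F + B - 1."""
    return (np.asarray(fg_prob, dtype=np.float64)
            + np.asarray(bg_prob, dtype=np.float64) - 1.0)

# Usage: an opaque pixel (F=1, B=0) and a semi-transparent one (F=0.8, B=0.4).
fg = np.array([1.0, 0.8])
bg = np.array([0.0, 0.4])
print(fuse_alpha(fg, bg, beta=0.5))
print(grad_wrt_beta(fg, bg))
```

In this sketch the opaque pixel yields the same alpha for any `beta` and a zero gradient, while the semi‑transparent pixel yields a non‑zero gradient, which matches the observation that training of the fusion network concentrates on semi‑transparent regions.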

Productization

The technology powers multiple Alibaba products, including batch white‑background generation, portrait and animal cutout, and upcoming cartoon, fashion, and panoramic cutout features.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
