
Precise and Fast Object Segmentation Algorithms – Talk by Ren Haibing (Youku Cognitive Lab)

Ren Haibing’s Youku Cognitive Lab talk reviews the motivation behind object segmentation, explains semantic and instance segmentation, presents UNet‑based and category‑agnostic methods—including fast video segmentation driven by motion cues—reports high‑IoU results, and outlines future research directions in edge‑aware, label‑free, and non‑online video segmentation.

Youku Technology

This document is a written version of Ren Haibing’s presentation at the Youku Technology Salon – Youku Cognitive Lab session, titled “Precise and Fast Object Segmentation Algorithms”. The talk covers the motivation, basic concepts, research progress, and future directions of object segmentation in computer vision.

Motivation and Background: Object segmentation is a classic problem in computer vision (CV). With the rapid development of deep learning, segmentation research has made significant advances. The speaker emphasizes the importance of high‑quality data (e.g., ImageNet, COCO, Pascal VOC) and the challenges of obtaining precise annotations, which can take 10–20 minutes per object.

Basic Concepts: Semantic segmentation assigns a class label to each pixel (e.g., the 20 object classes of Pascal VOC). Instance segmentation further distinguishes individual object instances. Common evaluation metrics include Intersection‑over‑Union (IoU) and Average Precision (AP). State‑of‑the‑art semantic models such as FCN and the DeepLab series (V1–V3+) are mentioned; DeepLab‑V3+ produces accurate edge predictions but cannot discriminate between instances.
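IoU, the metric used throughout the talk, reduces to a few lines on binary masks. A minimal sketch (the function name `iou` is illustrative, not from the talk):

```python
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-Union between two binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union > 0 else 1.0  # two empty masks agree perfectly

# Two 4x4 masks whose 3x3 foregrounds overlap in a 2x2 region
pred = np.zeros((4, 4), dtype=bool); pred[0:3, 0:3] = True  # 9 pixels
gt = np.zeros((4, 4), dtype=bool); gt[1:4, 1:4] = True      # 9 pixels
print(iou(pred, gt))  # 4 / 14 ≈ 0.2857
```

AP then aggregates detections ranked by confidence, counting a prediction as correct when its IoU with a ground-truth instance exceeds a threshold (COCO averages over thresholds 0.5–0.95).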

Image‑Level Object Segmentation: The team first focuses on category‑related segmentation using UNet‑based architectures, leveraging skip connections and feature concatenation to combine high‑level (global) and low‑level (local) features. They discuss the limitations of purely machine‑learning‑driven approaches, especially the poor generalization of local features to unseen object categories.
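The skip-connection idea can be shown with plain arrays: the coarse decoder map is upsampled back to the encoder's resolution, then the two are stacked along the channel axis so later layers see global and local features together. A minimal numpy sketch with illustrative names (a real UNet uses learned transposed convolutions rather than nearest-neighbour upsampling):

```python
import numpy as np

def upsample2x(x: np.ndarray) -> np.ndarray:
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def skip_concat(decoder_feat: np.ndarray, encoder_feat: np.ndarray) -> np.ndarray:
    """UNet-style skip connection: upsample the decoder map and
    concatenate it with the matching encoder map along channels."""
    up = upsample2x(decoder_feat)
    assert up.shape[1:] == encoder_feat.shape[1:], "spatial sizes must match"
    return np.concatenate([up, encoder_feat], axis=0)

decoder = np.random.rand(64, 8, 8)    # coarse, semantic (global) features
encoder = np.random.rand(32, 16, 16)  # fine, local features from the matching encoder level
fused = skip_concat(decoder, encoder)
print(fused.shape)  # (96, 16, 16)
```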

Category‑Agnostic Segmentation: To handle objects without prior training data, the speaker introduces a category‑agnostic approach inspired by Mask R‑CNN and MaskLab. By providing a user‑drawn bounding box (or minimal interactive cues such as points or lines), the algorithm can segment previously unseen categories (e.g., tigers) using strong semantic constraints and motion information.
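The talk does not detail how the user cue enters the network; one common approach in interactive segmentation is to rasterize the box into a binary map and stack it with the RGB channels as an extra network input. A hedged sketch under that assumption (all names here are illustrative):

```python
import numpy as np

def bbox_channel(shape: tuple, box: tuple) -> np.ndarray:
    """Rasterize a user-drawn box (x0, y0, x1, y1) into a binary cue map
    with the same spatial size as the image."""
    h, w = shape
    x0, y0, x1, y1 = box
    cue = np.zeros((h, w), dtype=np.float32)
    cue[y0:y1, x0:x1] = 1.0
    return cue

image = np.random.rand(3, 32, 32).astype(np.float32)  # C, H, W
cue = bbox_channel((32, 32), box=(4, 6, 20, 28))
net_input = np.concatenate([image, cue[None]], axis=0)  # 4-channel input
print(net_input.shape)  # (4, 32, 32)
```

Because the cue channel carries the "what to segment" signal, the network itself needs no category label, which is what makes the method applicable to unseen classes.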

Video Object Segmentation: The presentation describes a fast video segmentation pipeline that uses the first‑frame annotation (a box or points) and motion cues to propagate masks across frames. The method combines a segmentation network, a deep‑learning‑based tracker, and mask propagation/refinement modules, aiming to avoid costly online fine‑tuning.
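The propagation step can be illustrated in its simplest form: shift the previous frame's mask by the motion estimated for the object, then hand the shifted mask to a refinement module. A toy sketch assuming pure translation (a real pipeline would warp by dense optical flow and refine with a network; `propagate_mask` is an illustrative name):

```python
import numpy as np

def propagate_mask(mask: np.ndarray, flow: tuple) -> np.ndarray:
    """Propagate a binary mask to the next frame with a (dx, dy) translation,
    a stand-in for tracker/optical-flow motion cues."""
    dx, dy = flow
    h, w = mask.shape
    out = np.zeros_like(mask)
    ys, xs = np.nonzero(mask)
    ys2, xs2 = ys + dy, xs + dx
    keep = (ys2 >= 0) & (ys2 < h) & (xs2 >= 0) & (xs2 < w)  # drop pixels leaving the frame
    out[ys2[keep], xs2[keep]] = 1
    return out

first_frame_mask = np.zeros((8, 8), dtype=np.uint8)
first_frame_mask[2:4, 2:4] = 1            # annotated object in frame 0
next_mask = propagate_mask(first_frame_mask, flow=(1, 2))  # object moved right 1, down 2
print(np.nonzero(next_mask))              # foreground now at rows 4-5, cols 3-4
```

Because propagation reuses the previous mask instead of re-segmenting from scratch, no per-video online fine-tuning is needed, which is the source of the pipeline's speed.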

Experimental Results: On a custom human dataset, the proposed method achieves IoU above 95% (96–97% on average). Multi‑class experiments show competitive performance on COCO‑style data, though label inconsistencies in the data remain a challenge. The team also reports successful instance‑level segmentation of people, motorcycles, and other objects with minimal user interaction.

Future Work: The roadmap includes (1) a precise edge‑aware segmentation algorithm constrained by semantic understanding, (2) a category‑agnostic video segmentation method that minimizes manual labeling, and (3) a fast, non‑online‑learning video segmentation strategy leveraging motion cues, mask propagation, and refinement.

The talk concludes with acknowledgments and an invitation to participate in the ongoing Youku video enhancement and super‑resolution competition.

Tags: computer vision, AI, deep learning, video segmentation, category-agnostic, object segmentation