Artificial Intelligence 10 min read

How Collaborative Learning Boosts Co‑Saliency Object Detection

This article introduces a collaborative‑learning based co‑saliency object detection algorithm that incorporates class‑conditioned information during training. The approach markedly improves the model's ability to distinguish and detect common objects across image groups, and its effectiveness is demonstrated through extensive experiments and real‑world applications.

Kuaishou Audio & Video Technology

Background

Human vision can detect the most attractive object in a single image and extract co‑occurring objects from a set of images. In computer vision, the former is called salient object detection, while the latter is known as collaborative (co‑) saliency object detection. Existing co‑saliency methods struggle to differentiate objects of different categories.

Proposed Approach

We propose a collaborative‑learning based co‑saliency detection algorithm that injects class‑conditioned information during training, enabling the network to detect objects according to given class conditions and significantly enhancing discrimination and overall performance.

Application Scenarios

In multi‑user settings, the algorithm can discover common objects among images uploaded by different users, facilitating interest‑based social networking. For single‑user scenarios, it enables efficient batch processing such as bulk object cut‑out, improving image handling efficiency.

Algorithm Overview

The model consists of an encoder, a decoder, a global relation learning module, a semantic classification learning module, and a global collaborative learning module (used only during training). The encoder extracts feature maps, the global relation learning module derives a shared representation for an image group, and depthwise‑separable filtering combines this representation with the feature maps before decoding.
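The pipeline above can be sketched roughly as follows. This is a minimal illustrative skeleton, not the paper's implementation: the class name, layer sizes, and the tiny encoder/decoder are assumptions standing in for a real backbone, and the group consensus is reduced to a simple mean for clarity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoSaliencyNet(nn.Module):
    """Toy sketch: encoder -> group consensus -> depthwise filtering -> decoder."""

    def __init__(self, channels=64):
        super().__init__()
        # Stand-ins for a real backbone encoder and saliency decoder.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 1, 1),
        )

    def forward(self, images):                  # images: (N, 3, H, W), one group
        feats = self.encoder(images)            # per-image feature maps (N, C, H, W)
        # Group-wise shared representation (simplified here to a global mean).
        consensus = feats.mean(dim=(0, 2, 3))   # (C,)
        # Depthwise 1x1 filtering: each channel is re-weighted by the
        # group-wise importance encoded in the consensus vector.
        kernel = consensus.view(-1, 1, 1, 1)    # (C, 1, 1, 1)
        fused = F.conv2d(feats, kernel, groups=feats.size(1))
        return self.decoder(fused)              # saliency logits (N, 1, H, W)

net = CoSaliencyNet()
out = net(torch.randn(4, 3, 32, 32))
print(out.shape)  # torch.Size([4, 1, 32, 32])
```

The key structural point is that the consensus is computed once per image group and then injected into every image's features, so each prediction is conditioned on what the whole group has in common.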

Global Relation Learning Module

This module captures common features across the entire image set. It processes feature maps with a convolution, computes a pairwise relation matrix, aggregates via max and mean operations to form a global relation map, multiplies it with the original feature map, and averages to obtain the group‑wise common representation.
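The relation computation described above can be sketched as below. This is a hedged approximation, not the paper's exact module: the softmax normalization and the way max and mean are fused are assumptions made for a compact, runnable example.

```python
import torch

def global_relation(feats: torch.Tensor) -> torch.Tensor:
    """feats: (N, C, H, W) for one image group -> (C,) group representation."""
    n, c, h, w = feats.shape
    # Flatten all pixels of all images in the group into one set of C-dim vectors.
    x = feats.flatten(2).permute(0, 2, 1).reshape(n * h * w, c)   # (P, C)
    affinity = x @ x.t()                                          # pairwise relation matrix (P, P)
    # Aggregate each pixel's relations to all other pixels via max and mean.
    rel = affinity.max(dim=1).values + affinity.mean(dim=1)       # (P,)
    # Normalize into a global relation map and weight the original features.
    attn = torch.softmax(rel, dim=0).view(n, 1, h, w)
    weighted = feats * attn
    # Average (here: attention-weighted sum) into one group-wise representation.
    return weighted.sum(dim=(0, 2, 3))                            # (C,)

consensus = global_relation(torch.randn(4, 64, 16, 16))
print(consensus.shape)  # torch.Size([64])
```

Because the affinity matrix relates every pixel to every pixel across the whole group, pixels belonging to the co-occurring object tend to receive high aggregated scores and dominate the final representation.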

Semantic Classification Learning Module

Using image‑level class labels, this module assists the global relation learning module by supervising the extracted group representation through a convolutional network followed by a fully‑connected layer.
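A minimal sketch of such an auxiliary head is shown below; the layer sizes and class count are assumptions, and the 1x1 convolution stands in for whatever convolutional stack the actual module uses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticHead(nn.Module):
    """Auxiliary classifier supervising the group representation (illustrative)."""

    def __init__(self, channels=64, num_classes=80):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 1)  # small conv on the representation
        self.fc = nn.Linear(channels, num_classes)    # fully-connected class logits

    def forward(self, consensus):                     # consensus: (B, C) group vectors
        x = self.conv(consensus.view(-1, consensus.size(1), 1, 1))
        return self.fc(x.flatten(1))                  # (B, num_classes)

head = SemanticHead()
logits = head(torch.randn(8, 64))
labels = torch.randint(0, 80, (8,))
# Cross-entropy against image-level class labels; added to the main
# saliency loss so the consensus stays semantically meaningful.
aux_loss = F.cross_entropy(logits, labels)
print(logits.shape)  # torch.Size([8, 80])
```

The head is only a training-time aid: the classification loss forces the group representation to encode the category of the common object, which in turn sharpens the relation module that produced it.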

Global Collaborative Learning Module

Through contrastive learning, this module ensures that the extracted group representation is unique and correct for its image set while being absent for other sets, thereby enlarging inter‑class distances and improving discrimination.

Experimental Results

Extensive experiments on three datasets show that each of the three modules contributes positively, and their combination yields substantial performance gains. Compared with prior methods, our approach achieves superior results, especially on the challenging CoCA dataset.

Related Links

Paper: https://arxiv.org/pdf/2104.01108.pdf
Code: https://github.com/fanq15/GCoNet

Tags: computer vision, deep learning, image segmentation, collaborative learning, co-saliency detection