
Ambiguity-Resistant Semi-supervised Learning (ARSL) for Single-stage Object Detection

ARSL, an ambiguity‑resistant semi‑supervised learning framework for single‑stage object detection, introduces Joint‑Confidence Estimation and Task‑Separation Assignment to resolve selection and assignment ambiguities in pseudo‑labels, thereby markedly improving pseudo‑label quality and achieving state‑of‑the‑art AP gains on COCO benchmarks.

Baidu Tech Salon

The paper proposes an Ambiguity-Resistant Semi-supervised Learning (ARSL) algorithm for single‑stage semi‑supervised object detection, introducing two generic modules: Joint‑Confidence Estimation (JCE) and Task‑Separation Assignment (TSA). JCE jointly evaluates the quality of pseudo‑labels by combining classification confidence and localization quality, while TSA partitions samples into positive, negative, and ambiguous candidates based on the teacher model’s joint confidence, further selecting potential positives for classification and localization tasks.

Background and Motivation : Deep‑learning‑based object detectors usually require large annotated datasets. To reduce labeling cost, semi‑supervised object detection (SSOD) combines a small set of labeled data with abundant unlabeled data, typically through a Mean‑Teacher framework with pseudo‑labeling. Under this pipeline, however, single‑stage detectors (e.g., FCOS) improve far less than two‑stage detectors (e.g., Faster R‑CNN).
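
As a rough illustration of the Mean‑Teacher idea mentioned above: the teacher is commonly kept as an exponential moving average (EMA) of the student's weights, and its predictions on unlabeled images serve as pseudo‑labels. The sketch below uses plain Python dicts as a stand‑in for real model weights; the function name `ema_update` and the decay values are illustrative assumptions, not details taken from the ARSL paper.

```python
def ema_update(teacher, student, decay=0.999):
    """Return new teacher weights as an EMA of the student weights.

    `teacher` and `student` map parameter names to floats (a stand-in
    for real tensors). `decay` is an assumed, illustrative value.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


# Hypothetical weights, for illustration only.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 2.0, "b": 1.0}
teacher = ema_update(teacher, student, decay=0.9)
# teacher["w"] is now 0.9 * 1.0 + 0.1 * 2.0
```

A decay close to 1 makes the teacher change slowly, which tends to stabilize the pseudo‑labels it produces.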

What Limits Single‑stage Detectors in SSOD? Quantitative analysis reveals two major ambiguities in pseudo‑labels: Selection Ambiguity (mismatch between classification confidence and localization quality) and Assignment Ambiguity (incorrect sample‑to‑box assignments). These ambiguities are more severe in single‑stage detectors because they rely heavily on dense predictions.
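
Selection ambiguity can be seen in a toy case: the candidate with the highest classification score is not necessarily the best‑localized one. The sketch below is a minimal illustration with made‑up boxes and scores; `ground_truth` and `candidates` are hypothetical values, and on real unlabeled data no ground truth is available, which is exactly why a predicted localization quality is needed.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical object and two hypothetical pseudo-label candidates.
ground_truth = (10, 10, 50, 50)
candidates = [
    {"box": (25, 25, 65, 65), "cls_score": 0.90},  # confident, poorly localized
    {"box": (11, 11, 50, 50), "cls_score": 0.60},  # less confident, tight box
]

best_by_score = max(candidates, key=lambda c: c["cls_score"])
best_by_iou = max(candidates, key=lambda c: iou(c["box"], ground_truth))
# The two selections disagree: filtering by class score alone keeps the looser box.
```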

Algorithm Overview : ARSL addresses the above issues with:

Joint‑Confidence Estimation (JCE) : A dual‑branch network predicts classification scores and localization quality; their product serves as a joint confidence score. For labeled data, IoU‑based soft labels are used for joint training; for unlabeled data, the teacher’s maximum joint confidence is directly employed.

Task‑Separation Assignment (TSA) : Instead of box‑based assignment, TSA uses the teacher’s joint confidence to split samples into negative, positive, and ambiguous groups via a double‑threshold strategy. Ambiguous samples are further filtered for each task: all are used for classification consistency learning, while only those with high similarity to true positives (in class, geometry, and location) are kept for localization training.
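
The two modules can be sketched together in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the maximum joint confidence is assumed to be taken over classes, the threshold values `low` and `high` are placeholders, the handling of scores exactly at a threshold is arbitrary, and the function names and teacher outputs are hypothetical. The task‑specific filtering of ambiguous samples is not modeled here.

```python
def max_joint_confidence(cls_scores, loc_quality):
    """Joint confidence of one dense prediction.

    Multiplies each class score by the predicted localization quality
    and returns (best_class_index, best_joint_score).
    """
    joint = [s * loc_quality for s in cls_scores]
    best = max(range(len(joint)), key=joint.__getitem__)
    return best, joint[best]


def split_samples(joint_scores, low=0.1, high=0.6):
    """Double-threshold split into negative / ambiguous / positive indices.

    `low` and `high` are assumed, illustrative thresholds.
    """
    groups = {"negative": [], "ambiguous": [], "positive": []}
    for idx, score in enumerate(joint_scores):
        if score >= high:
            groups["positive"].append(idx)
        elif score < low:
            groups["negative"].append(idx)
        else:
            groups["ambiguous"].append(idx)
    return groups


# Hypothetical teacher outputs for four dense samples:
# (per-class scores, predicted localization quality)
teacher_outputs = [
    ([0.90, 0.05], 0.30),  # confident class, loose box
    ([0.10, 0.70], 0.90),  # modest class, tight box
    ([0.04, 0.02], 0.50),  # background-like
    ([0.50, 0.20], 0.80),  # in between
]
joint_scores = [max_joint_confidence(c, q)[1] for c, q in teacher_outputs]
groups = split_samples(joint_scores)
```

In this toy input, the sample with the highest class score (0.90) lands in the ambiguous group because its localization quality is low, while the tighter but less confident prediction becomes the only positive.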

Experimental Results : On COCO‑Standard splits (1%, 2%, 5%, 10% labeled data) ARSL consistently outperforms state‑of‑the‑art SSOD methods, with larger gains when large‑scale jittering is added. On COCO‑Full, ARSL achieves a more significant improvement within a shorter training schedule.

Ablation Studies : Under the basic semi‑supervised framework, the FCOS baseline gains only 4.7 AP points. ARSL adds a further 6.2 AP points on top of that, reaching 36.9% AP; JCE accounts for about 4.0 points of this gain and TSA for about 2.2. Further analysis confirms that JCE selects higher‑quality pseudo‑labels and that TSA substantially increases true positives while reducing false positives.

Links to the paper (https://arxiv.org/abs/2303.14960) and code repository (https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/semi_det) are provided for reproducibility.
