Can Label Over‑Smooth (LOS) Boost Long‑Tail Classification? New Metrics and Benchmarks Revealed

This article analyzes classifier re‑training for long‑tailed visual recognition, introduces two novel evaluation metrics—Logits Magnitude and Regularized Standard Deviation—proposes the Label Over‑Smooth (LOS) method, and demonstrates its state‑of‑the‑art performance across CIFAR‑100‑LT, ImageNet‑LT, and iNaturalist2018 datasets.


01. Introduction

Real‑world data often follows a long‑tail distribution: a few head classes dominate the sample count while many tail classes have very few examples. Traditional classifiers excel on balanced data but tend to ignore minority classes in long‑tail scenarios. Decoupled training separates feature learning from classifier re‑training, yet existing works improve both stages simultaneously, making it hard to isolate the classifier’s contribution. Recent studies show that a simple regularizer can yield robust features, and a well‑trained classifier alone can surpass many complex pipelines. Therefore, a unified benchmark for the classifier re‑training stage is needed to identify factors that truly improve performance.

02. Benchmark Construction and Exploration

Using a unified feature representation, we re‑implemented several representative re‑training strategies (re‑weighting, re‑sampling, parameter regularization, etc.) and expressed them with a common formula. The resulting performance comparison is shown in Figure 2.
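The paper's exact common formula is not reproduced here; purely as an illustration of the idea (the function name and the specific weighting and offset choices below are assumptions of mine, not the paper's formulation), several re-training strategies can be written as a per-class weighted, logit-adjusted cross-entropy:

```python
# Illustrative sketch only: a generic per-class weighted, logit-adjusted
# cross-entropy that subsumes several classifier re-training strategies.
import torch
import torch.nn.functional as F

def retraining_loss(logits, targets, class_counts, tau=1.0, beta=0.0):
    """logits: (B, K); targets: (B,); class_counts: (K,) training-set class frequencies."""
    prior = class_counts / class_counts.sum()
    adjusted = logits + tau * prior.log()      # logit adjustment (tau = 0 disables it)
    weights = (1.0 / class_counts) ** beta     # inverse-frequency re-weighting (beta = 0 disables it)
    weights = weights / weights.mean()         # normalize class weights to mean 1
    return F.cross_entropy(adjusted, targets, weight=weights)
```

Setting tau and beta to zero recovers plain cross-entropy; nonzero values recover logit adjustment [5] and class re-weighting, respectively.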

We also formalized two new evaluation metrics:

Logits Magnitude (LoMa): for each class, the difference between the mean logit of the true class and the mean logit of all other classes.

Regularized Standard Deviation (RSD): the standard deviation of the logits divided by the corresponding LoMa.

Experiments reveal that a more balanced LoMa across classes correlates with higher accuracy, while RSD remains relatively stable and can be treated as an invariant during analysis.
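Both metrics can be computed directly from a matrix of validation logits. The sketch below is a minimal reading of the definitions above, taking per-class statistics over the validation samples of that class; the paper's exact aggregation may differ.

```python
import torch

def loma_and_rsd(logits, labels, num_classes):
    """logits: (N, K) validation logits; labels: (N,) ground-truth class indices."""
    loma = torch.zeros(num_classes)
    rsd = torch.zeros(num_classes)
    for c in range(num_classes):
        z = logits[labels == c]                  # logits of samples whose true class is c
        if z.numel() == 0:
            continue
        mean_true = z[:, c].mean()               # mean logit of the true class
        mask = torch.arange(num_classes) != c
        mean_other = z[:, mask].mean()           # mean logit of all other classes
        loma[c] = mean_true - mean_other         # Logits Magnitude for class c
        rsd[c] = z.std() / loma[c]               # Regularized Standard Deviation for class c
    return loma, rsd
```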

03. The LOS Method

Motivated by the observation that imbalanced LoMa introduces class‑dependent noise, we propose Label Over‑Smooth (LOS). LOS transforms the original one‑hot label into a softened continuous distribution, reducing the dominance of head classes. For a sample with ground‑truth class y, the smoothed target is

\tilde{y}_i = 1 - \epsilon \ \text{if}\ i = y, \qquad \tilde{y}_i = \frac{\epsilon}{K - 1} \ \text{otherwise},

where K denotes the number of classes and \epsilon controls the total probability assigned to non‑ground‑truth classes. Unlike traditional label smoothing (typically \epsilon=0.2), LOS pushes \epsilon up to 0.98 or 0.99, substantially diminishing the bias introduced by noisy logits.
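In code, LOS reduces to cross‑entropy against this heavily smoothed target. A minimal sketch, with function and argument names of my own choosing:

```python
import torch
import torch.nn.functional as F

def los_loss(logits, targets, epsilon=0.98):
    """Label Over-Smooth: cross-entropy against heavily smoothed soft targets.
    logits: (B, K); targets: (B,) ground-truth class indices."""
    num_classes = logits.size(1)
    # 1 - epsilon on the ground-truth class, epsilon spread over the other K - 1 classes
    soft = torch.full_like(logits, epsilon / (num_classes - 1))
    soft.scatter_(1, targets.unsqueeze(1), 1.0 - epsilon)
    return -(soft * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
```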

Proposition 3 shows that, while further balancing the labels has limited effect on already balanced datasets, reducing class‑specific noise is crucial for long‑tailed data. LOS therefore directly lowers the Logits Magnitude, mitigating this bias and improving predictions.

04. Experimental Results

We evaluated LOS on three long‑tailed benchmarks:

CIFAR‑100‑LT (ResNet‑34 backbone, smoothing factor 0.98)

ImageNet‑LT (ResNeXt‑50 backbone, smoothing factor 0.99)

iNaturalist2018 (ResNet‑50 backbone, smoothing factor 0.99)

In all cases, LOS achieved new state‑of‑the‑art accuracy, outperforming prior methods cited in [2]–[5]. Moreover, LOS can be plugged into existing pipelines; combined with self‑supervised pre‑training (PaCo, BCL, GML, ProCo), data augmentation (OPeN, NCL), multi‑expert ensembles (RIDE), and transfer learning (SSD), it further boosts performance, as illustrated in Figure 4.
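A practical note: PyTorch's built‑in label_smoothing argument offers a close approximation for plugging heavy smoothing into an existing re‑training loop, though it distributes the smoothing mass over all K classes, including the ground truth, which differs slightly from the target defined above.

```python
import torch
import torch.nn.functional as F

# Approximating LOS with PyTorch's built-in smoothing. Note: label_smoothing
# spreads the mass over all K classes (including the true one), which is a
# slightly different target than the epsilon / (K - 1) formulation above.
logits = torch.randn(8, 100)             # e.g. a batch of 8 samples, 100 classes (CIFAR-100-LT)
targets = torch.randint(0, 100, (8,))
loss = F.cross_entropy(logits, targets, label_smoothing=0.98)
```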

05. Conclusion

We provided a comprehensive analysis of classifier re‑training for long‑tailed recognition, introduced Logits Magnitude and Regularized Standard Deviation as insightful metrics, and proposed the Label Over‑Smooth (LOS) regularization. Extensive experiments confirm that LOS consistently delivers superior accuracy across diverse long‑tail datasets and can serve as a plug‑and‑play module for existing re‑training frameworks.

References

[1] Decoupling representation and classifier for long‑tailed recognition. ICLR 2020.

[2] Long‑tailed recognition via weight balancing. CVPR 2022.

[3] Learning imbalanced datasets with label‑distribution‑aware margin loss. NeurIPS 2019.

[4] Balanced meta‑softmax for long‑tailed visual recognition. NeurIPS 2020.

[5] Long‑tail learning via logit adjustment. ICLR 2021.

[6] Parametric contrastive learning. ICCV 2021.

[7] Balanced contrastive learning for long‑tailed visual recognition. CVPR 2022.

[8] Long‑tailed recognition by mutual information maximization. ICML 2023.

[9] Probabilistic contrastive learning for long‑tailed visual recognition. TPAMI 2024.

[10] Pure noise to the rescue of insufficient data. ICML 2022.

[11] Nested collaborative learning for long‑tailed visual recognition. CVPR 2022.

[12] Long‑tailed recognition by routing diverse distribution‑aware experts. ICLR 2021.

[13] Self‑supervision to distillation for long‑tailed visual recognition. ICCV 2021.

Tags: machine learning, benchmark, label smoothing, logits magnitude, long-tailed classification, regularized standard deviation
Written by

Bilibili Tech

Provides introductions and tutorials on Bilibili-related technologies.
