ArcCSE: Angular Margin Contrastive Learning for Self‑Supervised Text Representation
ArcCSE introduces an angular-margin contrastive loss and both pairwise (dropout-augmented) and triple-wise (span-masked) relationship modeling to self-supervise text embeddings, yielding tighter decision boundaries, better alignment and uniformity, and superior performance on unsupervised STS, SentEval, and Alibaba's retrieval and recommendation systems.
Learning high-quality text representations is a fundamental NLP task, but pretrained models such as BERT often yield suboptimal results on semantic similarity evaluations when used without fine‑tuning.
This paper proposes ArcCSE, a novel self‑supervised framework that introduces an angular margin into contrastive learning and explicitly models semantic partial‑order relations among sentences.
ArcCSE comprises pairwise and triple-wise relationship modeling. Positive pairs are obtained via dropout-based augmentation, while triplets are constructed by masking spans of varying length in the same sentence, so that a lightly masked variant stays semantically closer to the original than a heavily masked one, yielding an entailment-like partial order. A new angular margin contrastive loss (ArcCon) replaces the conventional NT-Xent loss, providing a tighter decision boundary and greater robustness to noise.
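As a concrete illustration, the angular-margin idea can be sketched in plain Python: compared with NT-Xent, the angle between the anchor and its positive is widened by a margin before scoring, which penalizes the positive pair more and tightens the decision boundary. This is a minimal sketch under stated assumptions; the function names and the margin/temperature values are illustrative, not the paper's exact implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def arccon_loss(anchor, positive, negatives, margin=0.1, tau=0.05):
    """Sketch of an angular-margin contrastive (ArcCon-style) loss.

    The positive pair is scored as cos(theta + margin) instead of
    cos(theta), so the model must pull positives closer than plain
    NT-Xent would require; negatives are scored as usual.
    margin and tau here are illustrative hyperparameters.
    """
    # Clamp to avoid acos domain errors from floating-point drift.
    cos_pos = max(-1.0, min(1.0, cosine(anchor, positive)))
    theta_pos = math.acos(cos_pos)
    pos_term = math.exp(math.cos(theta_pos + margin) / tau)
    neg_terms = sum(math.exp(cosine(anchor, n) / tau) for n in negatives)
    return -math.log(pos_term / (pos_term + neg_terms))
```

For any margin > 0 the loss is strictly larger than its NT-Xent counterpart on the same embeddings (since cos(θ + m) < cos(θ) for θ in the usual range), which is exactly the stricter boundary the method exploits.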
Extensive experiments on unsupervised STS benchmarks and SentEval transfer tasks demonstrate that ArcCSE consistently surpasses SimCSE and other state-of-the-art self-supervised methods, achieving better alignment and uniformity (lower values on both metrics) and improving downstream classification performance.
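Alignment and uniformity are the standard embedding-quality metrics of Wang and Isola (2020): alignment measures how close positive pairs are, and uniformity measures how evenly embeddings spread on the unit sphere; lower is better for both. A minimal plain-Python sketch, assuming L2-normalized embeddings and the common settings alpha=2 and t=2:

```python
import math

def alignment(pairs, alpha=2):
    """Mean distance between positive-pair embeddings (lower is better)."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(x, y)) ** (alpha / 2)
        for x, y in pairs
    ) / len(pairs)

def uniformity(embeddings, t=2):
    """Log of the mean pairwise Gaussian potential over all distinct
    embedding pairs (lower is better: points spread more evenly)."""
    n = len(embeddings)
    total, count = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            sq_dist = sum(
                (a - b) ** 2
                for a, b in zip(embeddings[i], embeddings[j])
            )
            total += math.exp(-t * sq_dist)
            count += 1
    return math.log(total / count)
```

For example, an identical positive pair gives alignment 0, and two antipodal unit vectors give uniformity log(e^-8) = -8, the best possible value for two points.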
The method has been deployed in Alibaba’s content‑understanding platforms, enhancing retrieval and recommendation in scenarios such as Taobao Live and Xianyu.
DaTaobao Tech
Official account of DaTaobao Technology