How SpanProto Boosts Few-Shot NER Accuracy with a Two-Stage Span Approach

SpanProto is a two‑stage span‑based prototypical network for few‑shot named entity recognition. It first extracts candidate spans with a global boundary matrix, then classifies them with prototypical and margin learning, and it outperforms baseline methods on the Few‑NERD benchmark while using only a handful of labeled examples per class.

Alibaba Cloud Big Data AI Platform
SpanProto overview

Alibaba Cloud's Machine Learning Platform PAI, together with Prof. Gao Ming's team from East China Normal University and the DAMO Academy NLP team, presented the SpanProto algorithm for few‑shot named entity recognition (NER) at EMNLP 2022.

Background

Large pretrained language models have greatly advanced NLP tasks, but traditional NER still requires abundant annotated data. In many real‑world scenarios labeled data is scarce, and existing sequence‑labeling methods struggle with label dependencies and nested entities. SpanProto targets the N‑way K‑shot NER setting, in which each episode covers N entity classes with K annotated examples per class; a 2‑way 1‑shot task is shown below.

2‑way 1‑shot NER illustration

In this scenario each support set contains only one annotated example per class (e.g., PER and LOC), while the query set contains unseen instances.
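
To make the setting concrete, a 2‑way 1‑shot episode can be written out as plain data. The sentences, token indices, and field names below are invented for illustration; they are not taken from Few‑NERD or from the SpanProto code.

```python
# Hypothetical 2-way 1-shot episode; sentences and field names are illustrative only.
episode = {
    "classes": ["PER", "LOC"],
    "support": [
        {
            "tokens": ["Alice", "moved", "to", "Paris", "."],
            # Spans are (start, end, label) with inclusive token indices.
            "spans": [(0, 0, "PER"), (3, 3, "LOC")],
        },
    ],
    "query": [
        # Query sentences carry no labels; the model must predict them.
        {"tokens": ["Bob", "flew", "to", "New", "York", "."]},
    ],
}


def shots_per_class(episode):
    """Count how many annotated support spans each class has."""
    counts = {label: 0 for label in episode["classes"]}
    for sentence in episode["support"]:
        for _start, _end, label in sentence["spans"]:
            counts[label] += 1
    return counts


print(shots_per_class(episode))  # {'PER': 1, 'LOC': 1}
```

Two classes with one annotated span each is what makes this episode 2‑way 1‑shot.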

Algorithm Overview

SpanProto decomposes NER into two stages: Span Extraction and Mention Classification. The overall framework is illustrated below.

SpanProto model architecture

Span Extraction

SpanProto first employs a class‑agnostic span extractor, inspired by the Biaffine Decoder and Global Pointer, to predict a Global Boundary Matrix in which each cell (i, j) indicates whether the token interval from position i to position j forms an entity.

Global Boundary Matrix illustration
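
A minimal sketch of how candidate spans might be read off such a matrix. It assumes inclusive end indices, raw scores where a positive value means "entity", and a threshold of 0; these conventions and the example scores are assumptions for illustration, not details confirmed by the article.

```python
def decode_spans(boundary_scores, threshold=0.0):
    """Return (start, end) pairs whose score exceeds the threshold.

    boundary_scores[i][j] scores the interval of tokens i..j (inclusive).
    Only the upper triangle (i <= j) is meaningful, so the rest is skipped.
    """
    spans = []
    n = len(boundary_scores)
    for i in range(n):
        for j in range(i, n):
            if boundary_scores[i][j] > threshold:
                spans.append((i, j))
    return spans


tokens = ["Bob", "flew", "to", "New", "York", "."]
# Made-up scores: strongly negative everywhere except two intervals.
scores = [[-5.0] * len(tokens) for _ in tokens]
scores[0][0] = 3.2  # "Bob"
scores[3][4] = 2.7  # "New York"

for start, end in decode_spans(scores):
    print(start, end, " ".join(tokens[start:end + 1]))
# 0 0 Bob
# 3 4 New York
```

Because every cell is scored independently, overlapping or nested intervals can both be selected, which is what lets a span matrix represent nested entities.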

The extractor is trained with a span‑based cross‑entropy loss:

Span extraction loss formula
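
As a rough illustration of a span‑based cross‑entropy over the boundary matrix, the sketch below uses the multi‑label form associated with Global Pointer: log(1 + sum of exp(s) over non‑entity cells) + log(1 + sum of exp(-s) over entity cells). Whether SpanProto uses exactly this form is an assumption here; the paper is the authoritative source for the actual objective.

```python
import math


def span_loss(boundary_scores, gold_spans):
    """Global Pointer-style multi-label cross-entropy over the upper triangle.

    loss = log(1 + sum_neg exp(s)) + log(1 + sum_pos exp(-s))
    Assumed form; not confirmed to be SpanProto's exact objective.
    """
    gold = set(gold_spans)
    pos_sum, neg_sum = 0.0, 0.0
    n = len(boundary_scores)
    for i in range(n):
        for j in range(i, n):
            s = boundary_scores[i][j]
            if (i, j) in gold:
                pos_sum += math.exp(-s)  # penalize low scores on true spans
            else:
                neg_sum += math.exp(s)   # penalize high scores on non-spans
    return math.log1p(neg_sum) + math.log1p(pos_sum)


gold = [(0, 1)]
uninformed = [[0.0] * 3 for _ in range(3)]
confident = [[-10.0] * 3 for _ in range(3)]
confident[0][1] = 10.0

print(span_loss(uninformed, gold) > span_loss(confident, gold))  # True
```

The loss falls as gold cells receive high scores and all other cells receive low ones, without requiring a separate label per token.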

Mention Classification

For each extracted span, SpanProto applies prototypical learning, assigning the label whose prototype is closest in Euclidean distance. To mitigate false positives (extracted spans that have no appropriate label in the current episode), a margin learning objective pushes such span representations away from all class prototypes.
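
The nearest‑prototype rule can be sketched in a few lines. Following standard prototypical networks, each prototype here is the mean of the support span embeddings for its class; the 2‑dimensional vectors and function names are hypothetical, and how SpanProto actually builds span embeddings is not covered.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_prototypes(support):
    """support maps each label to a list of span embeddings."""
    return {label: mean_vector(vecs) for label, vecs in support.items()}


def classify(span_vec, prototypes):
    """Return the label of the nearest prototype and the distance to it."""
    label = min(prototypes, key=lambda k: euclidean(span_vec, prototypes[k]))
    return label, euclidean(span_vec, prototypes[label])


# Toy 1-shot support set with made-up embeddings.
support = {"PER": [[1.0, 0.0]], "LOC": [[0.0, 1.0]]}
prototypes = build_prototypes(support)

print(classify([0.9, 0.2], prototypes)[0])  # PER
```

With one shot per class the prototype is simply that single support embedding; with K shots it becomes their average.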

Margin learning for false positives
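
One way to express "push false positives away from every prototype" is a hinge penalty on each prototype that lies within a margin of the span. The hinge form, the margin value of 1.0, and the toy vectors are assumptions made for this sketch rather than the paper's stated formulation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def margin_loss(false_positive_vec, prototypes, margin=1.0):
    """Sum a hinge penalty over every prototype closer than `margin`."""
    return sum(
        max(0.0, margin - euclidean(false_positive_vec, proto))
        for proto in prototypes.values()
    )


prototypes = {"PER": [1.0, 0.0], "LOC": [0.0, 1.0]}
near = margin_loss([0.9, 0.1], prototypes)  # sits close to PER, so it is penalized
far = margin_loss([5.0, 5.0], prototypes)   # outside the margin of both prototypes

print(near > 0.0, far == 0.0)
```

Minimizing this term moves a false‑positive span until it is at least the margin away from all class prototypes, at which point it contributes no further loss.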

Overall Algorithm Flow

SpanProto workflow diagram

Algorithm Accuracy Evaluation

SpanProto was evaluated on the Few‑NERD benchmark, showing a clear accuracy improvement over baselines.

Few‑NERD evaluation results

Module‑wise analysis indicates that both Span Extraction and Mention Classification contribute positively to performance.

Ablation study results
Component contribution chart

The source code will be contributed to the open‑source EasyNLP framework, inviting NLP researchers and practitioners to use it.

EasyNLP repository: https://github.com/alibaba/EasyNLP

References

Jianing Wang, Chengyu Wang, Chuanqi Tan, Minghui Qiu, Songfang Huang, Jun Huang, Ming Gao. SpanProto: A Two‑stage Span‑based Prototypical Network For Few‑shot Named Entity Recognition. EMNLP 2022.

Chengyu Wang, Minghui Qiu, Taolin Zhang, Tingting Liu, Lei Li, Jianing Wang, Ming Wang, Jun Huang, Wei Lin. EasyNLP: A Comprehensive and Easy‑to‑use Toolkit for Natural Language Processing. EMNLP 2022 (accepted).

Juntao Yu, Bernd Bohnet, Massimo Poesio. Named Entity Recognition as Dependency Parsing. ACL 2020: 6470‑6476.

Ning Ding, Guangwei Xu, Yulin Chen, Xiaobin Wang, Xu Han, Pengjun Xie, Haitao Zheng, Zhiyuan Liu. Few‑NERD: A Few‑shot Named Entity Recognition Dataset. ACL/IJCNLP 2021: 3198‑3213.

GlobalPointer: Unified Approach for Nested and Flat NER. https://spaces.ac.cn/archives/8373
