How SpanProto Boosts Few-Shot NER Accuracy with a Two-Stage Span Approach

SpanProto is a two‑stage, span‑based prototypical network for few‑shot named entity recognition. It first extracts candidate spans with a global boundary matrix, then classifies them with prototypical learning and a margin‑based objective. On the Few‑NERD benchmark it improves accuracy over baselines while using only a few labeled examples per class.

Alibaba Cloud Big Data AI Platform

Dec 9, 2022

Alibaba Cloud's Machine Learning Platform PAI, together with Prof. Gao Ming's team from East China Normal University and the DAMO Academy NLP team, presented the SpanProto algorithm for few‑shot named entity recognition (NER) at EMNLP 2022.

Background

Large pretrained language models have greatly advanced NLP tasks, but traditional NER still requires abundant annotated data. In many real‑world scenarios labeled data are scarce, and existing sequence‑labeling methods struggle with label dependency and nested entities. SpanProto targets the N‑way K‑shot NER setting, in which each episode covers N entity classes with K annotated examples per class. A 2‑way 1‑shot task illustrates the setup.

In this scenario the support set contains only one annotated example for each of the two classes (e.g., PER and LOC), while the query set contains unseen instances whose entities the model must recognize.
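To make the episode structure concrete, here is a minimal sketch of a 2‑way 1‑shot episode in Python. The sentences, the dictionary layout, and the helper `shots_per_class` are illustrative assumptions, not the data format used by SpanProto or Few‑NERD.

```python
from collections import Counter

# Hypothetical 2-way 1-shot episode. Sentences and field names are made up.
# Each span is (start, end, label) with inclusive token indices (an assumption).
support_set = [
    {"tokens": ["Jobs", "was", "born", "in", "America"],
     "spans": [(0, 0, "PER"), (4, 4, "LOC")]},
]
query_set = [
    # Gold spans are kept here only for evaluation; the model does not see them.
    {"tokens": ["Marie", "Curie", "moved", "to", "Paris"],
     "spans": [(0, 1, "PER"), (4, 4, "LOC")]},
]

def shots_per_class(examples):
    """Count annotated mentions per entity class across a list of sentences."""
    return Counter(label for ex in examples for _, _, label in ex["spans"])

counts = shots_per_class(support_set)
n_way = len(counts)            # number of entity classes in the episode
k_shot = min(counts.values())  # annotated examples available per class
print(n_way, k_shot)
```

Here the support set yields two classes with one mention each, which is what "2‑way 1‑shot" means.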

Algorithm Overview

SpanProto decomposes NER into two stages: Span Extraction, which finds candidate entity spans without assigning a type, and Mention Classification, which assigns a type to each candidate.

Span Extraction

SpanProto first employs a class‑agnostic span extractor, inspired by the Biaffine Decoder and Global Pointer, to predict a Global Boundary Matrix in which each cell (i, j) indicates whether the token interval [i:j] forms an entity.
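Reading spans out of such a matrix can be sketched as follows. This is an illustration rather than the SpanProto implementation: the function `decode_spans`, the zero threshold, the inclusive end index, and the hand‑written scores are all assumptions.

```python
import numpy as np

def decode_spans(scores, threshold=0.0):
    """Return inclusive (start, end) pairs for cells scoring above the threshold."""
    n = scores.shape[0]
    upper = np.triu(np.ones((n, n), dtype=bool))  # a span needs start <= end
    hits = np.argwhere((scores > threshold) & upper)
    return [(int(i), int(j)) for i, j in hits]

tokens = ["New", "York", "University", "is", "large"]
scores = np.full((5, 5), -1.0)  # made-up scores; a real model would predict these
scores[0, 1] = 2.0              # "New York"
scores[0, 2] = 1.5              # "New York University", nested with the span above
scores[3, 1] = 5.0              # lower triangle (start > end), ignored by the mask

for start, end in decode_spans(scores):
    print(" ".join(tokens[start:end + 1]))
```

Because every (start, end) cell is scored independently, overlapping and nested spans can be extracted together, which a single tag per token cannot express.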

The extractor is trained with a span‑based cross‑entropy loss computed over the cells of the Global Boundary Matrix.
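The loss formula is not reproduced in this article. As a sketch only, the code below uses the multi‑label cross‑entropy form popularized by Global Pointer, log(1 + Σ exp(−s)) over entity cells plus log(1 + Σ exp(s)) over non‑entity cells. Treating this as the span‑based loss is an assumption, and the function name `span_boundary_loss` is hypothetical.

```python
import numpy as np

def span_boundary_loss(scores, gold):
    """Global Pointer style loss over the upper triangle of a boundary matrix.

    scores: (n, n) float array of span scores.
    gold:   (n, n) bool array, True where [i, j] is an annotated entity span.
    """
    n = scores.shape[0]
    valid = np.triu(np.ones((n, n), dtype=bool))  # only cells with start <= end
    pos = scores[gold & valid]    # scores of true entity spans
    neg = scores[~gold & valid]   # scores of all other candidate spans
    # logaddexp.reduce over [0, x1, x2, ...] equals log(1 + sum(exp(x))), stably.
    pos_term = np.logaddexp.reduce(np.concatenate(([0.0], -pos)))
    neg_term = np.logaddexp.reduce(np.concatenate(([0.0], neg)))
    return float(pos_term + neg_term)

gold = np.zeros((3, 3), dtype=bool)
gold[0, 1] = True
good = np.where(gold, 5.0, -5.0)  # high score on the entity, low elsewhere
bad = -good                       # the opposite assignment
print(span_boundary_loss(good, gold) < span_boundary_loss(bad, gold))
```

The loss shrinks as entity cells score higher and non‑entity cells score lower, which matches the zero threshold commonly used when decoding with this kind of objective.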

Mention Classification

For each extracted span, SpanProto applies prototypical learning, assigning the label whose prototype is closest in Euclidean distance. Some extracted spans are false positives, meaning they have no appropriate label in the current episode. To mitigate them, a margin learning objective pushes such span representations away from all class prototypes.
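A minimal sketch of this stage is shown below, using 2‑dimensional toy vectors in place of real span representations. The mean‑of‑support prototypes follow standard prototypical networks; the hinge form of the penalty, the rejection of distant spans at prediction time, and the names `build_prototypes`, `classify`, and `margin_penalty` are assumptions for illustration.

```python
import numpy as np

def build_prototypes(vectors, labels):
    """Average the support span vectors of each class into one prototype."""
    vectors = np.asarray(vectors, dtype=float)
    labels = np.asarray(labels)
    classes = sorted(set(labels.tolist()))
    protos = np.stack([vectors[labels == c].mean(axis=0) for c in classes])
    return classes, protos

def classify(span_vec, classes, protos, margin):
    """Nearest prototype by Euclidean distance; None if farther than the margin."""
    dists = np.linalg.norm(protos - np.asarray(span_vec, dtype=float), axis=1)
    nearest = int(np.argmin(dists))
    return classes[nearest] if dists[nearest] <= margin else None

def margin_penalty(span_vec, protos, margin):
    """Hinge penalty for a false-positive span lying within the margin of a prototype."""
    dists = np.linalg.norm(protos - np.asarray(span_vec, dtype=float), axis=1)
    return float(np.maximum(0.0, margin - dists).sum())

# Toy support set: one span vector per class (1-shot).
classes, protos = build_prototypes([[0.0, 0.0], [4.0, 0.0]], ["PER", "LOC"])
print(classify([0.5, 0.0], classes, protos, margin=3.0))    # close to the PER prototype
print(classify([2.0, 10.0], classes, protos, margin=3.0))   # far from both prototypes
print(margin_penalty([1.0, 0.0], protos, margin=3.0))
```

Minimizing the penalty during training moves false‑positive spans at least `margin` away from every prototype, so a distance check can then filter them out.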

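Overall Algorithm Flow

The two stages run in sequence: the span extractor proposes candidate spans from the Global Boundary Matrix, and the mention classifier assigns each candidate the label of its nearest prototype.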

Algorithm Accuracy Evaluation

SpanProto was evaluated on the Few‑NERD benchmark, showing a clear accuracy improvement over baselines.

Module‑wise analysis indicates that both Span Extraction and Mention Classification contribute positively to performance.

The source code will be contributed to the open‑source EasyNLP framework, and NLP researchers and practitioners are welcome to use it.

EasyNLP repository: https://github.com/alibaba/EasyNLP

References

Jianing Wang, Chengyu Wang, Chuanqi Tan, Minghui Qiu, Songfang Huang, Jun Huang, Ming Gao. SpanProto: A Two‑stage Span‑based Prototypical Network For Few‑shot Named Entity Recognition. EMNLP 2022.

Chengyu Wang, Minghui Qiu, Taolin Zhang, Tingting Liu, Lei Li, Jianing Wang, Ming Wang, Jun Huang, Wei Lin. EasyNLP: A Comprehensive and Easy‑to‑use Toolkit for Natural Language Processing. EMNLP 2022 (accepted).

Juntao Yu, Bernd Bohnet, Massimo Poesio. Named Entity Recognition as Dependency Parsing. ACL 2020: 6470‑6476.

Ning Ding, Guangwei Xu, Yulin Chen, Xiaobin Wang, Xu Han, Pengjun Xie, Haitao Zheng, Zhiyuan Liu. Few‑NERD: A Few‑shot Named Entity Recognition Dataset. ACL/IJCNLP 2021: 3198‑3213.

GlobalPointer: Unified Approach for Nested and Flat NER. https://spaces.ac.cn/archives/8373
