Discovering and Enhancing Robustness in Low‑Resource Information Extraction
This article examines the robustness of information extraction (IE) tasks such as named entity recognition (NER) and relation extraction, introduces the Entity Coverage Ratio (ECR) metric, analyzes why pretrained models like BERT "take shortcuts," and presents evaluation tools and training strategies, including mutual-information-based representation learning, negative training, and flooding, to improve model robustness across diverse scenarios.
1. Information Extraction Overview
Information extraction (IE) converts unstructured text into structured facts such as entities, attributes, and relations, forming the foundation for knowledge graphs. The two primary IE sub‑tasks are Named Entity Recognition (NER) and relation extraction.
2. Main Frameworks for Entity Recognition
Traditional machine‑learning approaches rely on handcrafted features (e.g., capitalization, prefixes) and algorithms like HMM/CRF. Deep‑learning models, especially pretrained language models, automatically learn semantic features, allowing researchers to focus on model design rather than feature engineering.
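To make the contrast concrete, here is a minimal sketch of the kind of handcrafted feature function a classic HMM/CRF tagger relies on. The feature names and the example sentence are illustrative, not taken from any specific toolkit:

```python
def token_features(tokens, i):
    """Handcrafted features for token i, in the style of classic CRF taggers:
    surface form, capitalization, affixes, and a one-token context window."""
    tok = tokens[i]
    return {
        "word.lower": tok.lower(),
        "is_capitalized": tok[:1].isupper(),
        "prefix3": tok[:3],
        "suffix3": tok[-3:],
        "is_digit": tok.isdigit(),
        "prev_word": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }

feats = token_features(["Barack", "Obama", "visited", "Paris"], 0)
```

Every such feature must be designed, debugged, and maintained by hand; pretrained models learn equivalent (and richer) signals directly from text, which is why the field moved on.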
3. A Hidden Problem
Deep models often “take shortcuts” by exploiting the easiest features, achieving high test‑set performance but failing in real‑world scenarios, revealing robustness issues.
4. Robustness Investigation for IE Tasks
The Entity Coverage Ratio (ECR, ρ) is defined to evaluate robustness: for a test entity, ρ is the fraction of its training-set occurrences that carry the test-set label, and C is its total number of training-set occurrences. Four regimes follow:
ρ = 1: the entity appears in training only with the label it has in the test set.
0 < ρ < 1: the entity appears in training with multiple labels, among them the test-set label.
ρ = 0, C ≠ 0: the entity appears in training, but never with its test-set label.
ρ = 0, C = 0: the entity is out-of-vocabulary (OOV), never seen in training.
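The four regimes fall out of one counting function. A minimal sketch, assuming training mentions are available as (entity, label) pairs (the function name and data layout are illustrative):

```python
from collections import Counter

def entity_coverage_ratio(train_mentions, entity, test_label):
    """Compute ECR rho for one test entity.

    train_mentions: list of (entity_string, label) pairs from the training set.
    Returns (rho, C), where C is the entity's total training-set count and
    rho is the fraction of those occurrences carrying the test-set label.
    """
    counts = Counter(label for ent, label in train_mentions if ent == entity)
    c = sum(counts.values())          # C: total training occurrences
    if c == 0:
        return 0.0, 0                 # OOV regime: rho = 0, C = 0
    return counts[test_label] / c, c  # fraction carrying the test-set label

train = [("Paris", "LOC"), ("Paris", "LOC"), ("Paris", "PER"), ("Berlin", "LOC")]
r1 = entity_coverage_ratio(train, "Berlin", "LOC")  # consistent: rho = 1
r2 = entity_coverage_ratio(train, "Paris", "LOC")   # multiple labels: 0 < rho < 1
r3 = entity_coverage_ratio(train, "Tokyo", "LOC")   # OOV: rho = 0, C = 0
```

Binning every test entity by (ρ, C) and reporting per-bin accuracy reproduces the analysis described below.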
Experiments show that when label consistency breaks or OOV occurs, model accuracy drops sharply, confirming that BERT also suffers from shortcut behavior.
5. Detecting BERT’s Robustness
Heuristic word replacement can generate adversarial samples cheaply, but the resulting text is often unnatural; this low realism limits its usefulness and casts doubt on conclusions drawn from such evaluations.
6. Unified Multilingual Robustness Evaluation Tool – TextFlint
TextFlint offers broad coverage (20 general transformations plus 60 task-specific ones), transformations designed to remain acceptable to human readers, and built-in analysis capabilities for robustness assessment.
7. Improving NER Robustness
Demonstrated that BERT's accuracy drops severely under linguistically valid perturbations, showing it is not robust.
Identified systematic differences between academic benchmarks (high regularity, high entity-mention rate) and open-domain data.
Proposed perturbations: NP (same replacement), MP (different replacements), and CR/MR (context reduction / mention reduction).
Observed that NP and MP cause large accuracy drops, indicating that models rely on memorizing entity mentions rather than on context.
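A replacement-style perturbation only needs the mention span and a substitute string; the NP/MP distinction lies in how the substitute is chosen (the same replacement everywhere versus varied replacements), not in the splicing itself. A minimal sketch with an illustrative helper name:

```python
def perturb_mention(tokens, span, replacement_tokens):
    """Splice a substitute mention into a tokenized sentence.

    span: (start, end) token indices of the original entity mention,
    end-exclusive. NP-style perturbations reuse one substitute across the
    corpus; MP-style perturbations vary the substitute per occurrence.
    """
    start, end = span
    return tokens[:start] + replacement_tokens + tokens[end:]

sent = ["Obama", "visited", "Paris", "."]
perturbed = perturb_mention(sent, (2, 3), ["Lyon"])
# A robust model should still tag the substitute as LOC from context alone.
```

If accuracy on such perturbed sentences collapses while the context is untouched, the model was recognizing the memorized string, not reading the sentence.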
8. Mutual Information‑Based Method
Introduce mutual information and conditional mutual information over the input X, the label Y, and the learned representation Z. Maximizing I(Z;Y) makes the representation predictive of the target, while minimizing I(X;Z|Y) discards input information that the target does not explain (i.e., noise); together they yield more robust representations.
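One common way to write this trade-off as a single training objective (a sketch; the balancing coefficient β is my notation, not named in the talk):

```latex
\min_{\theta} \; \mathcal{L}(\theta)
  \;=\; -\, I(Z;Y) \;+\; \beta \, I(X;Z \mid Y)
```

The first term rewards representations Z that carry label-relevant information; the second penalizes whatever information about X remains in Z once Y is known, which is exactly the "shortcut" signal the model should not depend on.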
9. Improving Relation Extraction Robustness
Distant supervision assumes that every sentence containing an entity pair expresses their relation, an assumption that inevitably introduces label noise.
Noise-reduction methods: assumption-based filtering, attention mechanisms, and reinforcement-learning-based dynamic identification of negative examples.
Shift from positive-only training to negative training: treat statements the sentence does not express as explicit negatives, so the model learns to distinguish noise without sacrificing data volume.
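The core of negative training is the direction of the gradient: instead of pushing up the probability of a possibly noisy positive label, push down the probability of a label the sentence is assumed not to express. A minimal pure-Python sketch of that idea (not the exact loss from any specific paper):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def negative_training_loss(logits, complementary_label):
    """Negative-training objective for one sentence.

    complementary_label: a relation the sentence is assumed NOT to express.
    Minimizing -log(1 - p) drives that relation's probability toward zero,
    rather than forcing a noisy positive label's probability toward one.
    """
    p = softmax(logits)[complementary_label]
    return -math.log(1.0 - p + 1e-12)

# Relation classifier scores for one sentence (illustrative numbers):
loss = negative_training_loss([2.0, 0.5, -1.0], complementary_label=1)
```

Because a distantly supervised positive label may be wrong but a sampled non-label is almost always genuinely absent, the complementary signal is far cleaner, and no sentences need to be thrown away.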
10. General Robustness Enhancement Techniques
Adversarial training improves robustness but is computationally inefficient because of adversarial-sample generation.
Flooding keeps the training loss hovering around a preset "flood level" to prevent overfitting, and has been shown effective on NLP tasks at negligible cost.
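Flooding is a one-line change to the loss (Ishida et al., 2020): below the flood level b, the sign of the gradient flips, so optimization ascends back toward b instead of driving the training loss to zero. A minimal sketch:

```python
def flooded_loss(loss, flood_level):
    """Flooding regularizer: flooded = |loss - b| + b.

    Above the flood level b the loss (and its gradient) is unchanged;
    below b the gradient flips sign, pushing the training loss back up
    toward b and preventing it from collapsing to zero (overfitting).
    """
    return abs(loss - flood_level) + flood_level

high = flooded_loss(0.9, 0.2)   # above the level: unchanged
low = flooded_loss(0.05, 0.2)   # below the level: pushed back up toward b
```

In a training loop, the only change is wrapping the criterion's output in `flooded_loss` before calling backward, which is why the technique is essentially free.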
Conclusion
The presented methods—mutual‑information‑guided representation learning, negative‑training frameworks, and flooding—significantly improve the robustness of IE models, especially under low‑resource and OOV conditions.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.