KECP: Enhancing Few-Shot Machine Reading Comprehension via Knowledge-Driven Prompt Tuning

KECP, a Knowledge‑Enhanced Contrastive Prompt‑tuning model presented at EMNLP 2022, targets few‑shot extractive question answering: it converts questions into masked statements, injects external knowledge through gated fusion, and trains with a contrastive objective alongside masked language modeling.

Alibaba Cloud Big Data AI Platform

Dec 8, 2022

Background

Machine Reading Comprehension (MRC) traditionally requires large amounts of annotated data to fine‑tune pretrained language models such as BERT. In extractive MRC, a passage and a question are given, and the answer is a text span within the passage. Conventional approaches add a sequence‑labeling or pointer‑network head on top of the encoder; because that head is trained from scratch, these models often overfit in low‑resource (few‑shot) scenarios.

Prompt‑tuning mitigates overfitting by reformulating downstream tasks as masked language modeling objectives, allowing the model to reuse pretrained knowledge.

Algorithm Overview

KECP (Knowledge‑Enhanced Contrastive Prompt‑tuning) combines prompt‑tuning with knowledge injection and contrastive learning to improve few‑shot extractive QA performance.

Model Input

The question is converted into a cloze‑style statement with [MASK] tokens. For example, the question "What was one of the Normans’ major exports?" becomes "[MASK] [MASK] [MASK] was one of the Normans’ major exports." The masked statement is concatenated with the passage to form a single input sequence.
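The conversion above can be illustrated with a small rule‑based sketch. It is not the authors' implementation: the function names `question_to_cloze` and `build_input`, the fixed number of mask slots, the wh‑word list, and the `[CLS]`/`[SEP]` layout are all assumptions made for illustration.

```python
# Hypothetical sketch: turn a wh-question into a cloze statement.
# The wh-word list and the fixed mask count are illustrative assumptions.
WH_WORDS = ("what", "who", "whom", "which", "when", "where", "why", "how")


def question_to_cloze(question: str, num_masks: int = 3) -> str:
    """Replace a leading wh-word with [MASK] slots and end with a period."""
    body = question.strip().rstrip("?").strip()
    first, _, rest = body.partition(" ")
    # Drop the wh-word only when something follows it; otherwise the
    # masks are simply prepended to the unchanged question text.
    if first.lower() in WH_WORDS and rest:
        body = rest
    masks = " ".join(["[MASK]"] * num_masks)
    return f"{masks} {body}."


def build_input(question: str, passage: str, num_masks: int = 3) -> str:
    """Concatenate the masked statement and the passage into one sequence."""
    statement = question_to_cloze(question, num_masks)
    return f"[CLS] {statement} [SEP] {passage} [SEP]"


if __name__ == "__main__":
    print(question_to_cloze("What was one of the Normans' major exports?"))
```

A real system would need to handle questions where the wh‑word is not in first position and would choose the number of mask slots to cover the longest expected answer.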

Knowledge‑Enhanced Semantic Representation

To compensate for limited training data, KECP injects external knowledge from a knowledge base (e.g., Wikidata5M). Entities in the passage are identified and their embeddings are fused with word embeddings via a gated unit, producing knowledge‑aware passage representations.
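The article does not give the exact form of the gated unit, so the following is only a minimal sketch of one common formulation: a sigmoid gate decides, per dimension, how much of the word embedding versus the entity embedding to keep. The weight matrices, the bias, and the function name `gated_fusion` are hypothetical, and plain Python lists stand in for tensors.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def gated_fusion(word: Vector, entity: Vector,
                 w_word: Matrix, w_ent: Matrix, bias: Vector) -> Vector:
    """Assumed gate: g = sigmoid(W_w w + W_e e + b); out = g*w + (1-g)*e."""
    zw = matvec(w_word, word)
    ze = matvec(w_ent, entity)
    gate = [sigmoid(a + b + c) for a, b, c in zip(zw, ze, bias)]
    return [g * w + (1.0 - g) * e for g, w, e in zip(gate, word, entity)]


if __name__ == "__main__":
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    # With zero weights and bias the gate is 0.5, so the output is the average.
    print(gated_fusion([1.0, 0.0], [0.0, 1.0], zeros, zeros, [0.0, 0.0]))
```

In this formulation a gate value near 1 keeps the original word embedding and a value near 0 substitutes the entity embedding, which lets the model ignore knowledge that does not help.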

To avoid knowledge noise, the enriched passage vectors are aggregated into a few selected tokens (e.g., key nouns) in the question using self‑attention.
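One way to picture this aggregation step is scaled dot‑product attention from a selected question token over the passage vectors, with the attended context added back to the token. The residual addition, the scaling, and the name `aggregate_into_token` are assumptions for illustration; the article only states that self‑attention is used.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_into_token(query: Vector, passage_vectors: List[Vector]) -> Vector:
    """Attend from one question token over passage vectors, then add the context."""
    scale = math.sqrt(len(query))
    scores = [sum(q * p for q, p in zip(query, vec)) / scale
              for vec in passage_vectors]
    weights = softmax(scores)
    context = [sum(w * vec[i] for w, vec in zip(weights, passage_vectors))
               for i in range(len(query))]
    # Residual update (an assumption): the token keeps its own representation.
    return [q + c for q, c in zip(query, context)]


if __name__ == "__main__":
    print(aggregate_into_token([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

Only the selected question tokens would be updated this way, so the remaining tokens stay untouched by the injected knowledge.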

Contrastive Learning Enhanced Training

The fused representations are fed into BERT and trained with the standard Masked Language Modeling (MLM) objective. Additionally, a contrastive loss is added: the ground‑truth answer serves as the positive sample, while incorrectly retrieved entities from the knowledge base act as negatives.
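The article does not spell out the contrastive loss, so the sketch below uses a standard InfoNCE‑style form as a stand‑in: the score of the ground‑truth answer is pushed above the scores of the negative entities. The function name `contrastive_loss`, the scalar scores, and the `temperature` parameter are hypothetical.

```python
import math
from typing import List


def contrastive_loss(pos_score: float, neg_scores: List[float],
                     temperature: float = 1.0) -> float:
    """InfoNCE-style loss: -log(exp(s+/t) / (exp(s+/t) + sum_j exp(s_j-/t)))."""
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    m = max(logits)  # log-sum-exp with max subtraction for stability
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]


if __name__ == "__main__":
    # The loss shrinks as the positive score rises above the negatives.
    print(contrastive_loss(0.0, [0.0]))
    print(contrastive_loss(5.0, [0.0]))
```

In practice the scores would come from the model, for example as the likelihood it assigns to each candidate span at the [MASK] positions.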

KECP jointly minimizes MLM and contrastive losses to obtain the final QA model.
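A joint objective of this kind is commonly written as a weighted sum. The weighting coefficient $\lambda$ below is an assumption; the article states only that the two losses are minimized together.

```latex
\mathcal{L} \;=\; \mathcal{L}_{\mathrm{MLM}} \;+\; \lambda\, \mathcal{L}_{\mathrm{CL}}, \qquad \lambda \ge 0
```

Here $\mathcal{L}_{\mathrm{MLM}}$ is the masked language modeling loss over the [MASK] positions and $\mathcal{L}_{\mathrm{CL}}$ is the contrastive loss; setting $\lambda = 1$ gives a plain unweighted sum.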

Evaluation

KECP was evaluated on several standard MRC datasets by randomly sampling 16 training examples per dataset. The results show that KECP consistently outperforms baseline methods, demonstrating its effectiveness in few‑shot settings.
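The few‑shot setup described above can be reproduced in spirit with a seeded random draw. This is a sketch, not the authors' evaluation script: the function name `sample_few_shot` and the seed value are assumptions.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_few_shot(examples: Sequence[T], k: int = 16, seed: int = 42) -> List[T]:
    """Draw k training examples without replacement, reproducibly."""
    rng = random.Random(seed)  # local generator, leaves global state untouched
    # random.Random.sample raises ValueError if k exceeds len(examples).
    return rng.sample(list(examples), k)


if __name__ == "__main__":
    train_set = [f"example-{i}" for i in range(1000)]  # placeholder data
    subset = sample_few_shot(train_set)
    print(len(subset))
```

Because few‑shot results vary noticeably with the drawn subset, it is common to repeat the draw with several seeds and report the average.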

Future work includes extending KECP to generative models such as BART and T5, and releasing the code in the EasyNLP framework for the NLP community.

References

Jianing Wang, Chengyu Wang, Minghui Qiu, Qiuhui Shi, Hongbin Wang, Jun Huang, Ming Gao. KECP: Knowledge‑Enhanced Contrastive Prompting for Few‑shot Extractive Question Answering. EMNLP 2022.

Chengyu Wang et al. EasyNLP: A Comprehensive and Easy‑to‑use Toolkit for Natural Language Processing. EMNLP 2022.

Xiang Lisa Li, Percy Liang. Prefix‑Tuning: Optimizing Continuous Prompts for Generation. ACL/IJCNLP 2021.

Ori Ram et al. Few‑Shot Question Answering by Pretraining Span Selection. ACL/IJCNLP 2021.

Rakesh Chada, Pradeep Natarajan. FewshotQA: A simple framework for few‑shot learning of question answering tasks using pre‑trained text‑to‑text models. EMNLP 2021.

Mandar Joshi et al. SpanBERT: Improving Pre‑training by Representing and Predicting Spans. TACL 2020.

Xiao Liu et al. P‑Tuning v2: Prompt Tuning Can Be Comparable to Fine‑tuning Universally Across Scales and Tasks. arXiv 2021.

Paper Information

Title: KECP: Knowledge‑Enhanced Contrastive Prompting for Few‑shot Extractive Question Answering

Authors: Wang Jianing, Wang Chengyu, Qiu Minghui, Shi Qiuhui, Wang Hongbin, Huang Jun, Gao Ming

PDF: https://arxiv.org/abs/2205.03071

