Boosting Cross-Lingual Machine Reading Comprehension with X-STA: A New Knowledge Transfer Approach

The X-STA algorithm, introduced by Alibaba Cloud’s PAI and researchers from South China University of Technology, leverages gradient‑decomposed knowledge sharing, teacher‑guided attention, and multi‑level alignment to enhance cross‑lingual machine reading comprehension, achieving state‑of‑the‑art results on three multilingual MRC benchmarks.

Alibaba Cloud Big Data AI Platform

Background

Large-scale pretrained language models have dramatically improved performance on NLP tasks, but traditional MRC still requires abundant annotated data, which low‑resource languages lack. Most MRC datasets are in English, and linguistic differences between English and other languages (e.g., Japanese, Chinese, Hindi, Arabic) hinder model performance when knowledge is transferred across languages.

Existing work often uses machine‑translation‑based data augmentation, translating source‑language data into target languages. However, translation can shift answer spans, preventing direct use of source‑language answer distributions to supervise the target language.

Algorithm Overview

The proposed X-STA method follows three principles: sharing, teaching, and aligning. For sharing, a gradient‑decomposed knowledge‑sharing technique extracts knowledge from parallel language pairs to enrich target‑language understanding while avoiding degradation of source representations. For teaching, an attention mechanism searches the target‑language context for answer spans semantically similar to the source‑language answer, calibrating the output. For aligning, multi‑level alignment further strengthens cross‑lingual transfer.
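To make the "teaching" idea more concrete, here is a minimal sketch of one way such an attention step could work: a vector for the source‑language answer is scored against every target‑language context token, and a softmax turns the scores into weights that highlight semantically similar positions. This is an illustration under assumptions, not the authors' implementation; the function names (`teacher_attention`, `dot`, `softmax`), the scaled dot‑product scoring, and the toy 2‑dimensional embeddings are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def teacher_attention(source_answer_vec, target_token_vecs):
    """Weights of the source answer over target context tokens.

    Assumption: scaled dot-product scoring; the source article does not
    specify the scoring function.
    """
    scale = math.sqrt(len(source_answer_vec))
    scores = [dot(source_answer_vec, t) / scale for t in target_token_vecs]
    return softmax(scores)


# Toy embeddings, made up purely for illustration.
source_answer = [1.0, 0.0]
target_tokens = [[0.1, 0.9], [0.9, 0.1], [0.0, 1.0]]

weights = teacher_attention(source_answer, target_tokens)
# Index of the target token most similar to the source answer.
best = max(range(len(weights)), key=weights.__getitem__)
```

In a real model the vectors would come from the encoder's hidden states, and the resulting weights would be used to calibrate the target‑language output rather than to pick a single token.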

[Figure: X-STA model architecture]

Given a context C and a question Q, the MRC task is to extract a sub‑sequence of C as the answer. The input is a sequence of length N. The model predicts two probability distributions over the N positions, one for the start and one for the end of the answer; for simplicity, the two are concatenated into a single vector. The gold label is represented in the same way, as a one‑hot vector for each of the two positions.
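The representation described above can be sketched in a few lines: start and end logits are each normalized with a softmax, the two distributions are concatenated into a vector of length 2N, and the label is built as a matching pair of one‑hot vectors. The cross‑entropy loss shown here is an assumption (it is the usual choice for extractive MRC, but the article does not state the loss), and the names `span_loss`, `start_idx`, and `end_idx` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def span_loss(start_logits, end_logits, start_idx, end_idx):
    """Return (p, y, loss) for one example.

    p: concatenated start/end probability distributions, length 2N.
    y: concatenated one-hot labels, length 2N.
    loss: cross-entropy between y and p (assumed, not from the article).
    """
    n = len(start_logits)
    p = softmax(start_logits) + softmax(end_logits)  # list concatenation

    y = [0.0] * (2 * n)
    y[start_idx] = 1.0      # one-hot start position
    y[n + end_idx] = 1.0    # one-hot end position, offset by N

    loss = -sum(math.log(p_i) for y_i, p_i in zip(y, p) if y_i > 0)
    return p, y, loss


# Toy logits for a sequence of length N = 4; the answer spans tokens 0..2.
p, y, loss = span_loss(
    start_logits=[2.0, 0.5, 0.1, -1.0],
    end_logits=[-1.0, 0.2, 3.0, 0.0],
    start_idx=0,
    end_idx=2,
)
```

Because each half of `p` is its own distribution, the concatenated vector sums to 2 rather than 1.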

The overall procedure is as follows. Each datum contains a question Q and a context paragraph C.

1. Translate the source‑language training data into each target language, and translate the target‑language test data back into the source language.

2. Construct parallel language pairs {source‑train, target‑train} and train the model via back‑propagation.

3. Feed the parallel pairs {source‑test, target‑test} into the model to obtain answer predictions.
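The data preparation in the steps above can be sketched as follows. This is only an illustration of how the parallel pairs might be assembled: the `Example` class, `build_parallel_pairs`, and especially `fake_translate` (a stand‑in that merely tags text instead of calling a real machine translation system) are hypothetical and not part of X-STA or EasyNLP.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Example:
    question: str
    context: str
    lang: str


def build_parallel_pairs(
    examples: List[Example],
    translate: Callable[[str, str, str], str],
    to_lang: str,
) -> List[Tuple[Example, Example]]:
    """Pair each example with its translation as (original, translated)."""
    pairs = []
    for ex in examples:
        translated = Example(
            question=translate(ex.question, ex.lang, to_lang),
            context=translate(ex.context, ex.lang, to_lang),
            lang=to_lang,
        )
        pairs.append((ex, translated))
    return pairs


def fake_translate(text: str, src: str, tgt: str) -> str:
    """Hypothetical stand-in for an MT system: only tags the text."""
    return f"[{tgt}] {text}"


# Training: source-language data is translated into the target language.
train_en = [Example("Who wrote Hamlet?", "Hamlet was written by Shakespeare.", "en")]
train_pairs = build_parallel_pairs(train_en, fake_translate, "de")

# Testing: target-language data is translated back into the source language,
# then reordered so every pair reads (source, target).
test_de = [Example("Wer schrieb Hamlet?", "Hamlet wurde von Shakespeare geschrieben.", "de")]
test_pairs = [
    (back_translated, original)
    for original, back_translated in build_parallel_pairs(test_de, fake_translate, "en")
]
```

Keeping both directions in the same (source, target) order means the model can consume training and test pairs through one code path.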

Algorithm Accuracy Evaluation

Experiments on three cross‑lingual MRC datasets demonstrate that X-STA significantly improves accuracy.

Module‑wise analysis shows that each component contributes positively to the model.

Open‑Source Release

The source code will be contributed to the EasyNLP framework, inviting NLP practitioners and researchers to use it.

EasyNLP repository: https://github.com/alibaba/EasyNLP

References

Chengyu Wang, Minghui Qiu, Taolin Zhang, et al. EasyNLP: A Comprehensive and Easy‑to‑use Toolkit for Natural Language Processing. EMNLP 2022.

Pranav Rajpurkar et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text. EMNLP 2016.
