Tagged articles
2 articles
DataFunTalk
Jul 18, 2025 · Artificial Intelligence

How Alibaba Tackles Low-Resource Language Data for Multilingual LLMs

A senior data science expert at Alibaba International explains a systematic five-strategy solution (data acquisition, augmentation, quality optimization, engineering pipeline, and evaluation loop) for overcoming data scarcity, high annotation cost, and processing challenges in low-resource languages for multilingual large language models.

AI · Model Evaluation · data engineering
13 min read
DataFunSummit
Jul 13, 2025 · Artificial Intelligence

How Alibaba Tackles Low-Resource Language Data for Multilingual LLMs

In this interview, Li Haijun, a senior data science expert at Alibaba International, explains the challenges that low-resource languages pose for multilingual large models and details a five-step framework covering data collection, augmentation, quality optimization, engineering, and evaluation, which powers the company's cross-border e-commerce AI applications.

AI · large language models · low-resource languages
12 min read