Tsinghua University Report Ranks Baidu Wenxin Yiyan First Among Chinese Large Language Models
A Tsinghua University evaluation of seven large language models found that Baidu's Wenxin Yiyan earned the highest overall score among Chinese models across 20 metrics. It ranked first in Chinese semantic understanding, tied GPT‑4 for first place in safety and compliance, and surpassed ChatGPT overall. The article also covers the model's training and inference speedups and its broad industry adoption.
The team led by Professor Shen Yang at Tsinghua University's School of Journalism and Communication released the "Comprehensive Performance Evaluation Report of Large Language Models". According to the report, Baidu's Wenxin Yiyan achieved the highest overall score among domestic models across 20 indicators, surpassed ChatGPT, and ranked first in Chinese semantic understanding.
Professor Shen Yang said that Baidu's release of Wenxin Yiyan in March brought China into the frontier AI competition, and that the model's capabilities, especially its Chinese semantic understanding, are impressive.
The evaluation covered seven large language models: GPT‑4, ChatGPT 3.5, Wenxin Yiyan, Tongyi Qianwen, iFlytek Spark, Claude, and TianGong, assessing generation quality, usage & performance, and safety & compliance across 20 metrics such as context understanding, Chinese semantic understanding, misinformation detection, logical reasoning, content safety, and privacy protection.
In generation quality, Wenxin Yiyan scored 76.98%, second only to GPT‑4 and far ahead of the other models. In Chinese semantic understanding it scored 92%, the highest on the list, ahead of GPT‑4 and iFlytek Spark.
In safety & compliance, Wenxin Yiyan scored 78.18%, tying GPT‑4 for first place and showing strong content safety, privacy protection, and copyright awareness.
Baidu has built a full AI stack spanning chips, frameworks, models, and applications. Its self‑developed deep‑learning platform, PaddlePaddle, supports efficient training and inference. The latest version of the Wenxin model, 3.5, brings a 50% improvement in model effectiveness, 2× faster training, and 30× faster inference.
The model is being applied across industries. Baidu has partnered with State Grid, SPDB, Taikang, Geely, and others to launch 11 industry‑specific models. Wenxin Yiyan currently has the widest enterprise adoption among domestic models, with 150,000 companies testing it in more than 400 scenarios.
Baidu Tech Salon
Baidu Tech Salon, organized by Baidu's Technology Management Department, is a monthly offline event that shares cutting‑edge tech trends from Baidu and the industry, providing a free platform for mid‑to‑senior engineers to exchange ideas.