Baidu Unveils PLATO-XL: An 11‑Billion‑Parameter Bilingual Dialogue Generation Model

Baidu's newly released PLATO‑XL, an 11‑billion‑parameter pre‑trained dialogue model for Chinese and English, is larger than previous open‑domain dialogue models, introduces multi‑role awareness for more consistent multi‑turn conversations, and reports state‑of‑the‑art performance across open‑domain, knowledge‑grounded, and task‑oriented dialogue tasks.

DataFunTalk

Sep 22, 2021

What does a barrier‑free conversation with AI feel like? This article explores that experience by introducing Baidu's latest dialogue generation model, PLATO‑XL.

Baidu recently announced PLATO‑XL, an 11‑billion‑parameter dialogue model for Chinese and English. At release it was the largest open‑domain conversational model, surpassing Facebook's 9.4‑billion‑parameter Blender.

PLATO‑XL is the first pre‑trained dialogue model for Chinese and English to exceed 10 billion parameters, further advancing open‑domain dialogue capabilities.

Although many large‑parameter models have emerged in natural language processing, challenges such as proactive behavior and commonsense reasoning in multi‑turn open‑domain dialogue remain unresolved.

Baidu's NLP team first released the PLATO model in October 2019 and presented it at ACL 2020. Later in 2020 the team upgraded it to the 1.6‑billion‑parameter PLATO‑2, covering both Chinese and English.

The newly released PLATO‑XL pushes the parameter count to 11 billion, making it the largest Chinese and English dialogue generation model to date.

Paper: PLATO‑XL: Exploring the Large‑scale Pre‑training of Dialogue Generation – https://arxiv.org/abs/2109.09519

Achieving human‑like logical, knowledgeable, and emotional conversations is a core challenge for intelligent interaction; open‑domain dialogue is essential for empathetic companions, assistants, and other AI applications.

Pre‑training dramatically improves a model's ability to learn from massive unlabeled data, making efficient utilization of large corpora a primary research direction.

From Google's Meena and Facebook's Blender to Baidu's PLATO series, open‑domain dialogue quality has steadily improved, with PLATO‑2 achieving top rankings in five DSTC‑9 tasks.

PLATO‑XL inherits the unified transformer architecture, enabling joint modeling of dialogue understanding and response generation with high parameter efficiency. Its flexible attention mechanism encodes context bidirectionally and decodes responses autoregressively, while the unified transformer reduces padding waste and speeds up training.
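The flexible attention mechanism described above can be illustrated with a small mask builder. This is a minimal sketch in plain Python, not PLATO‑XL's actual PaddlePaddle implementation; the function name and the 0/1 list‑of‑lists layout are assumptions chosen for readability. Entry [i][j] is 1 when position i may attend to position j.

```python
from typing import List


def build_unified_attention_mask(context_len: int, response_len: int) -> List[List[int]]:
    """Build a mask where context is bidirectional and the response is causal.

    Hypothetical helper for illustration only.
    """
    total = context_len + response_len
    mask = []
    for query in range(total):
        row = []
        for key in range(total):
            if key < context_len:
                # Every position (context or response) sees the whole context.
                allowed = True
            elif query < context_len:
                # Context positions never look ahead into the response.
                allowed = False
            else:
                # Response positions see earlier response tokens and themselves.
                allowed = key <= query
            row.append(1 if allowed else 0)
        mask.append(row)
    return mask


if __name__ == "__main__":
    for row in build_unified_attention_mask(context_len=3, response_len=2):
        print(row)
```

With three context tokens and two response tokens, the first three rows are [1, 1, 1, 0, 0], and the last two are [1, 1, 1, 1, 0] and [1, 1, 1, 1, 1]. Because context and response share one sequence and one set of weights, a single mask like this is enough to get both behaviors.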

To mitigate contradictory responses, PLATO‑XL introduces multi‑role‑aware input representations, distinguishing speakers in multi‑party social media dialogues and producing more coherent, consistent replies.
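One way to picture multi‑role‑aware input is as an extra role embedding added to each token's other embeddings, with the speaker who is about to respond always mapped to the same role. The sketch below is an illustration under assumptions: the helper names are hypothetical, and numbering the other speakers by order of first appearance is a simplification rather than the paper's exact rule.

```python
from typing import Dict, List, Tuple


def assign_role_ids(context: List[Tuple[str, str]], responder: str) -> List[int]:
    """Return one role id per context utterance; role 0 is the responder.

    Other speakers get ids 1, 2, ... in order of first appearance
    (an assumption made for this sketch).
    """
    role_of: Dict[str, int] = {responder: 0}
    role_ids = []
    for speaker, _utterance in context:
        if speaker not in role_of:
            role_of[speaker] = len(role_of)
        role_ids.append(role_of[speaker])
    return role_ids


def sum_embeddings(token: List[float], position: List[float],
                   segment: List[float], role: List[float]) -> List[float]:
    """Element-wise sum of the four embedding vectors for one token."""
    return [t + p + s + r for t, p, s, r in zip(token, position, segment, role)]


if __name__ == "__main__":
    thread = [
        ("alice", "Anyone tried the new cafe?"),
        ("bob", "Yes, the coffee is great."),
        ("carol", "I found it too crowded."),
        ("alice", "Good to know, thanks."),
    ]
    # The model is about to reply as bob, so bob's earlier turn gets role 0.
    print(assign_role_ids(thread, responder="bob"))
```

Tagging the responder's earlier turns with the same role gives the model a direct signal about which statements are "its own", which is the information it needs to avoid contradicting them.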

The model has 11 billion parameters and is pre‑trained on a corpus of hundreds of billions of tokens. It is built entirely on Baidu's PaddlePaddle platform, using FleetX's recompute and sharded data parallelism strategies to train on a high‑performance GPU cluster.
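A rough back‑of‑the‑envelope calculation shows why sharding matters at this scale. The sketch below assumes a common mixed‑precision Adam setup with 16 bytes of training state per parameter, and assumes that all of this state is split evenly across devices. Both are illustrative assumptions, not PLATO‑XL's reported configuration, and activation memory (which recompute targets) is not modeled.

```python
# Bytes of training state per parameter under mixed-precision Adam.
# This breakdown is a widely used rule of thumb, assumed here for illustration.
BYTES_PER_PARAM = {
    "fp16_weights": 2,
    "fp16_gradients": 2,
    "fp32_master_weights": 4,
    "adam_momentum": 4,
    "adam_variance": 4,
}


def training_state_gib(num_params: int, num_shards: int = 1) -> float:
    """Per-device training state in GiB if state is sharded evenly."""
    total_bytes = num_params * sum(BYTES_PER_PARAM.values())
    return total_bytes / num_shards / 2**30


if __name__ == "__main__":
    params = 11_000_000_000  # parameter count reported in the PLATO-XL paper
    for shards in (1, 8, 64):  # hypothetical device counts
        print(f"{shards:>3} shard(s): {training_state_gib(params, shards):8.2f} GiB per device")
```

Under these assumptions the unsharded state is roughly 164 GiB, far more than a single GPU holds, while splitting it across 64 devices brings it to under 3 GiB each. Recompute complements this by discarding intermediate activations in the forward pass and recomputing them during the backward pass, trading extra compute for memory.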

Extensive evaluations show that PLATO‑XL outperforms open‑source Chinese and English dialogue models such as Blender, DialoGPT, and Tsinghua's EVA in self‑chat tests, and that it also surpasses commercial chatbots across multiple dialogue tasks.
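In a self‑chat test, a model plays both sides of a conversation and the resulting logs are then judged, typically by human annotators. The loop below is a minimal sketch of that procedure; `respond` stands in for any dialogue model, and `toy_bot` is a hypothetical placeholder, not a real PLATO‑XL interface.

```python
from typing import Callable, List


def self_chat(respond: Callable[[List[str]], str], opening: str, num_turns: int) -> List[str]:
    """Let one model talk to itself for num_turns turns after an opening line."""
    history = [opening]
    for _ in range(num_turns):
        # The same model produces every turn, conditioned on the full history.
        history.append(respond(history))
    return history


def toy_bot(history: List[str]) -> str:
    """Placeholder model: echoes the last utterance with a turn counter."""
    return f"reply {len(history)} to: {history[-1]}"


if __name__ == "__main__":
    for turn in self_chat(toy_bot, opening="Hello!", num_turns=3):
        print(turn)
```

The collected transcripts are what evaluators score on criteria such as coherence and consistency; the loop itself only generates the material to be judged.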

Beyond open‑domain chit‑chat, PLATO‑XL excels in knowledge‑grounded and task‑oriented dialogues, achieving leading performance on a variety of benchmarks.

Scaling experiments reveal a stable positive correlation between model size (from 93 million to 11 billion parameters) and dialogue quality.

In both English and Chinese multi‑turn conversations, PLATO‑XL can engage users with logical, content‑rich, and entertaining dialogues.

Conclusion: Enabling natural language interaction is a fundamental AI goal; PLATO‑XL represents a significant step in large‑scale open‑domain dialogue research, and future models are expected to become even more human‑like and knowledgeable.

Developers can try the latest Chinese PLATO model, at the 10‑billion‑parameter scale, via Baidu's open API. For more details, contact [email protected] or explore Baidu Brain UNIT.

English demo: https://nlp.baidu.com/special/plato/englishDemo

Chinese demo: scan the QR code for the "Baidu PLATO" WeChat public account.
