
Qianfan Large Model Platform: Making Large Models Accessible - Baidu's Latest Work on Model Fine-tuning and Deployment

Baidu’s Qianfan Large Model Platform provides a one‑stop enterprise solution with 54 pre‑installed models, advanced fine‑tuning, comprehensive evaluation metrics, and optimized deployment that cuts costs up to 90% and boosts throughput 3‑5×, enabling rapid, affordable AI application development.

Baidu Geek Talk

This article is based on the keynote speech by Baidu's AI & Big Data Platform General Manager Xinzhou at the 2023 Baidu Cloud Intelligence Conference. It introduces the latest developments in Baidu's Qianfan Large Model Platform, focusing on model fine-tuning and deployment capabilities.

To build successful AI native applications, three key factors are essential: the quality of the base large model, the effectiveness of model tuning based on business data and feedback, and the approach to large model development and application.

Qianfan Large Model Platform is the world's first one-stop enterprise-level large model platform. Since its launch on March 27, it has served over 40,000 users and supported nearly 10,000 model fine-tuning runs. The platform offers 54 pre-installed models, the most in China, including ERNIE-Bot 4.0, third-party open-source and closed-source models, and industry-specific models such as ChatLaw and Du Xiaoman's XuanYuan (轩辕) financial model.

Key challenges addressed by the platform include: (1) Model fine-tuning difficulties - ensuring reliability while improving specific scenario performance without forgetting general knowledge; (2) Comprehensive model evaluation - addressing the subjective nature of generative AI evaluation; (3) Insufficient computing and inference resources.

The platform has made significant upgrades in efficiency and cost reduction. New SFT support includes Llama2 13B, ChatGLM2-6B, Baichuan 2, Stable Diffusion XL, and SQLCoder. The platform provides data analysis, cleaning, and auto-annotation capabilities. For model evaluation, it introduces metrics including satisfaction, truthfulness, creativity, and comprehensiveness, supporting both automatic and manual evaluation mechanisms.

In a real customer case for online customer service Q&A with 20,000+ conversation pairs, self-built platforms required 600,000 RMB and 100 days, while Qianfan Platform completed the same task in 20 days at only 60,000 RMB - a 90% cost reduction.
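The quoted savings come down to simple arithmetic; the snippet below merely restates the case-study figures to make the comparison explicit:

```python
# Figures from the customer-service Q&A case study (20,000+ conversation pairs)
self_built_cost, qianfan_cost = 600_000, 60_000  # RMB
self_built_days, qianfan_days = 100, 20

cost_reduction = 1 - qianfan_cost / self_built_cost  # fraction of cost saved
speedup = self_built_days / qianfan_days             # delivery-time ratio

assert round(cost_reduction, 2) == 0.9  # the quoted 90% cost reduction
assert speedup == 5.0                   # 5x faster delivery
```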

For model deployment, the platform uses three optimization approaches: model compression (quantization and sparsification), lossless inference acceleration, and adaptation to low-cost hardware. After optimization, memory usage drops by 50-60% and throughput improves 3-5x, substantially lowering deployment costs.
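To illustrate the quantization step in model compression, here is a minimal sketch of symmetric int8 weight quantization in plain Python. This is a simplified, hypothetical example for intuition only; the platform's actual compression pipeline is not described in the talk:

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: q = round(w / scale), scale = max|w| / 127."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values and the shared scale."""
    return [x * scale for x in q]

w = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# int8 storage is 4x smaller than float32; reconstruction error is bounded
# by half a quantization step, which is why inference quality is preserved
assert max(abs(a - b) for a, b in zip(w, w_hat)) <= scale / 2
```

Sparsification and kernel-level acceleration work on top of such compressed representations, which is where the combined memory and throughput gains come from.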

The platform also introduces model routing mode, which assigns different difficulty problems to different models. In personal assistant scenarios, this approach reduces costs by 30% while maintaining comparable model performance.
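The routing idea can be sketched as a dispatcher that scores each query's difficulty and sends easy queries to a cheaper model. The scorer, model names, costs, and threshold below are all hypothetical placeholders, not Qianfan's implementation:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Route:
    name: str
    cost_per_call: float          # hypothetical relative cost
    handler: Callable[[str], str]

def difficulty(query: str) -> float:
    """Toy difficulty score: longer, analytical queries count as harder."""
    score = min(len(query) / 200, 1.0)
    if any(k in query.lower() for k in ("why", "compare", "analyze")):
        score += 0.5
    return min(score, 1.0)

small = Route("small-model", 1.0, lambda q: f"[small] {q}")
large = Route("large-model", 10.0, lambda q: f"[large] {q}")

def route(query: str, threshold: float = 0.5) -> Route:
    """Send hard queries to the strong model, everything else to the cheap one."""
    return large if difficulty(query) >= threshold else small

assert route("What time is it?").name == "small-model"
assert route("Compare the two deployment strategies and analyze trade-offs").name == "large-model"
```

Because most assistant traffic is simple, diverting it to the cheap route lowers the blended cost per call, which is consistent with the 30% savings cited for the personal assistant scenario.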

Tags: inference optimization, model deployment, model fine-tuning, cost optimization, model evaluation, AI native applications, Baidu Qianfan, large model platform