Baidu’s QCon 2021 Highlights: Elastic Scaling, Search Architecture, AI Chips
This article compiles Baidu engineers' QCon 2021 talks, covering micro‑service evolution, large‑scale container elastic scaling, search system elasticity, AI‑chip deployment at massive scale, and cost‑focused monitoring, each with abstracts, outlines and key takeaways for practitioners.
Baidu Large‑Scale Container Orchestration and Elastic Scaling
Abstract: With micro‑services and cloud‑native adoption accelerating, this talk explores how Baidu optimizes resource efficiency and overall cost for massive container fleets. It presents a data‑driven elastic scaling framework, automated policies, and event‑driven mechanisms that support million‑scale adjustments across products such as Search, Feed, and Baidu App.
Background & Trends
Micro‑services and cloud‑native momentum
Service governance and resource cost challenges
Elastic Mechanism Technology Selection
Industry research: open‑source vs. self‑built trade‑offs
Implementation path and cross‑team collaboration
Baidu Elastic Scaling System Design
Framework design: goals, layers, principles
System components: data collection, automated policies, event‑driven engine
Advanced strategies: traffic scheduling, premium container placement, time‑sharing reuse
Business Elastic Scaling Practices
Multi‑level elasticity for diverse scenarios
Extreme elasticity (Serverless) for high‑speed demands
Impact on machine cost, resource efficiency, service stability, business metrics
Lessons learned from real‑world deployments
Summary & Outlook
Broader business scenario adoption
Future service governance roadmap
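The outline above names data collection and automated policies but does not spell out a concrete scaling rule. As a minimal sketch, assuming a simple proportional rule of the kind used by the Kubernetes Horizontal Pod Autoscaler (not necessarily what Baidu's system does), a policy could turn an observed utilization into a replica count as follows. The names ScalingPolicy and desired_replicas are hypothetical, and utilization is expressed in integer percent to avoid floating-point rounding surprises.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy object; all field names are illustrative."""
    target_pct: int          # desired average utilization, in percent
    min_replicas: int
    max_replicas: int
    tolerance_pct: int = 10  # ignore deviations within 10% of the target


def desired_replicas(current: int, observed_pct: int, policy: ScalingPolicy) -> int:
    """Proportional rule: replicas scale with observed / target utilization."""
    # Inside the tolerance band, keep the current size to avoid flapping.
    if abs(observed_pct - policy.target_pct) * 100 <= policy.tolerance_pct * policy.target_pct:
        return current
    # Integer ceiling of current * observed / target.
    proposed = (current * observed_pct + policy.target_pct - 1) // policy.target_pct
    # Clamp to the configured bounds.
    return max(policy.min_replicas, min(policy.max_replicas, proposed))


if __name__ == "__main__":
    policy = ScalingPolicy(target_pct=60, min_replicas=2, max_replicas=100)
    for observed in (30, 63, 90):
        print(observed, desired_replicas(10, observed, policy))
```

The tolerance band and the min/max clamp are the two design choices worth noting: the first suppresses churn from small metric fluctuations, the second keeps a faulty metric from scaling a service to zero or without bound.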
From Storage to Compute: Extreme Elasticity in Baidu Search Middleware
Abstract: Baidu Search middleware handles billions of daily queries across diverse scenarios. As the existing micro‑service architecture reached its limits in efficiency and cost, the team adopted a system‑wide elasticity approach that decouples data distribution, compute orchestration, and service topology. The result is up to 30% machine‑cost savings and half the manual effort.
Search Middleware Overview
Architecture with ~20 micro‑service modules covering content computation to online retrieval
Current automation and scaling capabilities
Challenges of Complex Heterogeneous Workloads
Business delivery flow constraints
Adaptation to evolving demand
Storage Elasticity Mechanisms
Data grouping, allocation, migration strategies
Intelligent data governance
Content Compute Elasticity
Adaptive data freshness guarantees
Smart function orchestration (FaaS)
Compute demands in search scenarios
Hyper‑Automated Delivery via Elastic Capabilities
Unified demand‑to‑operation workflow
Hyper‑automated delivery
Future Outlook
Standardized demand understanding
Low‑code platform for search
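The storage elasticity items above (data grouping, allocation, migration) are listed without implementation detail. One common way to keep migration small when capacity changes is consistent hashing, sketched below purely as an illustration; the talk does not state which allocation scheme Baidu uses. HashRing, stable_hash, migration_plan, and the node and shard names are all hypothetical.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Map a string to a 64-bit integer that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual points per node (illustrative only)."""

    def __init__(self, nodes, vnodes: int = 64):
        self._vnodes = vnodes
        self._points = []  # sorted list of (hash, node) virtual points
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self._vnodes):
            bisect.insort(self._points, (stable_hash(f"{node}#{i}"), node))

    def owner(self, shard: str) -> str:
        if not self._points:
            raise ValueError("ring has no nodes")
        # First virtual point clockwise from the shard's hash, wrapping around.
        idx = bisect.bisect_left(self._points, (stable_hash(shard), ""))
        return self._points[idx % len(self._points)][1]


def migration_plan(ring: HashRing, shards, new_node: str) -> dict:
    """Add new_node and return {shard: (old_owner, new_owner)} for moved shards."""
    before = {s: ring.owner(s) for s in shards}
    ring.add_node(new_node)
    after = {s: ring.owner(s) for s in shards}
    return {s: (before[s], after[s]) for s in shards if before[s] != after[s]}


if __name__ == "__main__":
    shards = [f"shard-{i}" for i in range(1000)]
    ring = HashRing(["node-a", "node-b", "node-c"])
    plan = migration_plan(ring, shards, "node-d")
    print(f"{len(plan)} of {len(shards)} shards would migrate")
```

The property this illustrates is that scaling out moves data only onto the new node; shards never shuffle between existing nodes, which bounds migration cost during an elastic resize.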
Large‑Scale Search Model Architecture Optimization
Abstract: Deploying massive deep‑learning models for Baidu Search on heterogeneous accelerators (GPU, Kunlun chips) incurs high operational costs. This talk details the architecture of large‑scale online models and several optimization practices, including lossless architectural refinements and offline compression techniques, aiming to balance performance gains with cost control.
Business and architecture evolution of large‑scale search models
"No‑Flaw" – lossless architectural optimization
"Tian‑Gong" – offline compression optimization
Future directions: architecture‑driven model improvements
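The source does not say which offline compression techniques "Tian‑Gong" applies. Post‑training quantization is one widely used family, and the sketch below shows its simplest form (symmetric int8 quantization of a weight vector) only to make the idea of trading a bounded precision loss for a smaller model concrete. The function names are hypothetical, and nothing here is taken from Baidu's implementation.

```python
def quantize_int8(weights):
    """Symmetric quantization: map floats to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One scale per tensor; fall back to 1.0 for empty or all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 codes."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.003, 1.0]
    codes, scale = quantize_int8(weights)
    restored = dequantize(codes, scale)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(codes, scale, worst)
```

Each restored weight differs from the original by at most half a quantization step (scale / 2), which is why this kind of compression is lossy, in contrast to the lossless architectural refinements listed under "No‑Flaw".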
Cloud‑AI Chip Massive Deployment at Baidu
Abstract: With AI chips becoming pivotal for inference workloads, Baidu showcases the Kunlun chip’s technical characteristics and large‑scale deployment in data‑center inference scenarios, highlighting end‑to‑end performance tuning, efficient mixed‑workload handling, and practical lessons from production.
AI chip background and industry trends
Kunlun architecture and key features
Large‑scale deployment experiences
Future work and roadmap
Cost‑Optimized Large‑Scale Microservice Monitoring
Abstract: Micro‑services increase system complexity, demanding observability solutions that can scale to billions of requests without prohibitive cost. Baidu shares the design of the Fengjing monitoring platform deployed across advertising and content services, emphasizing low‑cost data collection, cheap compute/storage, and minimal operational overhead, while still delivering comprehensive insights.
Monitoring Demands in High‑Volume Multi‑Business Scenarios
Differences between business‑centric and traditional monitoring
Complexities of inter‑linked subsystems
Cost‑Driven Technical Considerations
Balancing cost versus monitoring capability
Unconventional Techniques ("Black Tech") for Extreme Cost Optimization
Non‑intrusive probe technology
Low‑cost data analysis and topology computation
Weave‑in circuit‑breaker and rate‑limiting methods
Holistic Monitoring Governance Outlook
Integrated monitoring governance vision
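The outline mentions rate limiting as one way a monitoring probe keeps its own overhead in check. A token bucket is a standard way to do this; the sketch below is a generic illustration under that assumption, not the Fengjing implementation. The class name TokenBucket and its parameters are hypothetical, and the clock is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Generic token-bucket limiter (illustrative, not Fengjing's code)."""

    def __init__(self, rate_per_sec: float, capacity: float, now: float = 0.0):
        self.rate = rate_per_sec   # tokens added per second
        self.capacity = capacity   # maximum burst size
        self.tokens = capacity     # start with a full bucket
        self.last = now            # timestamp of the last refill

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Return True if an event of the given cost may be reported at time now."""
        # Refill for the elapsed time; a clock that moves backwards adds nothing.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate_per_sec=2.0, capacity=2.0)
    for t in (0.0, 0.0, 0.0, 0.5, 0.5):
        print(t, bucket.allow(t))
```

A probe would call allow() before emitting each span or metric sample and drop or down-sample the event when it returns False, so collection cost stays bounded even during traffic spikes.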