How Baidu Scaled Its Vertical Search: Elastic Scheduling and Data Management Secrets
This article explains how Baidu's vertical search platform tackled massive data growth and scaling challenges by redesigning its data management system, introducing elastic scheduling, decoupling ETCD access, implementing auto‑scaling, and advancing shard expansion to improve performance, stability, and cost efficiency.
Background
Baidu's vertical search system powers specialized results, vertical domain search, and in‑app search, handling hundreds of retrieval scenarios and billions of documents. As business count and data volume grew, the system faced new challenges in massive data management and scheduling.
Current Architecture Issues
The recall engine uses a heterogeneous deployment model, offering better capacity auto‑adjustment and on‑demand storage than homogeneous setups, but supporting over 80 businesses with hundreds of services and thousands of indexes has exposed several problems:
Metadata management bottleneck: direct ETCD connections cause severe read/write amplification and become a single‑point bottleneck.
Manual resource estimation: onboarding new services or handling large events relies on human estimates, leading to over‑provisioning or overload.
Data‑volume bottleneck: shard count can only increase multiplicatively, limiting growth for large indexes.
Search System Architecture Overview
The system consists of three main modules:
RANK: query understanding, request construction, multi‑queue splitting, forward‑index retrieval, scoring, and result assembly.
BS: the basic recall engine, using term inverted lists and ANN vectors.
BUILD: data processing, tokenization, and generation of forward, inverted, vector, and summary indexes.
Each vertical business has an independent set of these services, while the data management system provides instance scheduling, capacity management, service discovery, heartbeat handling, and routing control for the recall engine.
Dynamic Data Management System
The system includes a central control service, heartbeat service, naming service, and ETCD storage. Key functions:
Resource onboarding/offboarding and replica management.
Replica health‑check and automatic scaling.
Capacity management based on load, adjusting replica counts and shard numbers.
Availability control ensuring safe instance restarts.
Naming Service (NS) Design
NS acts as a read‑through cache for ETCD, providing topology information to RANK without direct ETCD access. It is stateless, guarantees eventual consistency, and returns MD5 hashes and timestamps to let RANK decide when to refresh.
Three NS instances can serve all business topology queries, reducing ETCD read traffic by 90%.
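The read-through behavior described above can be sketched in a few lines. The following is a minimal, illustrative Python model (not Baidu's implementation): a dict stands in for ETCD, and the client-supplied MD5 plays the role described in the article, letting RANK skip a full topology transfer when nothing has changed. All class and field names here are assumptions for illustration.

```python
import hashlib
import json
import time


class NamingService:
    """Illustrative read-through cache in front of a backing store (a dict stands in for ETCD)."""

    def __init__(self, store):
        self.store = store   # business name -> topology (ETCD stand-in)
        self.cache = {}      # business name -> (topology, md5, timestamp)

    def _load(self, business):
        # Cache miss: read through to the backing store and compute a digest.
        topo = self.store[business]
        blob = json.dumps(topo, sort_keys=True).encode()
        digest = hashlib.md5(blob).hexdigest()
        entry = (topo, digest, time.time())
        self.cache[business] = entry
        return entry

    def get_topology(self, business, client_md5=None):
        entry = self.cache.get(business) or self._load(business)
        topo, digest, ts = entry
        if client_md5 == digest:
            # Client copy is still current; no need to ship the topology again.
            return {"changed": False, "md5": digest, "ts": ts}
        return {"changed": True, "md5": digest, "ts": ts, "topology": topo}

    def invalidate(self, business):
        # Drop the cached entry; the next read falls through to the store,
        # which is how eventual consistency is reached after a topology change.
        self.cache.pop(business, None)
```

Because NS is stateless apart from this cache, any of the three instances can answer any query, and RANK only refreshes when the returned MD5 differs from its own.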
Heartbeat Service (HS) Design
HS aggregates heartbeat data from BS instances and writes it to ETCD, also returning the latest consumption shard information to BS. It uses a leaderless design with consistent hashing, ensuring each shard is written once per cycle and reducing write contention by 80%.
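A consistent-hash ring is what makes the leaderless design work: each HS instance independently computes which shards it owns, so every shard has exactly one writer per cycle without any coordination. Below is a minimal sketch of such a ring, with assumed instance names and virtual-node count; the actual HS implementation is not described in this detail in the source.

```python
import bisect
import hashlib


def _hash(key):
    # Stable hash so every HS instance computes identical ownership.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class ConsistentHashRing:
    def __init__(self, nodes, vnodes=64):
        # Virtual nodes smooth out the distribution of shards across instances.
        self.ring = sorted((_hash(f"{n}#{v}"), n) for n in nodes for v in range(vnodes))
        self.keys = [h for h, _ in self.ring]

    def owner(self, shard_key):
        # The owner is the first virtual node clockwise from the shard's hash.
        i = bisect.bisect(self.keys, _hash(shard_key)) % len(self.ring)
        return self.ring[i][1]
```

A useful property for this use case: when an HS instance leaves the ring, only the shards it owned move to other instances; every other shard keeps its writer, so ETCD write patterns stay stable through membership changes.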
Automatic Scaling
To achieve capacity‑adaptive adjustments, an auto‑scaling service periodically calculates load per resource (CPU, QPS, latency) and triggers the control service to adjust replica counts or PaaS instance numbers. Expansion prefers idle instances; shrinking reclaims idle replicas before triggering PaaS down‑scaling.
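The decision flow described above, classifying load and then preferring idle instances over new PaaS capacity, can be sketched as follows. The thresholds and function names here are illustrative assumptions, not the production values:

```python
from enum import Enum


class LoadStatus(Enum):
    OK = 0
    OVERLOAD = 1
    IDLE = 2


def classify(cpu, qps_ratio, latency_ms, *, cpu_hi=0.75, cpu_lo=0.25, lat_hi=200):
    # Illustrative thresholds; the real service evaluates load per resource dimension.
    if cpu > cpu_hi or qps_ratio > 1.0 or latency_ms > lat_hi:
        return LoadStatus.OVERLOAD
    if cpu < cpu_lo and qps_ratio < 0.3:
        return LoadStatus.IDLE
    return LoadStatus.OK


def decide(status, idle_instances, replicas, min_replicas=2):
    if status is LoadStatus.OVERLOAD:
        # Expansion prefers reusing an idle instance before asking PaaS for capacity.
        return "ADD_REPLICA" if idle_instances > 0 else "TRIGGER_PAAS_EXPANSION"
    if status is LoadStatus.IDLE and replicas > min_replicas:
        # Shrinking reclaims idle replicas first; PaaS down-scaling comes later.
        return "REMOVE_REPLICA"
    return "NOOP"
```

The action strings mirror the states in the `LoadStatus` enum below.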
<code>enum LoadStatus {
  LOAD_STATUS_LOAD_OK = 0;                // normal load
  LOAD_STATUS_OVERLOAD = 1;               // overloaded
  LOAD_STATUS_IDLELOAD = 2;               // low load
  LOAD_STATUS_BS_ADD_REPLICA = 3;         // adding a replica
  LOAD_STATUS_BS_REMOVE_REPLICA = 4;      // removing a replica
  LOAD_STATUS_TRIGGER_PAAS_EXPANSION = 5; // PaaS scaling up
  LOAD_STATUS_TRIGGER_PAAS_SHRINK = 6;    // PaaS scaling down
}</code>

Advanced Shard Expansion
When data volume grows, the shard count must increase. The original scheme doubled the shard count while halving the slots per shard, which meant shard counts could only grow in powers of two and were further constrained by slot allocation. The new approach rebuilds shards across a new partition range: data is copied from the old shards into the new ones, then traffic is switched over via service discovery, enabling seamless migration to an arbitrary shard count without data loss.
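The slot-based mechanics can be made concrete with a small sketch. Assuming a fixed slot space partitioned into contiguous ranges per shard (the slot count and hashing scheme below are illustrative, not Baidu's actual values), rebuilding into an arbitrary new shard count is just recomputing each document's shard under the new layout:

```python
import hashlib

TOTAL_SLOTS = 1024  # illustrative fixed slot space


def slot_of(doc_id):
    # Stable document-to-slot mapping, independent of shard count.
    return int(hashlib.md5(doc_id.encode()).hexdigest(), 16) % TOTAL_SLOTS


def shard_of(slot, n_shards):
    # Contiguous slot ranges per shard; works for any shard count,
    # unlike the old doubling scheme, which only allowed powers of two.
    return slot * n_shards // TOTAL_SLOTS


def resharded(docs_by_old_shard, new_n):
    """Rebuild shards in a new partition layout by copying each document
    into the shard it maps to under the new shard count."""
    new_shards = {s: [] for s in range(new_n)}
    for docs in docs_by_old_shard.values():
        for doc in docs:
            new_shards[shard_of(slot_of(doc), new_n)].append(doc)
    return new_shards
```

Because the old shards stay online while the new partition range is filled, traffic can be flipped to the new layout through service discovery once the copy completes.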
Summary and Outlook
The optimizations reduced ETCD load by an order of magnitude, increased cluster scale, lifted shard replica limits, and improved CPU utilization by over 15% while cutting manual interventions by 80%. Future work includes automatic shard‑count scaling based on data volume and further cost‑effective storage‑performance improvements for large‑scale indexes.
Baidu Tech Salon
Baidu Tech Salon, organized by Baidu's Technology Management Department, is a monthly offline event that shares cutting‑edge tech trends from Baidu and the industry, providing a free platform for mid‑to‑senior engineers to exchange ideas.