Designing a High‑Availability Advertising System: Architecture, Scaling, and Real‑Time Monitoring at Weibo
This article examines the architecture of Weibo's high‑availability advertising platform, covering match service design with OpenResty, index sharding, business logic optimization, dynamic auto‑scaling, and a real‑time monitoring pipeline to ensure stable, high‑performance ad delivery at massive scale.
Internet advertising is a primary revenue source for platforms like Google and Facebook, and with the rise of mobile, feed‑based ads have grown rapidly due to their scalability and native integration with user timelines.
Key characteristics of feed ads include strong inventory growth tied to user engagement and native relevance that improves ad effectiveness while preserving user experience.
Ad system performance is measured by traffic intake, recall rate, and eCPM; achieving high request volume, exposure, and efficiency requires a stable and reliable architecture.
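For example, a system handling 1 billion daily requests with a 60% recall rate and an eCPM of 20 CNY earns about 12 million CNY per day, so it loses roughly 108,000 CNY per day when the request success rate drops from 99.9% to 99% (that is, when the timeout rate rises from 0.1% to 1%).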
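The arithmetic behind that figure can be checked with a short sketch. It assumes that every recalled ad counts as one impression, that a timed-out request produces no impression, and that the quoted percentages describe the request success rate falling from 99.9% to 99%. The function names are illustrative only.

```python
def daily_revenue_cny(requests: int, recall_rate: float,
                      ecpm_cny: float, success_rate: float) -> float:
    """Estimated daily revenue in CNY.

    Assumption: impressions = requests * recall_rate * success_rate,
    and eCPM is the revenue per 1,000 impressions.
    """
    impressions = requests * recall_rate * success_rate
    return impressions / 1000 * ecpm_cny


def daily_loss_cny(requests: int, recall_rate: float, ecpm_cny: float,
                   success_before: float, success_after: float) -> float:
    """Revenue lost per day when the success rate drops."""
    before = daily_revenue_cny(requests, recall_rate, ecpm_cny, success_before)
    after = daily_revenue_cny(requests, recall_rate, ecpm_cny, success_after)
    return before - after


# Figures from the example: 1 billion requests, 60% recall, 20 CNY eCPM.
baseline = daily_revenue_cny(1_000_000_000, 0.60, 20.0, 1.0)
loss = daily_loss_cny(1_000_000_000, 0.60, 20.0, 0.999, 0.99)
print(f"baseline: {baseline:,.0f} CNY/day, loss: {loss:,.0f} CNY/day")
```

Under these assumptions the baseline is about 12 million CNY per day, and a 0.9 percentage point drop in success rate accounts for the roughly 108,000 CNY quoted.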
1. Match Service Architecture Selection
The match service, the control center of the ad system, fetches user profiles from a DMP, retrieves candidate ads from indexes, and invokes bidding services. It must handle extremely high concurrency, so Weibo uses OpenResty (Nginx + Lua) to achieve lightweight, high‑performance, parallel processing with coroutine‑based I/O.
Figure 1: System Service Structure
The match service sits in the traffic‑ingress layer, requires parallel handling of many external dependencies, and must adapt quickly to frequent business changes.
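The parallel handling of dependencies can be illustrated with a minimal sketch. It uses Python's asyncio as a stand-in for OpenResty's coroutine-based I/O and is not Weibo's Lua implementation. The three dependency functions, their latencies, and the 0.2 second timeout are hypothetical.

```python
import asyncio


# Hypothetical stand-ins for external dependencies; real calls would be network I/O.
async def fetch_profile(user_id: str) -> dict:
    await asyncio.sleep(0.01)
    return {"user_id": user_id, "tags": ["sports"]}


async def fetch_social_graph(user_id: str) -> list:
    await asyncio.sleep(0.01)
    return ["friend-1", "friend-2"]


async def fetch_click_behavior(user_id: str) -> list:
    await asyncio.sleep(1.0)  # deliberately slow to trigger the timeout path
    return ["ad-9"]


async def call_with_timeout(coro, timeout_s: float, default):
    """Await one dependency, falling back to a default if it is too slow."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        return default


async def handle_request(user_id: str, timeout_s: float = 0.2) -> dict:
    # All three calls run concurrently; total latency is bounded by timeout_s
    # rather than by the sum of the individual latencies.
    profile, friends, clicks = await asyncio.gather(
        call_with_timeout(fetch_profile(user_id), timeout_s, {}),
        call_with_timeout(fetch_social_graph(user_id), timeout_s, []),
        call_with_timeout(fetch_click_behavior(user_id), timeout_s, []),
    )
    return {"profile": profile, "friends": friends, "clicks": clicks}


result = asyncio.run(handle_request("u42"))
print(result)
```

The slow dependency degrades to its default value instead of stalling the whole request, which is one common way to keep an ingress-layer service responsive when a downstream system misbehaves.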
2. Index Service Sharding
Index services store ad plans and filter them based on targeting criteria using bitmap, inverted, or relational indexes. To keep latency low as plan counts grow, Weibo shards the index across multiple nodes, allowing parallel queries and merging of results, dramatically reducing per‑node workload.
Figure 2: Index Filtering Funnel
Figure 3: Index Sharding Deployment
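A minimal sketch of the shard, query in parallel, and merge pattern follows. It assumes a simple inverted index keyed by (attribute, value) pairs and modulo sharding on the plan ID. The class names, the sharding rule, and the targeting fields are illustrative and not taken from Weibo's index service.

```python
from concurrent.futures import ThreadPoolExecutor


class IndexShard:
    """One shard: an inverted index from (attribute, value) to plan IDs."""

    def __init__(self):
        self.postings = {}        # (attr, value) -> set of plan IDs
        self.criteria_count = {}  # plan ID -> number of targeting criteria
        self.untargeted = set()   # plans with no criteria match every user

    def add_plan(self, plan_id: int, targeting: dict) -> None:
        if not targeting:
            self.untargeted.add(plan_id)
            return
        self.criteria_count[plan_id] = len(targeting)
        for item in targeting.items():
            self.postings.setdefault(item, set()).add(plan_id)

    def query(self, user: dict) -> list:
        # Count how many of each plan's criteria the user satisfies.
        hits = {}
        for item in user.items():
            for plan_id in self.postings.get(item, ()):
                hits[plan_id] = hits.get(plan_id, 0) + 1
        matched = [p for p, n in hits.items() if n == self.criteria_count[p]]
        return matched + list(self.untargeted)


class ShardedIndex:
    """Scatter a query to all shards in parallel, then merge the results."""

    def __init__(self, num_shards: int):
        self.shards = [IndexShard() for _ in range(num_shards)]

    def add_plan(self, plan_id: int, targeting: dict) -> None:
        # Assumed sharding rule: plan ID modulo shard count.
        self.shards[plan_id % len(self.shards)].add_plan(plan_id, targeting)

    def query(self, user: dict) -> list:
        with ThreadPoolExecutor(max_workers=len(self.shards)) as pool:
            partials = list(pool.map(lambda shard: shard.query(user), self.shards))
        return sorted(pid for part in partials for pid in part)


index = ShardedIndex(num_shards=3)
index.add_plan(1, {"gender": "f", "city": "beijing"})
index.add_plan(2, {"gender": "m"})
index.add_plan(3, {"city": "beijing"})
index.add_plan(4, {})
matched = index.query({"gender": "f", "city": "beijing", "age": "25-34"})
print(matched)
```

Each shard only scans the postings for its own slice of plans, so per-node work shrinks as shards are added, at the cost of one extra merge step in the caller.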
3. Business Logic Optimization
By centralizing user‑profile, social‑graph, and click‑behavior data in a DMP, the match service can fetch this data in one place and distribute it to downstream modules (e.g., index and relationship services), so those modules do not each have to query it separately. This reduces overall system overhead and supports rapid feature iteration.
Figure 4: Unified Data Distribution
4. Dynamic Auto‑Scaling
During traffic spikes such as major sports events, the ad system experiences sudden TPS surges. Weibo packages stateless services (match, index, bidding) into Docker images and leverages cloud VM auto‑scaling to expand or shrink capacity in response to real‑time demand, maintaining service stability without manual throttling.
Figure 5: Traffic Spike During Korea‑China Match
Figure 6: Dynamic Scaling Topology
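The article does not describe how scaling decisions are made. As one possible illustration, the sketch below sizes a stateless service from observed TPS. The per-instance capacity, the 30% headroom, and the instance bounds are all hypothetical parameters.

```python
import math


def desired_instances(current_tps: float, tps_per_instance: float,
                      headroom: float = 0.3,
                      min_instances: int = 2,
                      max_instances: int = 200) -> int:
    """Instances needed to serve current_tps with spare headroom.

    All parameters are assumptions for illustration; real values would
    come from load tests and the cloud provider's quota.
    """
    needed = math.ceil(current_tps * (1 + headroom) / tps_per_instance)
    return max(min_instances, min(max_instances, needed))


def scaling_delta(running: int, current_tps: float,
                  tps_per_instance: float) -> int:
    """Positive: containers to start. Negative: containers to stop."""
    return desired_instances(current_tps, tps_per_instance) - running


normal = desired_instances(12_000, 1_000)   # ordinary traffic
spike = desired_instances(48_000, 1_000)    # traffic during a major event
print(normal, spike, scaling_delta(normal, 48_000, 1_000))
```

Because the match, index, and bidding services are stateless, the delta can be applied by simply starting or stopping containers built from the same Docker image.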
5. Real‑Time Monitoring Platform
Weibo collects logs with Flume, streams them to Kafka, and processes them with an ELK stack (Elasticsearch, Logstash, Kibana; Grafana was adopted later) to provide live metrics such as request volume, exposure, timeout rate, and recall rate. The platform supports heterogeneous log formats via configurable Logstash pipelines, enabling rapid alerting and capacity planning.
Figure 7: Real‑Time Monitoring Platform
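The metrics named above can be derived from parsed log records with a simple per-minute aggregation. The sketch below assumes a hypothetical record schema (`ts`, `timed_out`, `ads_returned`, `exposures`) and defines recall rate as the share of requests that return at least one ad. It stands in for the aggregation step only, not for the Flume, Kafka, or Logstash components.

```python
from collections import defaultdict


def aggregate_per_minute(records):
    """Group parsed log records into one-minute buckets and compute rates.

    Assumed record fields (hypothetical schema):
      ts            epoch seconds of the request
      timed_out     True if the request timed out
      ads_returned  number of ads recalled for the request
      exposures     number of ads actually shown
    """
    buckets = defaultdict(lambda: {"requests": 0, "timeouts": 0,
                                   "recalled": 0, "exposures": 0})
    for rec in records:
        bucket = buckets[rec["ts"] // 60 * 60]  # start of the minute
        bucket["requests"] += 1
        bucket["timeouts"] += 1 if rec["timed_out"] else 0
        bucket["recalled"] += 1 if rec["ads_returned"] > 0 else 0
        bucket["exposures"] += rec["exposures"]

    metrics = {}
    for minute, b in sorted(buckets.items()):
        metrics[minute] = {
            **b,
            "timeout_rate": b["timeouts"] / b["requests"],
            "recall_rate": b["recalled"] / b["requests"],
        }
    return metrics


records = [
    {"ts": 120, "timed_out": False, "ads_returned": 2, "exposures": 1},
    {"ts": 130, "timed_out": True, "ads_returned": 0, "exposures": 0},
    {"ts": 150, "timed_out": False, "ads_returned": 1, "exposures": 1},
    {"ts": 179, "timed_out": False, "ads_returned": 0, "exposures": 0},
    {"ts": 185, "timed_out": False, "ads_returned": 3, "exposures": 1},
]
metrics = aggregate_per_minute(records)
for minute, m in metrics.items():
    print(minute, m)
```

An alert rule can then compare `timeout_rate` for the latest bucket against a threshold, and the request counts per minute feed directly into capacity planning.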
By improving system performance, optimizing business logic, and building robust infrastructure, the advertising platform achieves both high throughput and rapid iteration capability, supporting Weibo’s continued commercial growth.
High Availability Architecture
Official account for High Availability Architecture.
