Designing a High‑Availability Advertising System: Architecture, Scaling, and Real‑Time Monitoring at Weibo

This article examines the architecture of Weibo's high‑availability advertising platform, covering match service design with OpenResty, index sharding, business logic optimization, dynamic auto‑scaling, and a real‑time monitoring pipeline to ensure stable, high‑performance ad delivery at massive scale.

High Availability Architecture

Apr 13, 2017

Internet advertising is a primary revenue source for platforms like Google and Facebook, and with the rise of mobile, feed‑based ads have grown rapidly due to their scalability and native integration with user timelines.

Key characteristics of feed ads include strong inventory growth tied to user engagement and native relevance that improves ad effectiveness while preserving user experience.

Ad system performance is measured by traffic intake, recall rate, and eCPM; achieving high request volume, exposure, and efficiency requires a stable and reliable architecture.

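For example, consider a system handling 1 billion daily requests with a 60% recall rate and an eCPM of 20 CNY. It earns about 12 million CNY per day (600 million exposures ÷ 1,000 × 20 CNY), so if availability drops from 99.9% to 99% (that is, the timeout rate rises from 0.1% to 1%), it loses roughly 108,000 CNY per day.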
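The arithmetic behind that figure can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, and it assumes that every timed-out request yields no exposure and that revenue scales linearly with exposures.

```python
def daily_revenue_cny(requests: int, recall_rate: float, ecpm_cny: float) -> float:
    """Revenue = exposures / 1000 * eCPM, where exposures = requests * recall rate."""
    exposures = requests * recall_rate
    return exposures / 1000 * ecpm_cny


def daily_loss_cny(requests: int, recall_rate: float, ecpm_cny: float,
                   timeout_before: float, timeout_after: float) -> float:
    """Revenue lost per day when the timeout rate rises.

    Assumption: a timed-out request produces no exposure at all.
    """
    revenue = daily_revenue_cny(requests, recall_rate, ecpm_cny)
    return revenue * (timeout_after - timeout_before)


# 1 billion requests, 60% recall, 20 CNY eCPM, timeout rate 0.1% -> 1%
loss = daily_loss_cny(1_000_000_000, 0.60, 20.0, 0.001, 0.01)
print(round(loss))  # 108000
```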

1. Match Service Architecture Selection

The match service, the control center of the ad system, fetches user profiles from a DMP, retrieves candidate ads from indexes, and invokes bidding services. It must handle extremely high concurrency, so Weibo uses OpenResty (Nginx + Lua) to achieve lightweight, high‑performance, parallel processing with coroutine‑based I/O.

Figure 1: System Service Structure

The match service sits in the traffic‑ingress layer, requires parallel handling of many external dependencies, and must adapt quickly to frequent business changes.
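The parallel handling of external dependencies can be sketched as follows. Weibo's match service runs on OpenResty with Lua; this sketch uses Python's asyncio purely for illustration, and the three fetch functions, their simulated latencies, and the 100 ms budget are hypothetical stand-ins rather than details from the article.

```python
import asyncio


async def fetch_user_profile(user_id: str) -> dict:
    # Stand-in for a DMP lookup; a real call would be network I/O.
    await asyncio.sleep(0.01)
    return {"user_id": user_id, "interests": ["sports"]}


async def fetch_candidates(user_id: str) -> list:
    # Stand-in for an index query.
    await asyncio.sleep(0.02)
    return ["ad-1", "ad-2"]


async def fetch_slow_dependency(user_id: str) -> dict:
    # Simulates a dependency that exceeds its time budget.
    await asyncio.sleep(1.0)
    return {"late": True}


async def with_budget(coro, timeout_s: float, default):
    """Await coro, returning default if it misses its time budget."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        return default


async def handle_request(user_id: str) -> dict:
    # All three lookups run concurrently on one thread; total latency is
    # bounded by the slowest budget, not by the sum of the calls.
    profile, candidates, extra = await asyncio.gather(
        with_budget(fetch_user_profile(user_id), 0.1, {}),
        with_budget(fetch_candidates(user_id), 0.1, []),
        with_budget(fetch_slow_dependency(user_id), 0.1, {}),
    )
    return {"profile": profile, "candidates": candidates, "extra": extra}


result = asyncio.run(handle_request("u42"))
print(result)
```

Giving each dependency its own time budget and a default value lets one slow backend degrade the response instead of timing out the whole request.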

2. Index Service Sharding

Index services store ad plans and filter them based on targeting criteria using bitmap, inverted, or relational indexes. To keep latency low as plan counts grow, Weibo shards the index across multiple nodes, allowing parallel queries and merging of results, dramatically reducing per‑node workload.
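A minimal sketch of the shard, query in parallel, and merge pattern is shown below. It is not Weibo's implementation: the two targeting fields, the wildcard convention, and sharding by plan ID modulo the shard count are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

ANY = "*"                     # wildcard: the plan does not target this field
FIELDS = ("gender", "city")   # hypothetical targeting fields


class IndexShard:
    """One shard: an inverted index from (field, value) to ad plan IDs."""

    def __init__(self):
        self.postings = {}

    def add(self, plan_id, targeting):
        for field in FIELDS:
            value = targeting.get(field, ANY)
            self.postings.setdefault((field, value), set()).add(plan_id)

    def query(self, user):
        # A plan matches if, for every field, it targets the user's value
        # or does not target the field at all.
        matched = None
        for field in FIELDS:
            hits = (self.postings.get((field, user[field]), set())
                    | self.postings.get((field, ANY), set()))
            matched = hits if matched is None else matched & hits
        return matched or set()


class ShardedIndex:
    def __init__(self, num_shards):
        self.shards = [IndexShard() for _ in range(num_shards)]

    def add(self, plan_id, targeting):
        # Each plan lives on exactly one shard.
        self.shards[plan_id % len(self.shards)].add(plan_id, targeting)

    def query(self, user):
        # Scatter the query to all shards in parallel, then merge the results.
        with ThreadPoolExecutor(max_workers=len(self.shards)) as pool:
            partials = list(pool.map(lambda shard: shard.query(user), self.shards))
        return sorted(set().union(*partials))


index = ShardedIndex(num_shards=3)
index.add(1, {"gender": "f", "city": "beijing"})
index.add(2, {"gender": "f"})
index.add(3, {"city": "shanghai"})
index.add(4, {"gender": "m", "city": "beijing"})

matches = index.query({"gender": "f", "city": "beijing"})
print(matches)  # [1, 2]
```

Because each shard holds only a fraction of the plans, per-shard filtering work shrinks as shards are added, at the cost of one extra merge step.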

Figure 2: Index Filtering Funnel

Figure 3: Index Sharding Deployment

3. Business Logic Optimization

By centralizing user‑profile, social‑graph, and click‑behavior data in a DMP, the match service can fetch this data once per request and pass it to downstream modules (e.g., index and relationship services) instead of having each module query the data sources separately. This reduces overall system overhead and supports rapid feature iteration.

Figure 4: Unified Data Distribution

4. Dynamic Auto‑Scaling

During traffic spikes such as major sports events, the ad system experiences sudden TPS surges. Weibo packages stateless services (match, index, bidding) into Docker images and leverages cloud VM auto‑scaling to expand or shrink capacity in response to real‑time demand, maintaining service stability without manual throttling.
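The scaling decision itself can be reduced to a small capacity calculation. The sketch below is an assumption about how such a rule might look, not the policy Weibo uses; the per-instance capacity, target utilization, and instance limits are hypothetical parameters.

```python
import math


def desired_instances(current_tps: float,
                      tps_per_instance: float,
                      target_utilization: float = 0.7,
                      min_instances: int = 2,
                      max_instances: int = 200) -> int:
    """Number of instances needed to serve current_tps.

    Each instance is planned to run at target_utilization of its measured
    capacity, leaving headroom for spikes between scaling decisions.
    The result is clamped to [min_instances, max_instances].
    """
    needed = math.ceil(current_tps / (tps_per_instance * target_utilization))
    return max(min_instances, min(max_instances, needed))


print(desired_instances(36_000, 1_000))     # 52
print(desired_instances(500, 1_000))        # 2  (floor keeps redundancy)
print(desired_instances(1_000_000, 1_000))  # 200 (cap bounds cost)
```

A rule like this only works because the services are stateless: any new container can serve traffic as soon as it starts, and any container can be removed without losing data.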

Figure 5: Traffic Spike During Korea‑China Match

Figure 6: Dynamic Scaling Topology

5. Real‑Time Monitoring Platform

Weibo collects logs with Flume, streams them to Kafka, and processes them with an ELK stack (Elasticsearch, Logstash, Kibana), with Grafana adopted later, to provide live metrics such as request volume, exposure, timeout rate, and recall rate. The platform supports heterogeneous log formats via configurable Logstash pipelines, enabling rapid alerting and capacity planning.
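The core of such a pipeline, normalizing differently formatted log lines and then aggregating them into per-minute metrics, can be sketched in plain Python. The two log formats, the field names, and the definition of recall rate as the share of requests that returned at least one ad are assumptions for illustration; the production system does this with Logstash pipelines rather than code like this.

```python
import re
from collections import Counter

# Hypothetical log formats: one pattern per source, mirroring the idea of a
# separately configured pipeline for each format.
PATTERNS = {
    "match": re.compile(r"ts=(?P<ts>\d+) status=(?P<status>\w+) ads=(?P<ads>\d+)"),
    "legacy": re.compile(r"(?P<ts>\d+)\|(?P<status>\w+)\|(?P<ads>\d+)"),
}


def parse(source, line):
    """Normalize one raw line into a common event shape, or None if malformed."""
    m = PATTERNS[source].fullmatch(line.strip())
    if m is None:
        return None
    return {"minute": int(m["ts"]) // 60, "status": m["status"], "ads": int(m["ads"])}


def aggregate(events):
    """Roll events up into per-minute request volume, timeout rate, recall rate."""
    stats = {}
    for e in events:
        c = stats.setdefault(e["minute"], Counter())
        c["requests"] += 1
        c["timeouts"] += e["status"] == "timeout"
        c["recalled"] += e["ads"] > 0
    return {
        minute: {
            "requests": c["requests"],
            "timeout_rate": c["timeouts"] / c["requests"],
            "recall_rate": c["recalled"] / c["requests"],
        }
        for minute, c in stats.items()
    }


raw_lines = [
    ("match", "ts=120 status=ok ads=3"),
    ("match", "ts=130 status=timeout ads=0"),
    ("legacy", "150|ok|0"),
    ("legacy", "170|ok|2"),
    ("match", "garbage line"),  # malformed lines are dropped
]
events = [e for e in (parse(src, line) for src, line in raw_lines) if e is not None]
metrics = aggregate(events)
print(metrics)
```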

Figure 7: Real‑Time Monitoring Platform

By improving system performance, optimizing business logic, and building robust infrastructure, the advertising platform achieves both high throughput and rapid iteration capability, supporting Weibo’s continued commercial growth.
