Designing Scalable Advertising Systems: Architecture, Challenges, and Solutions

This article explains the fundamentals of the online advertising business, outlines its main technical challenges (high concurrency, complex logic, and massive data volumes), and then details a practical architecture covering data storage, indexing, retrieval, billing, and reporting, with actionable guidance for building robust ad platforms.

dbaplus Community

Advertising Business Core

Online advertising generates revenue by charging advertisers for delivering ads to users. The ecosystem involves three parties: advertisers (who focus on ROI), platforms (which maximize revenue), and users (who expect non‑intrusive, relevant ads). A typical CPC revenue model can be expressed as:

Revenue = PV × PVR × ASN × CTR × ACP

where PV is page views, PVR is the share of page views that display ads, ASN is the average number of ads shown on each such page view, CTR is click‑through rate, and ACP is average cost per click. PVR and ASN together describe how much of the traffic is filled with ads.
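As a quick sanity check of the formula, the sketch below multiplies the five factors step by step. It assumes PVR is the share of page views that show ads and ASN is the average number of ads per such page view; the function name and the input numbers are illustrative and do not come from any real platform.

```python
def cpc_revenue(pv: int, pvr: float, asn: float, ctr: float, acp: float) -> float:
    """Revenue = PV x PVR x ASN x CTR x ACP (CPC model)."""
    impressions = pv * pvr * asn   # ad impressions actually shown
    clicks = impressions * ctr     # clicks that will be billed
    return clicks * acp            # revenue, in the currency unit of ACP


# Illustrative inputs only: 1M page views, 80% of them show ads,
# 2 ads per ad-bearing page view, 1% CTR, 0.5 average cost per click.
revenue = cpc_revenue(pv=1_000_000, pvr=0.8, asn=2.0, ctr=0.01, acp=0.5)
print(round(revenue, 2))
```

With these inputs the platform serves 1.6 million impressions, collects 16,000 clicks, and earns 8,000 currency units. Because the factors multiply, a relative gain in any one of them lifts revenue by the same proportion.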

Technical Challenges

High concurrency: Tens of thousands of QPS with latency requirements of a few dozen milliseconds.

Complex business logic: Each request triggers multi‑stage recall, model scoring, and auction ranking.

High availability: Core services must achieve at least 99.9% uptime.

Big‑data volume: Daily transaction records can reach billions, requiring scalable OLTP and OLAP stores.

Accurate billing: Real‑time charge deduction must avoid loss or duplication.

System Architecture

The architecture targets an early‑stage self‑operated bidding network and consists of the following subsystems:

Ad Delivery System: Manages advertiser accounts, ad inventory, targeting conditions, bidding, and performance monitoring.

Ad Operations Backend: Provides ad‑slot management, strategy configuration, and operational tools for platform operators.

Ad Retrieval Platform: Handles high‑throughput C‑side requests, performs multi‑level candidate recall, model scoring, and auction ranking within tens of milliseconds.

AB Testing Platform: Enables controlled experiments for any strategy change.

Billing Platform: Executes real‑time charge deduction with high availability.

Accounting Center: Centralizes financial operations such as recharge, freeze, and deduction.

Big Data Platform: Aggregates heterogeneous data sources for offline and real‑time analytics, feature generation, and reporting.
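The article does not say how the AB Testing Platform splits traffic. One common approach is deterministic hash bucketing, sketched below under that assumption; the function name, the experiment key, and the 50/50 default are all hypothetical.

```python
import hashlib


def assign_bucket(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically map a user to 'treatment' or 'control' for one experiment."""
    # Salting with the experiment name keeps assignments independent across experiments.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Turn the first 8 bytes of the hash into a point in [0, 1).
    point = int.from_bytes(digest[:8], "big") / 2**64
    return "treatment" if point < treatment_share else "control"


# The same user always lands in the same bucket for a given experiment.
print(assign_bucket("user-123", "new-ranking-model"))
```

Because the assignment depends only on the user ID and the experiment name, every service instance reaches the same decision without shared state.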

Data Storage Strategies

A multimodal storage approach is used:

OLTP: Core tables (ad, creative, member, product, strategy) reside in MySQL with sharding by advertiser ID.

OLAP: Massive reporting tables are stored in HDFS/HBase to support billions of rows.

Indexing: Real‑time forward and inverted indexes are kept in Redis and Elasticsearch.
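Sharding by advertiser ID keeps all rows for one advertiser in the same shard, so typical account-level queries touch a single table. The article does not specify the routing rule; the sketch below assumes simple modulo routing, and the shard count and table naming scheme are hypothetical.

```python
SHARD_COUNT = 16  # hypothetical number of shards


def shard_for(advertiser_id: int, shard_count: int = SHARD_COUNT) -> int:
    """Pick a shard index from the advertiser ID (assumed modulo routing)."""
    return advertiser_id % shard_count


def table_for(base: str, advertiser_id: int) -> str:
    """Build a physical table name such as 'ad_03' (hypothetical naming scheme)."""
    return f"{base}_{shard_for(advertiser_id):02d}"


print(table_for("ad", 35))  # advertiser 35 maps to shard 35 % 16 = 3
```

Modulo routing is easy to reason about but makes resharding expensive, which is one reason a real deployment might prefer a lookup table or consistent hashing.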

Synchronization from MySQL to the retrieval index is driven by MQ‑based incremental updates. Only the changed ad ID is sent; the index service reads the latest record from MySQL to avoid out‑of‑order inconsistencies.
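The ID-only message design can be sketched as follows. A plain dictionary stands in for MySQL and another for the retrieval index; the class and field names are illustrative. Because the handler always re-reads the current row, it does not matter in which order, or how many times, messages for the same ad arrive.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class AdRecord:
    ad_id: int
    bid: float
    status: str  # "online" or "offline"


class IndexUpdater:
    """Consumes 'ad changed' messages that carry only the ad ID."""

    def __init__(self, source_db: Dict[int, AdRecord]):
        self.source_db = source_db            # stand-in for MySQL
        self.index: Dict[int, AdRecord] = {}  # stand-in for the retrieval index

    def on_message(self, ad_id: int) -> None:
        # Always read the latest row instead of trusting a payload in the message.
        record = self.source_db.get(ad_id)
        if record is None or record.status != "online":
            self.index.pop(ad_id, None)       # deleted or paused ads leave the index
        else:
            self.index[ad_id] = record


db = {1: AdRecord(1, 0.5, "online")}
updater = IndexUpdater(db)
db[1] = AdRecord(1, 0.8, "online")   # first update
db[1] = AdRecord(1, 1.2, "online")   # second update
updater.on_message(1)                # messages may arrive late or duplicated
updater.on_message(1)
print(updater.index[1].bid)
```

The trade-off is one extra database read per message in exchange for an index that converges to the latest state without relying on message ordering.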

Ad Retrieval Platform Workflow

The platform processes C‑side traffic as follows:

Receive request and perform multi‑level recall (Recall layer focuses on algorithmic models).

Score candidates with ranking models (Search layer adds business rules).

Run auction ranking to select the top‑N ads.
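The final ranking step can be illustrated with a small sketch. The article does not give the ranking formula; the code assumes the common CPC convention of ordering candidates by expected revenue per thousand impressions (eCPM = bid × predicted CTR × 1000). The `Candidate` type and its fields are hypothetical.

```python
import heapq
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    ad_id: int
    bid: float   # advertiser's CPC bid
    pctr: float  # predicted click-through rate from the scoring stage


def ecpm(c: Candidate) -> float:
    """Expected revenue per 1000 impressions (assumed ranking score)."""
    return c.bid * c.pctr * 1000


def rank_top_n(candidates: List[Candidate], n: int) -> List[Candidate]:
    """Return the n candidates with the highest eCPM, best first."""
    return heapq.nlargest(n, candidates, key=ecpm)


candidates = [
    Candidate(ad_id=1, bid=1.0, pctr=0.02),
    Candidate(ad_id=2, bid=2.0, pctr=0.005),
    Candidate(ad_id=3, bid=0.5, pctr=0.05),
]
winners = rank_top_n(candidates, 2)
print([c.ad_id for c in winners])
```

In this example ad 3 wins despite having the lowest bid, because its predicted CTR is high enough to give it the largest expected revenue. Ranking by eCPM rather than raw bid is what ties platform revenue to ad relevance.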

Performance optimizations include:

Service layering with horizontal scaling.

Redis caching at multiple tiers to offload DB reads.

Multithreaded parallelism for recall and scoring.

Local hot‑data caches for ad‑slot and strategy configs.

Timeout‑based circuit breaking for non‑critical paths.

Asynchronous execution of auxiliary tasks (e.g., caching click info).

Compact RPC payloads and lean Redis objects.

JVM GC tuning (heap size, collector choice, pause time reduction).
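Two of the optimizations above, parallel recall and timeout-based degradation, can be combined in one sketch. It is written in Python for brevity even though the GC note suggests the original services run on the JVM. Note that it only drops slow channels per request; a full circuit breaker would also track failure rates and stop calling an unhealthy channel. The channel names and the 200 ms budget are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable, Dict, List

# Long-lived pool shared across requests, so a slow channel never blocks the caller.
POOL = ThreadPoolExecutor(max_workers=8)


def parallel_recall(channels: Dict[str, Callable[[], List[int]]],
                    timeout_s: float,
                    pool: ThreadPoolExecutor = POOL) -> List[int]:
    """Run all recall channels in parallel and drop any that miss the deadline."""
    futures = {name: pool.submit(fn) for name, fn in channels.items()}
    deadline = time.monotonic() + timeout_s
    merged: List[int] = []
    for name, future in futures.items():
        remaining = max(0.0, deadline - time.monotonic())
        try:
            merged.extend(future.result(timeout=remaining))
        except FutureTimeout:
            pass  # degrade: serve the request without this channel
    return list(dict.fromkeys(merged))  # de-duplicate while keeping order


def fast_channel() -> List[int]:
    return [101, 102]


def slow_channel() -> List[int]:
    time.sleep(0.6)  # simulates a channel that exceeds the latency budget
    return [103]


ads = parallel_recall({"keyword": fast_channel, "lookalike": slow_channel}, timeout_s=0.2)
print(ads)
```

The request returns the results of the fast channel only, which trades some recall quality for a bounded response time.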

Billing Platform Design (CPC Example)

The real‑time CPC billing pipeline consists of:

Cache click metadata in Redis.

Publish an MQ message for asynchronous processing.

Persist the charge to sharded MySQL tables.

Extract daily aggregates to Hive for eventual consistency and reconciliation.

Fallback mechanisms:

If Redis is unavailable, switch to TiKV for durable persistence.

If MQ delivery fails, process the charge via a thread‑pool worker.
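The requirement to avoid both lost and duplicated charges can be sketched with an idempotent deduction keyed by click ID, plus the thread-pool fallback described above. In-memory structures stand in for Redis, MQ, and MySQL; all class, function, and parameter names are hypothetical. Amounts are kept in integer cents to avoid floating-point rounding.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, Optional, Set


class BillingService:
    """Idempotent charge deduction; a set of click IDs stands in for a unique DB key."""

    def __init__(self, balances_cents: Dict[int, int]):
        self._balances = dict(balances_cents)
        self._charged: Set[str] = set()
        self._lock = threading.Lock()

    def charge(self, click_id: str, advertiser_id: int, price_cents: int) -> bool:
        with self._lock:
            if click_id in self._charged:
                return False  # duplicate delivery: never charge twice
            if self._balances.get(advertiser_id, 0) < price_cents:
                return False  # insufficient balance
            self._balances[advertiser_id] -= price_cents
            self._charged.add(click_id)
            return True

    def balance(self, advertiser_id: int) -> int:
        with self._lock:
            return self._balances.get(advertiser_id, 0)


def submit_click(billing: BillingService,
                 publish: Callable[[str, int, int], None],
                 pool: ThreadPoolExecutor,
                 click_id: str, advertiser_id: int, price_cents: int) -> Optional[Future]:
    """Publish to MQ; if that fails, charge on a thread-pool worker instead."""
    try:
        publish(click_id, advertiser_id, price_cents)
        return None
    except ConnectionError:
        return pool.submit(billing.charge, click_id, advertiser_id, price_cents)


def broken_publish(click_id: str, advertiser_id: int, price_cents: int) -> None:
    raise ConnectionError("MQ unavailable")  # simulated outage


pool = ThreadPoolExecutor(max_workers=2)
billing = BillingService({42: 1000})
first = submit_click(billing, broken_publish, pool, "click-1", 42, 30)
retry = submit_click(billing, broken_publish, pool, "click-1", 42, 30)
outcomes = sorted([first.result(), retry.result()])
pool.shutdown(wait=True)
print(outcomes, billing.balance(42))
```

Even though the same click is submitted twice during the simulated MQ outage, only one of the two attempts succeeds, so the advertiser is charged exactly once.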

OLAP Reporting Pipeline

The data warehouse follows a layered architecture:

Source Layer: Raw logs from HDFS and incremental extracts from MySQL.

Warehouse Layer: Cleaned dimension and fact tables (wide tables for behavior logs, promotion data, user profiles).

Data Mart Layer: Aggregated tables for ad performance, user behavior, cohort analysis.

Application Layer: End‑user reports, Spark‑generated features, and model inputs.

Implementation details:

Offline analytical queries are served by Apache Kylin on HBase.

Real‑time streams are processed with Flink and Spark Streaming, and the results are stored in Druid to support high‑cardinality dimensions and billion‑row tables.

Key Technical Diagrams

Advertising business triangle
Pricing model evolution
Core ad workflow
Multimodal storage diagram
Index update flow
Retrieval platform flow
CPC real‑time billing flow
Data warehouse layered structure

Conclusion

The presented architecture demonstrates how to balance advertiser ROI, platform revenue, and user experience through scalable backend services, multimodal storage, and robust real‑time billing. It is suitable for an early‑stage self‑operated bidding network; future extensions to RTB or alliance advertising will require additional layers of complexity.
