Big Data 9 min read

How Twitter Scaled to Process 400 Billion Events Daily: Architecture Evolution

Twitter processes up to 400 billion events per day, moving from a Lambda‑style architecture with Scalding, Heron, and TSAR to a hybrid Twitter‑Data‑Center and Google Cloud pipeline that delivers sub‑10 ms latency, higher throughput, and lower operational cost while simplifying real‑time aggregation.

dbaplus Community
dbaplus Community
dbaplus Community
How Twitter Scaled to Process 400 Billion Events Daily: Architecture Evolution

Background

Twitter processes up to 400 billion events per day, generating petabytes of data from distributed databases, Kafka, and the internal event bus. Real‑time analytics on this volume require a highly scalable, low‑latency pipeline.

Legacy Architecture

The original system followed a Lambda architecture consisting of three layers:

Batch layer : Ingested client logs, ran Hadoop MapReduce jobs (via Scalding) and stored results in HDFS. Data were later imported into the Summingbird platform for offline analysis.

Speed layer : Consumed real‑time streams from Kafka topics using Apache Heron topologies. Results were cached in the Nighthawk in‑memory store.

Serving layer : Served both batch and speed data through TSAR (TimeSeriesAggregator) services and the TSAR query service, backed by the Manhattan distributed store.

Batch pipelines ran in a single data center and were replicated to two others; real‑time pipelines and query services were deployed across three data centers.

Challenges of the Legacy Design

During traffic spikes (e.g., the FIFA World Cup), Heron bolts experienced back‑pressure, causing spout lag and latency spikes. Restarting Heron containers often resulted in lost events, which corrupted real‑time aggregates. Batch jobs could take several hours, delaying analytics for advertisers and data‑product services. The separation of batch and speed layers also increased operational complexity and cost.

Key Internal Tools

Scalding : A Scala library that builds Hadoop MapReduce jobs via Cascading, simplifying job definition and execution.

Heron : Twitter’s stream‑processing engine. A Heron topology is a directed acyclic graph of spouts (data sources) and bolts (processing nodes).

TimeSeriesAggregator (TSAR) : A scalable framework for real‑time time‑series aggregation, used to compute engagement metrics across dimensions such as device type and interaction type.

New Hybrid Architecture

To eliminate the batch‑speed split and improve latency, Twitter migrated to a hybrid architecture that runs on Twitter data‑center services together with Google Cloud Platform (GCP).

Ingestion : Events are read from Kafka, transformed, and published to Pub/Sub topics.

Real‑time aggregation : GCP Dataflow (or equivalent streaming jobs) consumes Pub/Sub streams, performs per‑dimension aggregation, and writes results to BigTable.

Serving layer : An LDC query service fronts both BigTable and BigQuery, exposing sub‑10 ms query latency while handling millions of events per second. The service scales horizontally under load.

This design removes the separate batch processing component, reduces engineering overhead, and guarantees that every event is accounted for (late‑event counting is supported).

Performance Comparison

Compared with the previous Heron‑based Lambda pipeline, the hybrid pipeline delivers:

Lower end‑to‑end latency (sub‑10 ms versus tens of milliseconds).

Higher sustained throughput (millions of events per second).

Zero event loss thanks to late‑event counting in the streaming job.

Simplified architecture that reduces computational cost.

Conclusion

By moving the TSAR‑based legacy system to a hybrid Twitter‑data‑center and Google Cloud architecture, Twitter can process billions of events in real time with sub‑10 ms latency, high accuracy, improved stability, and lower operational cost.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

StreamingTwitterEvent Processing
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.