Designing a Scalable Twitter System Architecture: Requirements, Service Overview, and Detailed Microservice Design
This article outlines how to design Twitter from scratch using a microservice architecture, detailing functional and non‑functional requirements, service decomposition, scalability calculations, database schemas, and the interaction of components such as tweet, timeline, fan‑out, social‑graph, and search services.
Twitter is a global social networking service; when asked to design its system in an interview, candidates should propose a microservice‑based distributed architecture rather than a monolithic design.
Functional requirements
Users can post or share new tweets (max 140 characters).
Users can delete tweets but cannot edit them.
Users can like tweets.
Users can follow or unfollow other users, affecting their timeline.
Two types of timelines are generated: a user timeline containing the user's own tweets, and a home timeline showing tweets from the users they follow.
Keyword‑based tweet search.
Account creation and deletion (using an external identity service).
Support for text‑only tweets in this design.
Analytics/monitoring services to assess load, health, and provide recommendations.
Non‑functional requirements
High availability so users never notice downtime.
Timeline generation must complete within 0.5 seconds.
Eventual consistency is acceptable; a keyword database can be used for search.
Scalability to handle growing users and tweets.
Persistent storage of user data.
Estimated traffic calculations:
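Average load: 150M × 60 / 86,400 ≈ 100k req/s (reading 150M as daily active users and 60 as requests per user per day).
Peak load ≈ 300k req/s, about three times the average.
Projected peak after three months of growth ≈ 600k req/s, double the current peak.
Read QPS ≈ 300k, Write QPS ≈ 5k.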
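The estimate above can be reproduced with a few lines of arithmetic. This is a minimal sketch; the 3x peak factor and 2x growth factor are assumptions inferred from the ratios between the quoted figures, and the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400


def average_qps(daily_active_users: int, requests_per_user_per_day: int) -> float:
    """Average requests per second spread evenly over one day."""
    return daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY


# Figures from the text: 150M users, 60 requests per user per day.
avg = average_qps(150_000_000, 60)

# The text rounds the average down to 100k before scaling.
rounded_avg = 100_000
peak = 3 * rounded_avg      # assumed peak factor of 3
growth_peak = 2 * peak      # assumed doubling over three months

print(f"average: {avg:,.0f} req/s, peak: {peak:,}, growth peak: {growth_peak:,}")
```

The unrounded average is slightly above 100k, so the rounded figures in the text are conservative by a few percent.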
Service overview
Tweet Service
User Timeline Service
Fan‑out Service
Home Timeline Service
Social Graph Service
Search Service
Each microservice consists of an application server, distributed cache, and a backend database (or NoSQL store for media). The Tweet Service receives posts, generates a unique tweet ID (e.g., UUID), stores the tweet in a distributed cache and the tweet table, and forwards it to other services.
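The Tweet Service write path can be sketched as follows. Plain dictionaries stand in for the distributed cache and the tweet table, and `forward` is a hypothetical callback representing the hand-off to downstream services; none of these names come from the original design.

```python
import time
import uuid
from dataclasses import dataclass


@dataclass
class Tweet:
    tweet_id: str
    user_id: int
    text: str
    created_at: float


class TweetService:
    MAX_LENGTH = 140  # limit stated in the functional requirements

    def __init__(self, forward):
        self.cache = {}         # stand-in for the distributed cache
        self.table = {}         # stand-in for the tweet table
        self.forward = forward  # hypothetical hook to downstream services

    def post(self, user_id: int, text: str) -> Tweet:
        if len(text) > self.MAX_LENGTH:
            raise ValueError("tweet exceeds 140 characters")
        tweet = Tweet(str(uuid.uuid4()), user_id, text, time.time())
        self.table[tweet.tweet_id] = tweet   # durable store first
        self.cache[tweet.tweet_id] = tweet   # then the cache
        self.forward(tweet)                  # then notify other services
        return tweet
```

Writing to the table before the cache is one possible ordering; it avoids caching a tweet that was never persisted.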
The User Timeline Service returns a user's own tweets in reverse chronological order, using a list of tweet IDs stored in memory; it does not directly query a database.
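A minimal sketch of that in-memory lookup, assuming one list of tweet IDs per user kept in posting order. The class and method names are illustrative only.

```python
from collections import defaultdict


class UserTimelineService:
    def __init__(self):
        # user_id -> tweet IDs in posting order (oldest first)
        self._tweet_ids = defaultdict(list)

    def append(self, user_id: int, tweet_id: str) -> None:
        self._tweet_ids[user_id].append(tweet_id)

    def timeline(self, user_id: int, limit: int = 20) -> list:
        """Return up to `limit` tweet IDs, newest first, without touching a database."""
        ids = self._tweet_ids.get(user_id, [])
        return list(reversed(ids))[:limit]
```

For example, after appending "a", "b", and "c" for one user, `timeline(user_id, 2)` returns `["c", "b"]`.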
The Fan‑out Service asynchronously distributes new tweets to followers' timelines, the search index, and other components via distributed queues, ensuring eventual consistency.
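The asynchronous hand-off might look like the sketch below. A local `queue.Queue` stands in for the distributed queue, and `followers_of`, `home_timelines`, and `search_index` are hypothetical stand-ins for the other services. The producer returns as soon as the event is queued, and a worker applies it later, which is where eventual consistency comes from.

```python
import queue


class FanOutService:
    def __init__(self, followers_of, home_timelines, search_index):
        self.queue = queue.Queue()            # stand-in for a distributed queue
        self.followers_of = followers_of      # callable: author_id -> follower IDs
        self.home_timelines = home_timelines  # dict: user_id -> list of tweet IDs
        self.search_index = search_index      # list of tweet IDs awaiting indexing

    def publish(self, author_id: int, tweet_id: str) -> None:
        """Enqueue the event and return immediately."""
        self.queue.put((author_id, tweet_id))

    def drain(self) -> int:
        """Worker step: deliver every queued tweet; returns the number processed."""
        processed = 0
        while True:
            try:
                author_id, tweet_id = self.queue.get_nowait()
            except queue.Empty:
                return processed
            for follower in self.followers_of(author_id):
                self.home_timelines.setdefault(follower, []).append(tweet_id)
            self.search_index.append(tweet_id)
            processed += 1
```

Until `drain` runs, followers do not see the new tweet, so readers may briefly observe a stale timeline.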
The Home Timeline Service merges tweets from followed users, applying weighting and pruning when the number exceeds a configurable limit K (default 1000).
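One way to sketch the merge-and-prune step is a k-way merge over per-author lists that are already sorted newest first. This sketch orders purely by timestamp and leaves out the weighting step, whose formula the text does not specify; the function name and the `(timestamp, tweet_id)` tuple layout are assumptions.

```python
import heapq
from itertools import islice


def merge_home_timeline(per_author, k=1000):
    """Merge newest-first lists of (timestamp, tweet_id) and keep the newest k IDs."""
    merged = heapq.merge(*per_author, key=lambda entry: entry[0], reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, k)]
```

Because `heapq.merge` is lazy, only the first K entries are ever materialized, so pruning costs nothing beyond the merge itself.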
The Social Graph Service implements the Following API, tracking follow relationships in a dedicated table and handling asynchronous updates when users follow or unfollow.
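The follow table can be sketched with an in-memory SQLite database. The table and column names are assumptions, and the asynchronous timeline updates triggered by a follow or unfollow are left out of this sketch.

```python
import sqlite3


class SocialGraphService:
    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE follows ("
            " follower_id INTEGER NOT NULL,"
            " followee_id INTEGER NOT NULL,"
            " PRIMARY KEY (follower_id, followee_id))"
        )

    def follow(self, follower_id: int, followee_id: int) -> None:
        # OR IGNORE makes a repeated follow a no-op instead of an error.
        self.db.execute(
            "INSERT OR IGNORE INTO follows VALUES (?, ?)",
            (follower_id, followee_id),
        )

    def unfollow(self, follower_id: int, followee_id: int) -> None:
        self.db.execute(
            "DELETE FROM follows WHERE follower_id = ? AND followee_id = ?",
            (follower_id, followee_id),
        )

    def followers_of(self, user_id: int) -> list:
        rows = self.db.execute(
            "SELECT follower_id FROM follows WHERE followee_id = ? ORDER BY follower_id",
            (user_id,),
        )
        return [row[0] for row in rows]
```

The composite primary key keeps each relationship unique, which is what lets `follow` be idempotent.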
The Search Service ingests tweets, performs stemming, builds an inverted index, and serves keyword queries through a Blender component.
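The ingest and query steps can be illustrated with a small index that maps each stemmed term to the set of tweet IDs containing it. The `stem` function here is a deliberately naive suffix stripper, not a real stemming algorithm, and the sketch uses a single index with no Blender layer in front of it.

```python
import re
from collections import defaultdict


def stem(word: str) -> str:
    """Toy stemmer: strip one common suffix if at least three letters remain."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text: str) -> list:
    return [stem(token) for token in re.findall(r"[a-z0-9]+", text.lower())]


class SearchIndex:
    def __init__(self):
        self._postings = defaultdict(set)  # stemmed term -> tweet IDs

    def ingest(self, tweet_id: str, text: str) -> None:
        for term in tokenize(text):
            self._postings[term].add(tweet_id)

    def search(self, query: str) -> set:
        """Return tweet IDs containing every term in the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result
```

Because queries are stemmed the same way as tweets, a search for "timelines" also matches a tweet that says "timeline".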
Database schema highlights include Users, Tweet, and Favorite_tweet tables, with sharding strategies based on user ID, tweet ID, or both to achieve scalability.
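To make the sharding choice concrete, here is a minimal sketch of hash-based shard selection. Hash-modulo placement is an assumption for illustration; the text names only the shard keys, not the placement function, and the key prefixes are hypothetical.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard with a stable hash (Python's built-in hash() varies per process)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def tweet_shard_by_user(user_id: int, num_shards: int) -> int:
    # All of one user's tweets land on the same shard: cheap user-timeline
    # reads, but a very active user can create a hot shard.
    return shard_for(f"user:{user_id}", num_shards)


def tweet_shard_by_tweet(tweet_id: str, num_shards: int) -> int:
    # Tweets spread evenly, but reading one user's tweets touches many shards.
    return shard_for(f"tweet:{tweet_id}", num_shards)
```

The two helper functions show the trade-off behind choosing user ID or tweet ID as the shard key; sharding on both combines the two lookups.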
Scalability design uses partitioned and replicated caches/databases, with O(1) insertion into timelines and configurable retention policies.
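Constant-time insertion with a retention cap can be sketched with a bounded deque. The cap of 1000 entries mirrors the default K mentioned for the home timeline; using a size-based policy as the retention rule is an assumption, and the class name is illustrative.

```python
from collections import deque
from itertools import islice


class BoundedTimeline:
    def __init__(self, max_entries: int = 1000):
        # A deque with maxlen drops the entry at the opposite end when full.
        self._entries = deque(maxlen=max_entries)

    def push(self, tweet_id: str) -> None:
        """O(1) insert at the front; the oldest entry falls off the back."""
        self._entries.appendleft(tweet_id)

    def latest(self, n: int) -> list:
        """Return up to n tweet IDs, newest first."""
        return list(islice(self._entries, n))
```

A time-based retention policy would need timestamps stored alongside each ID, but the insertion cost stays constant either way.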
Overall, the architecture demonstrates how to decompose Twitter into independent, horizontally scalable microservices while meeting latency, availability, and consistency goals.
IT Architects Alliance