How Pinterest Scaled to Billions of Page Views: Architecture & Sharding Secrets
Pinterest grew from zero to over 100 billion monthly page views by rapidly expanding its infrastructure, adopting simple, mature technologies like MySQL, Redis, and Memcache, and transitioning from clustering to sharding, offering practical lessons on scaling, tool selection, ID design, and migration strategies for massive growth.
Introduction
Pinterest experienced exponential growth, doubling roughly every six weeks. In two years it went from 0 to 100 billion monthly page views, from two founders and one engineer to forty engineers, and from a single MySQL server to hundreds of web, API, MySQL, Redis, and Memcache instances.
Key Takeaways
Strong architecture handles growth by simply adding identical servers while preserving correctness.
When performance limits are reached, choose mature, simple, widely‑used tools (MySQL, Solr, Memcache, Redis) and avoid complex or immature technologies.
Basic Concepts
Pins are images linked to content.
Pinterest is a social network of users, boards, and pins.
The database stores boards, pins, users, and authentication data.
Timeline of Growth
Early 2010 (Self‑Discovery Phase)
Small team, Rackspace servers, a single web engine and MySQL instance.
January 2011
Amazon EC2 + S3 + CloudFront
1 Nginx, 4 web engines (redundancy)
1 MySQL master + 1 read replica
Task queue with two workers
MongoDB for counting
Two engineers
September 2011 (Rapid Growth)
Infrastructure exploded, adding many servers and diverse technologies, many of which later failed.
Amazon EC2 + S3 + CloudFront
2 Nginx, 16 web engines, 2 API engines
5 functional MySQL shards + 9 read replicas
4 Cassandra nodes
15 Membase nodes (3 clusters)
8 Memcache nodes
10 Redis nodes
3 task routers + 4 processors
4 Elasticsearch nodes
3 Mongo clusters
3 engineers
January 2012 (Mature Architecture)
Redesigned system:
Amazon EC2 + S3 + Akamai + ELB
90 web engines + 50 API engines
66 MySQL db.t2.large instances (each with a slave)
59 Redis instances
51 Memcache instances
1 Redis task manager + 25 processors
Sharded Solr
Later expanded to:
180 web engines + 240 API engines
88 MySQL cc2.8xlarge instances (each with a slave)
110 Redis instances
200 Memcache instances
4 Redis task managers + 80 processors
40 engineers
Why Amazon EC2/S3?
Reliability and rapid provisioning of new instances.
Strong support and peripheral services (load balancers, caching, etc.).
Ease of scaling without capacity planning for small teams.
Why MySQL?
Mature, durable, and widely known.
Linear request rate scaling.
Rich tooling (XtraBackup, Innotop, Maatkit) and community support.
Open‑source with strong vendor backing (Percona).
Why Memcache?
Simple hash‑table cache, highly mature.
Excellent performance, widely adopted, never crashes, free.
Why Redis?
Simple yet powerful data structures.
Supports persistence and replication with flexible options.
Good performance, widely liked, free.
Solr
Quick to install and use.
At the time, limited to a single node.
Considered Elasticsearch but faced scaling concerns.
Cluster vs. Sharding
Pinterest found that automatic clustering introduced many failure points and complexity, while sharding offered simpler, more reliable scaling.
Cluster Characteristics
Automated data distribution, rebalancing, and high availability.
Complex upgrade paths and numerous SPOF risks.
Sharding Advantages
Manual data placement reduces movement and failures.
Easy to add capacity by creating new shards.
Simple ID‑based routing.
Sharding Implementation
Adopted a 64‑bit ID: 16‑bit shard ID, 10‑bit type, remaining bits for auto‑incremented local ID.
Users are randomly assigned to shards; all of a user’s data resides in the same shard.
Started with 4 096 shards, scalable up to 65 536.
Lookup Process
Application uses a configuration mapping shard ranges to physical hosts, decomposes the 64‑bit ID, finds the shard, and queries the appropriate MySQL instance.
Object and Mapping Model
All data stored as objects (pins, boards, users) or mappings (user‑board, pin‑like).
Objects serialized as JSON blobs, later migrated to Thrift.
Mappings use naming convention noun_verb_noun (e.g., user_likes_pins).
Rendering a User Profile
Steps:
Extract username from URL and fetch user ID from a giant user table.
Decompose the ID to locate the correct shard.
Query the shard for user data, boards, and pins, using cache (Memcache/Redis) to minimize DB hits.
SELECT body FROM users WHERE id = <local_user_id>;
SELECT board_id FROM user_has_boards WHERE user_id=<user_id>;
SELECT body FROM boards WHERE id IN (<boards_ids>);
SELECT pin_id FROM board_has_pins WHERE board_id=<board_id>;
SELECT body FROM pins WHERE id IN (pin_ids);Script Migration
During sharding transition, scripts acted as bridges between the old monolithic system and the new sharded architecture, moving billions of rows over several months.
Future Directions
Move toward a service‑oriented architecture to isolate functionality and reduce cross‑service connections.
Deploy dedicated services (e.g., follow service) to limit database and cache load.
Lessons Learned
Keep architecture simple to handle rapid growth.
Choose mature, well‑supported tools.
Design IDs that embed shard information to avoid data movement.
When scaling, distribute load evenly across shards.
Service‑oriented design improves stability, team organization, and security.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
