How Pinterest Scaled with MySQL Sharding: A Deep Dive into Their Storage Architecture
Pinterest rebuilt its storage layer using a MySQL‑centric sharding design that meets strict stability, scalability, and performance requirements, detailing business needs, shard key generation, object and mapping tables, expansion strategies, and practical lessons learned over three years of production use.
Business Requirements
The system must be highly stable, easy to operate, and horizontally extensible.
Pins generated by users must be permanently accessible.
Requests for N pins must be returned in a deterministic order (e.g., reverse creation time or custom sorting).
Updates should be fast, while achieving eventual consistency may require additional mechanisms such as distributed transaction logs.
Design Principles and Notes
Because data spans multiple databases, joins and foreign keys cannot be used across shards; only sub‑queries within a single shard are allowed. Load balancing is essential, and moving data piece‑by‑piece is avoided; instead, whole virtual nodes are migrated to physical nodes when necessary.
A simple, reliable solution was needed for rapid prototyping on a distributed data platform where each node remains highly stable.
All data is replicated to slave nodes for high availability and periodically exported to S3 via MapReduce. In production, only the master node is written to; slaves are read‑only and lag behind, making them unsuitable for direct reads.
A globally unique 64‑bit ID (UUID) is generated for every object, encoding shard ID, type ID, and a local auto‑increment ID.
How We Partition Data
We chose mature MySQL as the foundation, avoiding newer NoSQL solutions (MongoDB, Cassandra, Membase) that were deemed insufficiently stable at the time.
Initially we deployed eight EC2 instances, each running a MySQL master‑slave pair. Every MySQL instance hosts multiple databases named db00000, db00001, …, each representing a shard. A configuration table stored the mapping of shards to physical machines, persisted in ZooKeeper.
When a shard needs to move or a machine fails, the configuration is updated and propagated to all MySQL servers.
Object Tables
Object tables store pins, users, boards, comments, etc. Each row contains a local auto‑increment ID and a blob column holding the entire object as JSON.
Creating a new pin involves assembling the JSON blob, determining a shard ID (often the same as the board’s shard), inserting the blob into the object table, retrieving the local ID, and finally constructing the 64‑bit global ID.
Updates are performed in a single MySQL transaction (read‑modify‑write). Deletions can be hard deletes or soft deletes by adding an active flag set to false and filtering on the client side.
Mapping Tables
Mapping tables capture relationships such as board‑to‑pin. Each mapping row contains three columns: from (source ID), to (target ID), and sequence (used for ordering). A composite index on ( from, to, sequence) is created, and rows are sharded by the from object's shard ID.
For reverse lookups, separate tables (e.g., pin_owned_by_board) are created. The sequence field often stores a Unix timestamp to provide monotonic ordering.
Typical query pattern: retrieve up to 50 pin_id s from a board’s mapping table, then fetch the corresponding pin objects.
Scaling Strategies
Three primary scaling methods are used:
Upgrade existing machines (more CPU, RAM, faster disks).
Add new shard ranges: initially only the first 4 K shards (0‑4095) were populated; later additional MySQL servers were introduced to host shards 4096‑8191.
Move existing shards to new machines: for example, split shards 0‑511 from an old server into two new master‑slave pairs, each handling half the range.
Advantages
The UUID generation scheme allows easy creation of globally unique identifiers without needing MySQL ALTER operations. Adding new JSON fields or new mapping tables requires only schema‑level changes in the application code, not costly database migrations.
Mod Shard (Hash‑Based Sharding)
For data that cannot be accessed by ID (e.g., mapping a Facebook ID to a Pinterest ID), a separate hash‑based shard system is used. The shard is computed as shard = md5("1.2.3.4") % 4096, and a configuration file maps each hash bucket to a physical shard.
Final Thoughts
The system has been in production at Pinterest for over three and a half years and continues to serve massive traffic. While it does not guarantee full ACID properties, it provides sufficient reliability and performance for Pinterest’s workloads. Failure recovery relies on ZooKeeper‑stored shard configurations and manual promotion of slaves to masters when a master fails.
Overall, the design demonstrates that a carefully engineered MySQL sharding architecture can meet the demanding scalability and stability needs of a fast‑growing internet service.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
