Database Architecture Evolution and Sharding Practice in Shopee's Logistics Service
Shopee's Logistics Channel Service migrated from a shared-cluster MySQL setup to dedicated clusters, added TiDB for transient tracking data, and ultimately adopted hash-based sharding with separate order and tracking services plus an asynchronous compensation mechanism. The result: uniform data distribution, cross-database consistency, six-month data archiving, and scalable growth without rebalancing.
This article documents the database architecture evolution of Shopee's Logistics Channel Service (LCS) project, which handles logistics order fulfillment including order creation, tracking updates, and freight synchronization with third-party logistics (3PL) providers.
Background and Challenges: Starting in September 2019, as more 3PL providers entered LCS's scope and e-commerce business grew rapidly in 2020, monthly order volume increased significantly, putting tremendous pressure on the databases. The initial shared physical cluster architecture caused I/O resource contention between regions with different order volumes, and tracking-push QPS (far higher than order-creation QPS) began to impact the core ordering flow.
Architecture Evolution: The team first moved hot-spot regions' databases onto dedicated physical clusters, then introduced TiDB for temporary tracking data to separate transient data from business data. Despite these adjustments, table volumes continued to explode due to extensive text data (addresses, contact info, product details, tracking descriptions), with disk usage in high-volume regions climbing past 30% within months.
Sharding Solution: The team evaluated time-based sharding versus hash-based sharding. Time-based sharding offers easier data archiving but creates uneven data distribution and hot-spot concentration. Hash-based sharding provides uniform distribution and better scalability but complicates historical data migration. After comprehensive consideration of future business growth, stability, extensibility, and data archiving needs, the team chose hash-based sharding.
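The routing logic behind hash-based sharding can be sketched in a few lines. The hash function (CRC32) and shard count below are illustrative assumptions, not details from the article:

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// shardFor maps an order ID to one of numShards shards by hashing the ID.
// Because the hash spreads IDs uniformly, every shard receives a similar
// write load, avoiding the hot-spot concentration of time-based sharding.
func shardFor(orderID string, numShards uint32) uint32 {
	return crc32.ChecksumIEEE([]byte(orderID)) % numShards
}

func main() {
	for _, id := range []string{"SPX001", "SPX002", "SPX003"} {
		fmt.Printf("%s -> shard %d\n", id, shardFor(id, 8))
	}
}
```

The flip side, as noted above, is that a historical row's shard depends only on its ID, so archiving by age requires scanning every shard rather than dropping an old time-partitioned table.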
Application Architecture Split: To address connection limit issues when deploying multiple hash shards per region, the team separated order processing services (requiring MySQL access) from tracking update services (using TiDB), reducing the machine count, and with it the number of MySQL connections, to a quarter of the original.
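The connection math is multiplicative, which is why the split helps so much. A rough illustration, with made-up instance counts and pool sizes rather than Shopee's actual figures:

```go
package main

import "fmt"

// totalConns estimates MySQL connections for one region: each service
// instance keeps a connection pool to every shard, so the total grows
// multiplicatively. All numbers here are illustrative assumptions.
func totalConns(instances, shards, poolSize int) int {
	return instances * shards * poolSize
}

func main() {
	// Before the split: all instances (order + tracking) connect to MySQL.
	before := totalConns(100, 8, 10)
	// After: only order-service instances need MySQL; tracking uses TiDB.
	after := totalConns(25, 8, 10)
	fmt.Println(before, after) // 8000 2000
}
```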
Cross-Database Consistency: Since only some tables were moved to sharded databases, cross-database transactions were no longer possible, so the team implemented an asynchronous compensation mechanism using a Checker table (similar in spirit to the InnoDB undo log) to ensure eventual consistency between the mapping tables and the order tables.
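A minimal in-memory sketch of the Checker-table idea, under the assumption that the mapping row is written first together with a checker row, and a background job later rolls back any mapping whose order write never landed. Table shapes and names here are hypothetical:

```go
package main

import "fmt"

// CheckerEntry records an in-flight cross-database write, much like an
// undo-log entry: it exists until the write is verified or rolled back.
type CheckerEntry struct {
	OrderID string
	Done    bool
}

// mappingDB and orderDB stand in for the two separate databases between
// which no single transaction can span.
var (
	mappingDB = map[string]bool{}
	orderDB   = map[string]bool{}
	checker   = []*CheckerEntry{}
)

// createOrder writes the mapping row plus a checker row, then attempts
// the order row in the other database (failOrderWrite simulates a crash
// between the two writes).
func createOrder(orderID string, failOrderWrite bool) {
	mappingDB[orderID] = true
	checker = append(checker, &CheckerEntry{OrderID: orderID})
	if !failOrderWrite {
		orderDB[orderID] = true
	}
}

// compensate is the asynchronous job: for each pending checker row, if
// the order row is missing, undo the mapping row, restoring consistency.
func compensate() {
	for _, c := range checker {
		if c.Done {
			continue
		}
		if !orderDB[c.OrderID] {
			delete(mappingDB, c.OrderID) // roll back the orphaned mapping
		}
		c.Done = true
	}
}

func main() {
	createOrder("A", false) // happy path: both writes succeed
	createOrder("B", true)  // order write fails mid-flight
	compensate()
	fmt.Println(mappingDB["A"], mappingDB["B"]) // true false
}
```

The key property is eventual consistency: a reader may briefly see an orphaned mapping row, but the compensation pass guarantees it cannot survive past the next checker scan.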
Maintenance and Scaling: The solution supports data archiving (retaining 6 months of data) and enables scaling without data rebalancing by modifying order ID routing rules, allowing gradual pressure reduction over time.
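Scaling without rebalancing works because only the routing rule changes, not the placement of existing rows. A sketch under the assumption that new order IDs carry a monotonically increasing sequence, so a cutoff can direct all new orders to a freshly added shard group (the cutoff and group names are illustrative):

```go
package main

import "fmt"

// cutoff is the first order sequence served by the new shard group.
// Raising capacity means adding a group and a new cutoff in the routing
// rules; no historical row ever moves.
const cutoff = 1_000_000

// shardGroup routes an order to a shard group by its sequence number.
func shardGroup(orderSeq uint64) string {
	if orderSeq >= cutoff {
		return "group-2" // new cluster, absorbs all future growth
	}
	return "group-1" // original cluster keeps historical data only
}

func main() {
	fmt.Println(shardGroup(42))        // group-1
	fmt.Println(shardGroup(5_000_000)) // group-2
}
```

Combined with the six-month retention policy, the old group's load then drains naturally: once its newest rows age past the archive window, it holds only data awaiting deletion.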
Shopee Tech Team