
Design and Implementation of a High‑Throughput Payment System: Sharding, Order ID Generation, and High Availability

The article details how LeTV upgraded its payment platform in 2015 to handle up to 100,000 orders per second by implementing database sharding based on user IDs, a Snowflake‑style globally unique order ID, asynchronous consistency via message queues, multi‑level data caching, and high‑availability architectures for both master‑slave and load‑balanced deployments.

Architecture Digest

In November 2015 LeTV rebuilt its payment system to sustain a stable throughput of 100,000 orders per second, addressing the massive request surge caused by flash‑sale events.

Sharding (分库分表, splitting databases and tables): The order table is partitioned by user ID (uid) using a binary-tree expansion strategy. Eight databases (DB1–DB8) each contain ten tables (order_0–order_9). The routing formulas are:

    database_id = (uid / 10) % 8 + 1
    table_id    = uid % 10
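The routing rule above can be sketched in a few lines. This is a minimal illustration, not the production code; the function name `route_order` is assumed.

```python
def route_order(uid: int) -> tuple[int, str]:
    """Return (database_id, table_name) for a given user id.

    8 databases (1..8), 10 tables per database (order_0 .. order_9):
    the last digit of uid picks the table, the next digits pick the DB.
    """
    database_id = (uid // 10) % 8 + 1   # DB1..DB8
    table_id = uid % 10                  # order_0..order_9
    return database_id, f"order_{table_id}"
```

For example, uid 123 lands in database 5, table order_3: the last digit (3) selects the table, and (12 % 8) + 1 selects the database.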

Order ID Generation: A Snowflake-inspired algorithm produces globally unique IDs without database sequences. An ID consists of a millisecond-level timestamp, a machine identifier, a per-millisecond sequence, a version byte, and embedded sharding information (database and table numbers). The sharding part is stored as a two-character field to allow future expansion (up to 64 databases) using:

    shard_info = (uid / 10) % 64 + 1

The actual database is then derived by:

    real_db_id = (shard_info - 1) % 8 + 1
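A sketch of such a generator follows. The article names the components but not the exact bit layout, so the widths and shifts below are illustrative assumptions, as is the class name; only the shard formulas come from the text.

```python
import threading
import time


class OrderIdGenerator:
    """Snowflake-style ID: timestamp | machine | sequence | version | shard.

    Bit widths here are illustrative, not LeTV's actual layout:
    7 bits shard (1..64), 8 bits version, 12 bits sequence, 10 bits machine.
    """

    def __init__(self, machine_id: int, version: int = 1):
        self.machine_id = machine_id & 0x3FF   # 10 bits
        self.version = version & 0xFF          # 8 bits
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self, uid: int) -> int:
        shard_info = (uid // 10) % 64 + 1      # 1..64, room to grow past 8 DBs
        with self.lock:
            now = int(time.time() * 1000)
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF  # 12-bit sequence
            else:
                self.sequence = 0
                self.last_ms = now
            return ((now << 37) | (self.machine_id << 27)
                    | (self.sequence << 15) | (self.version << 7) | shard_info)


def real_db_id(shard_info: int) -> int:
    """Map the stored shard number (1..64) onto the 8 physical databases."""
    return (shard_info - 1) % 8 + 1
```

Because the shard number is embedded in the order ID itself, a query by order ID can be routed to the right database without a lookup table, and the same ID keeps working after the cluster expands from 8 to up to 64 databases.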

Eventual Consistency: To support queries by both user ID (uid) and merchant ID (bid), a duplicate order-table cluster is maintained for the bid dimension. Asynchronous data synchronization is achieved with a message queue, while a real-time monitoring service detects and resolves inconsistencies between the two clusters.
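The write path can be sketched as below. This is a toy stand-in (dicts for the two clusters, an in-process `queue.Queue` for the MQ) to show the shape of the flow, not the actual implementation; all names are assumptions.

```python
import queue

uid_orders = {}               # stand-in for the uid-sharded cluster
bid_orders = {}               # stand-in for the bid-dimension replica cluster
sync_queue = queue.Queue()    # stand-in for the real message queue


def write_order(order: dict) -> None:
    """Synchronous write to the uid cluster; async replication via the MQ."""
    uid_orders[order["order_id"]] = order
    sync_queue.put(order)     # fire-and-forget message for the bid replica


def sync_worker_once() -> None:
    """One consumer step: replicate one order into the bid dimension."""
    order = sync_queue.get()
    bid_orders.setdefault(order["bid"], []).append(order)


def check_consistency() -> list:
    """Monitoring pass: order IDs present in uid cluster but not yet in bid."""
    replicated = {o["order_id"] for lst in bid_orders.values() for o in lst}
    return [oid for oid in uid_orders if oid not in replicated]
```

Between the synchronous write and the consumer catching up, the bid replica lags behind; the monitoring pass is what turns that temporary lag into a guarantee of eventual consistency.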

Database High Availability: The classic master-slave topology is enhanced with LVS load balancing in front of the read-only slaves and a Keepalived-managed virtual IP for automatic master failover. When the primary DB fails, a standby is promoted and the virtual IP is remapped to it, keeping downtime to seconds.
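The failover half of this setup might look like the following Keepalived fragment. This is a generic illustrative sketch, not LeTV's configuration; the interface name, router ID, and VIP are placeholders.

```
vrrp_instance VI_DB {
    state MASTER            # BACKUP on the standby node
    interface eth0
    virtual_router_id 51    # must match on both nodes
    priority 100            # set lower on the standby
    advert_int 1
    virtual_ipaddress {
        10.0.0.100          # VIP the application connects to
    }
}
```

The application only ever connects to the VIP, so when Keepalived detects the master is down and the standby claims the address, clients reconnect without any configuration change.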

Data Tiering : The system classifies data into three levels: (1) core order and transaction data (no cache, direct DB access), (2) user‑related data (cached in Redis), and (3) configuration data (cached in local memory). A high‑availability message‑push platform propagates configuration changes to all servers to keep the in‑memory cache consistent.
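The three tiers translate into three different read paths. The sketch below uses a dict as a stand-in for Redis and assumes the function names; it only illustrates the policy described above.

```python
import time

LOCAL_CACHE = {}   # tier 3: configuration data, in-process memory
REDIS = {}         # tier 2 stand-in: user-related data (dict mimics Redis)


def get_config(key, load_from_db):
    """Tier 3: config lives in local memory, invalidated by the push platform."""
    if key not in LOCAL_CACHE:
        LOCAL_CACHE[key] = load_from_db(key)
    return LOCAL_CACHE[key]


def get_user_data(key, load_from_db, ttl=300):
    """Tier 2: user-related data cached in Redis with a TTL."""
    hit = REDIS.get(key)
    if hit and hit[1] > time.time():
        return hit[0]
    value = load_from_db(key)
    REDIS[key] = (value, time.time() + ttl)
    return value


def get_order(key, load_from_db):
    """Tier 1: core order/transaction data is never cached."""
    return load_from_db(key)
```

The split reflects correctness requirements: stale money-related data is unacceptable (tier 1), stale user data for a few minutes is tolerable (tier 2), and config data changes so rarely that local memory plus push invalidation is safe (tier 3).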

Coarse‑Fine Pipeline : An upstream “coarse” gate limits incoming traffic to 1 million requests per second, discarding excess, while a downstream “fine” gate throttles traffic to the web cluster at 100,000 requests per second. Excess requests are queued in the pipeline, preventing overload. The mechanism can be realized with the commercial edition of Nginx using the max_conns directive.
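The "fine" gate might be expressed as an upstream block like the following. This is an illustrative sketch only; the server names and numbers are placeholders, and the `queue` directive (which buffers excess requests) is a commercial NGINX Plus feature, matching the article's note.

```
upstream web_cluster {
    # max_conns caps concurrent connections per backend server
    server web1.internal:8080 max_conns=25000;
    server web2.internal:8080 max_conns=25000;
    server web3.internal:8080 max_conns=25000;
    server web4.internal:8080 max_conns=25000;

    # Excess requests wait here instead of overloading the web cluster
    # (queue is an NGINX Plus feature).
    queue 100000 timeout=60s;
}
```

With per-server connection caps and a bounded queue, the web cluster sees a smooth 100,000-requests-per-second ceiling even while the coarse gate upstream is admitting bursts up to ten times that.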

Tags: distributed systems, sharding, high availability, databases, payment systems, order ID
Written by Architecture Digest

Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
