Why Do the Last Six Digits of Taobao Order IDs Remain Constant?

The article explains how Taobao embeds a user‑specific “gene” in the last six digits of its order numbers, enabling efficient sharding routing, uniform data distribution, and idempotent order handling while maintaining global uniqueness in a large e‑commerce system.

Java Backend Full-Stack

Essence of the last six digits: user‑gene anchoring

Observing Taobao order numbers such as 287654321098765432 shows that the last six digits (e.g., 765432) are usually the same across one user's orders. These digits are derived from the user ID, either as a hash of it or as a fixed segment of the desensitized (masked) ID. They bind the user to a storage location, so all of that user's orders are routed to the same shard.
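A minimal sketch of how such a suffix could be derived, assuming the gene is a hash of the user ID reduced to six decimal digits. The article does not say which hash Taobao uses; the function name `user_gene` and the choice of MD5 are assumptions made purely for illustration.

```python
import hashlib

GENE_DIGITS = 6  # width of the user gene described in the article


def user_gene(user_id: int) -> str:
    """Map a user ID to a stable six-digit string (hypothetical scheme)."""
    # hashlib returns the same digest in every process, so the gene does not
    # change between restarts or machines.
    digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
    # Take the first 8 bytes as an integer and keep the last six decimal digits.
    value = int.from_bytes(digest[:8], "big") % 10**GENE_DIGITS
    # Left-pad with zeros so the segment always has a fixed width.
    return str(value).zfill(GENE_DIGITS)
```

Because the function is deterministic, every order created for the same user would carry the same six-digit suffix.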

Core challenges of order‑ID generation under sharding

Traditional order IDs (auto‑increment or pure Snowflake) lack business‑specific genes, causing two problems:

Low routing efficiency – querying all orders of a user requires scanning every database and table because the order ID cannot locate the data.

Uneven data distribution – random sharding can create hotspots where some shards hold far more data.

The gene‑method order ID solves both by embedding a “sharding gene” (user identifier) into the ID.

Structure design of the gene‑method order ID (Taobao reference)

The core is segmented encoding, each segment representing a specific gene. A typical 20‑digit Taobao order ID follows:

[TimeGene(6)] + [BusinessTypeGene(2)] + [RandomSeqGene(6)] + [ShardingRoutingGene(6)]

Example parsing (order ID 24081501123456100860)

Time gene : 240815 → 2024‑08‑15

Business type gene : 01 → normal order

Random sequence gene : 123456 → guarantees uniqueness

Sharding routing gene : 100860 → hash of the user ID, fixed for the same user
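The segment layout above can be sketched as a pair of helpers that build and parse a 20-digit ID. The names `build_order_id`, `parse_order_id`, and `OrderIdParts` are hypothetical, and reading the two-digit year as 20xx is an assumption.

```python
from datetime import date
from typing import NamedTuple


class OrderIdParts(NamedTuple):
    order_date: date
    business_type: str
    sequence: str
    routing_gene: str


def build_order_id(order_date: date, business_type: str,
                   sequence: int, routing_gene: str) -> str:
    """Concatenate time(6) + business type(2) + sequence(6) + routing gene(6)."""
    if len(business_type) != 2 or len(routing_gene) != 6:
        raise ValueError("segment has the wrong width")
    if not 0 <= sequence < 10**6:
        raise ValueError("sequence must fit in six digits")
    return f"{order_date:%y%m%d}{business_type}{sequence:06d}{routing_gene}"


def parse_order_id(order_id: str) -> OrderIdParts:
    """Split a 20-digit order ID back into its four segments."""
    if len(order_id) != 20 or not (order_id.isascii() and order_id.isdigit()):
        raise ValueError("expected a 20-digit order ID")
    yy, mm, dd = int(order_id[0:2]), int(order_id[2:4]), int(order_id[4:6])
    return OrderIdParts(
        order_date=date(2000 + yy, mm, dd),  # assumes years 2000-2099
        business_type=order_id[6:8],
        sequence=order_id[8:14],
        routing_gene=order_id[14:20],
    )
```

Parsing the example ID 24081501123456100860 with this sketch yields the date 2024-08-15, business type 01, sequence 123456, and routing gene 100860, matching the breakdown above.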

Sharding routing logic

Sharding rule: use the routing gene as the sharding key. Example: 100860 % 8 = 4 selects database 4; then 100860 % 16 = 12 selects table 12.

Advantage: a query for all orders of user 10086 can compute the target database and table directly from the routing gene, locating the data in O(1) without cross‑shard scans.
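The routing rule can be sketched as follows, assuming 8 databases and 16 tables as in the example. The names `route`, `route_order`, `DB_COUNT`, and `TABLE_COUNT` are illustrative and not taken from any Taobao component.

```python
from typing import Tuple

DB_COUNT = 8      # number of databases in the article's example
TABLE_COUNT = 16  # number of tables in the article's example


def route(routing_gene: str) -> Tuple[int, int]:
    """Return (database index, table index) for a six-digit routing gene."""
    key = int(routing_gene)
    return key % DB_COUNT, key % TABLE_COUNT


def route_order(order_id: str) -> Tuple[int, int]:
    """Route by order ID alone: the gene is the last six digits."""
    return route(order_id[-6:])
```

One design caveat with this exact rule: because 16 is a multiple of 8, the table index is tied to the database index (within database d, only tables d and d + 8 are ever selected). A common variant, offered here as an assumption rather than as the article's method, decorrelates the two by computing the table as `(key // DB_COUNT) % TABLE_COUNT`.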

Core value of the gene‑method order ID in sharding

Maximum routing efficiency – direct calculation of storage location replaces full‑shard traversal.

User order aggregation optimization – all orders of the same user land in the same shard, so “my orders” queries need no cross‑shard merging.

Uniform and controllable data distribution – the hash‑based routing gene spreads users evenly across shards, which helps prevent hotspot shards. The shard count can be changed when scaling, although changing the modulus means existing data must be migrated to its new shard.
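The evenness claim can be checked with a small simulation. This is a sketch under stated assumptions: the routing gene is taken to be an MD5-based hash of the user ID modulo 1,000,000, and the user IDs are a synthetic sequential range, not real data.

```python
import hashlib
from collections import Counter
from typing import Iterable

DB_COUNT = 8  # number of databases assumed, as in the routing example


def routing_gene(user_id: int) -> int:
    """Hypothetical six-digit gene: stable hash of the user ID mod 10^6."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 1_000_000


def shard_histogram(user_ids: Iterable[int]) -> Counter:
    """Count how many users fall into each database."""
    return Counter(routing_gene(uid) % DB_COUNT for uid in user_ids)


if __name__ == "__main__":
    counts = shard_histogram(range(1, 80_001))
    for db in range(DB_COUNT):
        print(db, counts[db])
```

With a well-mixed hash, each of the 8 databases should receive close to one eighth of the users. Note that this balances users, not orders: a single very active user still sends all of their orders to one shard.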

Duplicate submission prevention and idempotency – the combination of time gene, business type gene, and routing gene (14 digits) can identify repeat submissions, while the random sequence ensures each order ID remains unique.
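A minimal sketch of the 14-digit key, assuming the 20-digit layout described earlier. The names `dedup_key` and `candidate_duplicates` are hypothetical. Since the time gene has day granularity, a shared key only marks orders as candidates for duplicate checking; a real system would presumably combine it with further signals such as order contents or a short time window.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def dedup_key(order_id: str) -> str:
    """Time gene (6) + business type gene (2) + routing gene (6) = 14 digits."""
    return order_id[:8] + order_id[-6:]


def candidate_duplicates(order_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Group order IDs by dedup key and keep groups with more than one order."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for order_id in order_ids:
        groups[dedup_key(order_id)].append(order_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

The random sequence segment is deliberately left out of the key, so two submissions by the same user on the same day with the same business type share a key even though their full order IDs differ.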

Conclusion

The gene method turns the order ID into a navigation aid for data routing. By embedding a sharding routing gene (e.g., Taobao’s fixed last six digits), the design binds order data tightly to the user dimension, makes routing in sharded databases efficient, and preserves global uniqueness. This makes it a mainstream solution for large e‑commerce platforms with strong user‑order relationships.
