Why UUID Falls Short and How Snowflake Solves Distributed ID Generation
The article examines the limitations of UUIDs for distributed systems, outlines the strict requirements for global unique IDs, compares common approaches such as database auto‑increment and Redis, and provides a detailed analysis of Twitter's Snowflake algorithm with its structure, Java implementation, advantages, drawbacks, and mitigation strategies.
Problem
In large distributed systems massive amounts of data and messages must be uniquely identified. Examples include finance, payment, catering and hotel services at Meituan‑Dianping, growing data sets after sharding in movie platforms such as Maoyan, and entities like orders, riders and coupons that each require a unique identifier. A system that can generate globally unique IDs is therefore essential.
Hard requirements for ID generation
Global uniqueness – no duplicate IDs may appear.
Trend‑increasing – ordered primary keys (e.g., B‑Tree indexes in MySQL InnoDB) improve write performance.
Monotonic increase – the next ID must be larger than the previous one for use cases such as transaction version numbers.
Information security – unpredictable IDs prevent competitors from inferring order volumes.
Timestamp inclusion – embedding a timestamp enables quick identification of when an ID was generated.
Availability requirements
High availability – 99.999% of ID requests must return a result.
Low latency – responses must be fast.
High QPS – the system should handle at least 100,000 ID requests per second.
Common solutions
UUID
Generated in the JDK with java.util.UUID; the canonical text form is a 36-character string (32 hexadecimal digits plus 4 hyphens) grouped 8-4-4-4-12.
Pros: high performance, generated locally, no network cost.
Cons: unordered, long string, increases database index pressure; MySQL recommends short primary keys.
Because UUIDs are unordered, each insertion modifies the B‑tree index heavily, causing node splits and many under‑filled nodes, which degrades insert performance.
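The UUID properties described above can be checked with a few lines of Java. This is a minimal sketch; the class name UuidDemo and the helper newId are illustrative and not part of the original article.

```java
import java.util.UUID;

class UuidDemo {
    /** Returns a random (version 4) UUID in its canonical 36-character text form. */
    static String newId() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        // Consecutive values have no ordering relationship, so when used as
        // primary keys they land at arbitrary positions in a B-tree index.
        for (int i = 0; i < 3; i++) {
            String id = newId();
            System.out.println(id + " (length " + id.length() + ")");
        }
    }
}
```

Each run prints different values, and sorting them does not reproduce the order in which they were generated, which is the property that hurts index locality.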
Database auto‑increment primary key
Implemented via REPLACE INTO, which inserts a row or replaces the existing one on unique‑key conflict.
Horizontal scaling is difficult; adding a new machine requires manually setting a large initial offset, which becomes impractical with dozens or hundreds of nodes.
Each ID request incurs a read‑write round‑trip to the database, violating low‑latency and high‑QPS requirements.
Redis global ID strategy
Redis executes commands on a single thread, so INCR and INCRBY are atomic and can serve as ID counters.
In a 5‑node Redis cluster, using a step size of 5 and initial values 1‑5 yields the following sequences:
A: 1, 6, 11, 16, 21
B: 2, 7, 12, 17, 22
C: 3, 8, 13, 18, 23
D: 4, 9, 14, 19, 24
E: 5, 10, 15, 20, 25
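The step-and-offset scheme can be reproduced without a Redis server. In the sketch below an AtomicLong stands in for each node's counter, and addAndGet plays the role that INCRBY with a step of 5 would play against a real node; the class name SteppedCounter is hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

class SteppedCounter {
    private final AtomicLong counter;
    private final long step;

    SteppedCounter(long firstId, long step) {
        this.step = step;
        // Start one step below firstId so the first call to next() returns firstId.
        this.counter = new AtomicLong(firstId - step);
    }

    /** Atomically advances the counter by the step size and returns the new value. */
    long next() {
        return counter.addAndGet(step);
    }

    public static void main(String[] args) {
        String names = "ABCDE";
        for (int n = 0; n < names.length(); n++) {
            // Node n starts at n + 1 and every node shares the step size 5.
            SteppedCounter node = new SteppedCounter(n + 1, names.length());
            StringBuilder line = new StringBuilder(names.charAt(n) + ":");
            for (int i = 0; i < 5; i++) {
                line.append(' ').append(node.next());
            }
            System.out.println(line);
        }
    }
}
```

Because the step equals the number of nodes, the sequences never overlap. The cost is that the node count is baked into the step, so adding a sixth node means re-planning every counter.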
Snowflake
Overview
Twitter’s distributed incremental ID algorithm. Repository: https://github.com/twitter-archive/snowflake
Time‑ordered generation.
64‑bit integer (max 19‑digit string).
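No ID collisions across the cluster, provided every node is assigned a distinct combination of datacenter ID and worker ID; generation is purely local and therefore fast.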
Structure
Bit layout (sign bit = 0):
41-bit timestamp in milliseconds (0 to 2^41 - 1), about 69 years counted from the generator's chosen start epoch rather than from 1970.
5‑bit datacenter ID and 5‑bit worker ID (total 10 bits, up to 1024 nodes).
12‑bit sequence number (0‑4095) for IDs generated within the same millisecond.
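To make the bit layout concrete, the sketch below packs the four fields into a long and extracts them again. The class name SnowflakeLayout and its method names are illustrative; the shift widths follow the 41/5/5/12 layout described above.

```java
class SnowflakeLayout {
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 5;
    static final int DATACENTER_BITS = 5;

    static final int WORKER_SHIFT = SEQUENCE_BITS;                       // 12
    static final int DATACENTER_SHIFT = SEQUENCE_BITS + WORKER_BITS;     // 17
    static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS; // 22

    /** Packs the fields; callers must keep each value within its bit width. */
    static long compose(long millisSinceEpoch, long datacenterId, long workerId, long sequence) {
        return (millisSinceEpoch << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }

    static long timestampOf(long id)  { return id >>> TIMESTAMP_SHIFT; }
    static long datacenterOf(long id) { return (id >>> DATACENTER_SHIFT) & ((1L << DATACENTER_BITS) - 1); }
    static long workerOf(long id)     { return (id >>> WORKER_SHIFT) & ((1L << WORKER_BITS) - 1); }
    static long sequenceOf(long id)   { return id & ((1L << SEQUENCE_BITS) - 1); }

    public static void main(String[] args) {
        long id = compose(1000L, 3L, 7L, 42L);
        System.out.println("timestamp=" + timestampOf(id)
                + " datacenter=" + datacenterOf(id)
                + " worker=" + workerOf(id)
                + " sequence=" + sequenceOf(id));
    }
}
```

Since the timestamp occupies the highest bits, comparing two IDs as plain numbers compares their timestamps first, which is where the time ordering comes from.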
Reference implementation (Java)
public class SnowflakeIdWorker {
    // Custom start epoch in milliseconds; timestamps are stored relative to it.
    private final long twepoch = 1598598185157L;

    // Field widths in bits.
    private final long workerIdBits = 5L;
    private final long datacenterIdBits = 5L;
    private final long sequenceBits = 12L;

    // Largest value each ID field can hold (31 for a 5-bit field).
    private final long maxWorkerId = -1L ^ (-1L << workerIdBits);
    private final long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);

    // Left shifts that place each field at its position in the 64-bit ID.
    private final long workerIdShift = sequenceBits;
    private final long datacenterIdShift = sequenceBits + workerIdBits;
    private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;

    // Mask that wraps the sequence back to 0 after 4095.
    private final long sequenceMask = -1L ^ (-1L << sequenceBits);

    private long workerId;
    private long datacenterId;
    private long sequence = 0L;
    private long lastTimestamp = -1L;

    public SnowflakeIdWorker(long workerId, long datacenterId) {
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId));
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }

    public synchronized long nextId() {
        long timestamp = timeGen();
        // A clock that moved backwards could repeat a timestamp, so refuse to generate.
        if (timestamp < lastTimestamp) {
            throw new RuntimeException(String.format("Clock moved backwards. Refusing to generate id for %d milliseconds", lastTimestamp - timestamp));
        }
        if (lastTimestamp == timestamp) {
            // Same millisecond: advance the sequence.
            sequence = (sequence + 1) & sequenceMask;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            // New millisecond: restart the sequence.
            sequence = 0L;
        }
        lastTimestamp = timestamp;
        long id = ((timestamp - twepoch) << timestampLeftShift)
                | (datacenterId << datacenterIdShift)
                | (workerId << workerIdShift)
                | sequence;
        return id;
    }

    // Busy-waits until the clock advances past lastTimestamp.
    protected long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    protected long timeGen() {
        return System.currentTimeMillis();
    }

    public static void main(String[] args) {
        SnowflakeIdWorker idWorker = new SnowflakeIdWorker(0, 0);
        for (int i = 0; i < 1000; i++) {
            long id = idWorker.nextId();
            System.out.println(id);
        }
    }
}
Advantages
Timestamp in high bits makes the ID trend‑increasing.
Low‑order sequence provides uniqueness within the same millisecond.
No reliance on databases or third‑party services; high stability and generation performance (~260,000 IDs per second).
Flexible bit allocation allows adaptation to business needs.
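The trade-off behind flexible bit allocation can be worked out with simple arithmetic. The sketch below computes, for a given split, how many years the timestamp covers, how many nodes fit, and the theoretical ceiling of IDs per node per second. The class SnowflakeCapacity and its methods are illustrative, and the ceiling is an upper bound implied by the layout, not a measured throughput.

```java
class SnowflakeCapacity {
    /** Number of distinct values representable in the given number of bits. */
    static long maxValues(int bits) {
        return 1L << bits;
    }

    /** Years of millisecond timestamps that fit in the given number of bits. */
    static double yearsCovered(int timestampBits) {
        double millisPerYear = 365.25 * 24 * 60 * 60 * 1000;
        return maxValues(timestampBits) / millisPerYear;
    }

    /** Upper bound on IDs one node can issue per second (sequence values x 1000 ms). */
    static long idsPerSecondPerNode(int sequenceBits) {
        return maxValues(sequenceBits) * 1000L;
    }

    static void describe(int timestampBits, int nodeBits, int sequenceBits) {
        // The three fields plus the sign bit must fill exactly 64 bits.
        if (timestampBits + nodeBits + sequenceBits != 63) {
            throw new IllegalArgumentException("fields must total 63 bits");
        }
        System.out.printf("%d/%d/%d -> %.1f years, %d nodes, %d ids/s per node%n",
                timestampBits, nodeBits, sequenceBits,
                yearsCovered(timestampBits), maxValues(nodeBits),
                idsPerSecondPerNode(sequenceBits));
    }

    public static void main(String[] args) {
        describe(41, 10, 12); // the layout described in this article
        describe(39, 14, 10); // hypothetical split: more nodes, shorter lifetime
    }
}
```

With the standard layout this works out to roughly 69.7 years, 1024 nodes and at most 4,096,000 IDs per node per second; moving bits between fields trades lifetime, cluster size and burst capacity against each other.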
Disadvantages
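Depends on machine clocks; if a clock moves backwards, a generator that keeps running can issue duplicate IDs, while the reference implementation above instead refuses to generate IDs until the clock catches up.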
In a distributed environment clocks are not perfectly synchronized, so global monotonicity is not guaranteed, though trend‑increasing satisfies most scenarios.
Mitigation
Clock problems can be mitigated with open-source generators built on Snowflake, such as Baidu's UidGenerator or Meituan-Dianping's Leaf, which add their own handling for clock rollback and worker ID assignment.
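One simple mitigation, independent of those projects, is to wait out small rollbacks and fail only on large ones. The sketch below illustrates that idea and is not how Leaf or UidGenerator are implemented. The class name RollbackTolerantWorker, the 5 ms tolerance and the injected clock are all assumptions made for illustration.

```java
import java.util.function.LongSupplier;

class RollbackTolerantWorker {
    private static final long EPOCH = 1598598185157L; // same start epoch as the reference code
    private static final long MAX_BACKWARD_MS = 5L;   // hypothetical tolerance, tune per deployment
    private static final long SEQUENCE_MASK = 4095L;  // 12-bit sequence

    private final long nodeBits;      // datacenter and worker IDs already shifted into place
    private final LongSupplier clock; // injected so the behavior can be tested
    private long sequence = 0L;
    private long lastTimestamp = -1L;

    RollbackTolerantWorker(long workerId, long datacenterId, LongSupplier clock) {
        if (workerId < 0 || workerId > 31 || datacenterId < 0 || datacenterId > 31) {
            throw new IllegalArgumentException("workerId and datacenterId must be in 0..31");
        }
        this.nodeBits = (datacenterId << 17) | (workerId << 12);
        this.clock = clock;
    }

    synchronized long nextId() {
        long now = clock.getAsLong();
        if (now < lastTimestamp) {
            long drift = lastTimestamp - now;
            if (drift > MAX_BACKWARD_MS) {
                throw new IllegalStateException("Clock moved backwards by " + drift + " ms");
            }
            // Small rollback: busy-wait until the clock reaches the last timestamp again.
            while (now < lastTimestamp) {
                now = clock.getAsLong();
            }
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted: wait for the next millisecond.
                while (now <= lastTimestamp) {
                    now = clock.getAsLong();
                }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << 22) | nodeBits | sequence;
    }
}
```

In production the clock argument would typically be System::currentTimeMillis. Waiting trades a short pause for continued uniqueness, which is acceptable only when rollbacks are rare and small; larger jumps still have to surface as errors.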
Architect's Guide
Dedicated to sharing programmer-architect skills—Java backend, system, microservice, and distributed architectures—to help you become a senior architect.
