Choosing the Right Distributed ID Generation Strategy: UUID, Snowflake, Redis, and More
This article compares various distributed unique identifier generation methods—including UUID, database auto‑increment, Redis INCR, Zookeeper, segmented DB caches, and the Snowflake algorithm—detailing their principles, advantages, drawbacks, and suitable use‑cases to help developers select the optimal solution for their systems.
1. Background
Many business scenarios need a unique identifier for each record, such as a user ID, an order number, or a message ID. This article surveys the common generation strategies and discusses the pros and cons of each.
2. UUID
UUID (Universally Unique Identifier) is defined by the OSF and can be generated using elements such as MAC address, timestamp, namespace, random or pseudo‑random numbers.
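In Java, the java.util.UUID class recognizes four UUID versions, although it only provides factory methods for two of them: randomUUID() for the random version and nameUUIDFromBytes() for the MD5 name-based version.
random (version 4) – based on cryptographically strong pseudo‑random numbers.
time‑based (version 1) – combines the current time, a clock sequence and the MAC address.
DCE security (version 2) – a time‑based variant defined for DCE security.
name‑based (version 3) – computes the MD5 hash of a namespace and name.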
Advantages: generated locally without network I/O, fast; unordered and unpredictable.
Disadvantages: 128‑bit value expands to a 36‑character string, consuming more storage; cannot produce ordered numbers.
Suitable when space overhead is acceptable and ordered IDs are not required, e.g., Log4j’s UuidPatternConverter.
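As a minimal sketch of the two strategies that java.util.UUID can generate directly, the class below wraps randomUUID() and nameUUIDFromBytes(). The class name UuidExamples and the sample name "order-42" are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

class UuidExamples {
    /** Version 4: random, so every call returns a different value. */
    static UUID randomId() {
        return UUID.randomUUID();
    }

    /** Version 3: MD5-based, so the same input bytes always yield the same UUID. */
    static UUID nameBasedId(String name) {
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        UUID random = randomId();
        UUID named = nameBasedId("order-42"); // illustrative input only
        System.out.println(random + " version=" + random.version());
        System.out.println(named + " version=" + named.version());
        // The canonical text form is 32 hex digits plus 4 hyphens.
        System.out.println("text length=" + random.toString().length());
    }
}
```

Storing the two long halves (getMostSignificantBits() and getLeastSignificantBits()) instead of the text form is one way to reduce the storage overhead mentioned above.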
3. Database Auto‑Increment Primary Key
Most straightforward method: use an auto‑increment column as the primary key.
Advantages: simple, ordered, easy for sorting and pagination.
Disadvantages: sharding complications, limited concurrency, IDs can be guessed, and database downtime makes the service unavailable.
Best for low‑volume, low‑concurrency scenarios where ordered IDs are needed.
4. Redis
Redis provides atomic INCR and INCRBY commands.
Advantages: higher performance than a database and supports ordered increments.
Disadvantages: being an in‑memory KV store, data loss is possible even with AOF/RDB, leading to duplicate IDs; reliance on Redis stability.
Suitable when performance is critical and occasional ID duplication is tolerable, or when IDs are reset daily.
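The daily-reset case can be sketched as follows: use one counter key per day and prepend the date to a zero-padded sequence. The Counter interface is a hypothetical stand-in for a Redis client's INCR call, and InMemoryCounter only imitates it for demonstration. The key prefix "order:id:" and the six-digit width are assumptions as well.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class DailyIdGenerator {
    /** Hypothetical abstraction over Redis INCR: atomically increments the key and returns the new value. */
    interface Counter {
        long incr(String key);
    }

    /** In-memory stand-in for demonstration only; a real deployment would call Redis here. */
    static class InMemoryCounter implements Counter {
        private final ConcurrentHashMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

        @Override
        public long incr(String key) {
            return counters.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
        }
    }

    private static final DateTimeFormatter DAY = DateTimeFormatter.BASIC_ISO_DATE; // yyyyMMdd
    private final Counter counter;

    DailyIdGenerator(Counter counter) {
        this.counter = counter;
    }

    /** Builds an ID such as 20240101000001: the day followed by a zero-padded per-day sequence. */
    String nextId(LocalDate day) {
        String prefix = day.format(DAY);
        // A new key per day means the sequence restarts at 1 every day.
        long seq = counter.incr("order:id:" + prefix);
        return prefix + String.format("%06d", seq);
    }
}
```

With a real Redis backend, the per-day key could also be given an expiry so that old counters do not accumulate. IDs grow beyond six sequence digits once a day exceeds 999,999 requests.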
5. Zookeeper
IDs can be derived from a ZNode’s version number, but this approach depends heavily on a Zookeeper cluster and has limited performance, so it is generally not recommended.
6. Database Segment + Service Cache (Leaf)
Meituan’s Leaf system improves auto‑increment by allocating ID blocks to proxy servers, reducing database load.
Advantages: higher performance than plain auto‑increment, maintains trend‑increasing IDs, and can survive brief DB outages.
Disadvantages: IDs remain guessable; prolonged DB failure still causes unavailability.
Applicable when trend‑increasing IDs of a controllable numeric length are required.
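The segment idea can be sketched as follows. This is not Leaf's actual code: SegmentSource is a hypothetical stand-in for the database row that tracks the highest allocated ID, and InMemorySource only imitates it. Each generator reserves a block of step IDs at a time and touches the source again only when its block is exhausted.

```java
import java.util.concurrent.atomic.AtomicLong;

class SegmentIdGenerator {
    /** Hypothetical stand-in for the database row that stores the highest ID reserved so far. */
    interface SegmentSource {
        /** Atomically reserves `step` IDs and returns the first ID of the reserved block. */
        long reserve(int step);
    }

    /** In-memory stand-in for demonstration; a real source would update a database row transactionally. */
    static class InMemorySource implements SegmentSource {
        private final AtomicLong maxId = new AtomicLong(0);

        @Override
        public long reserve(int step) {
            return maxId.getAndAdd(step) + 1;
        }
    }

    private final SegmentSource source;
    private final int step;
    private long next;          // next ID to hand out
    private long endExclusive;  // first ID beyond the current block

    SegmentIdGenerator(SegmentSource source, int step) {
        this.source = source;
        this.step = step;
    }

    synchronized long nextId() {
        if (next >= endExclusive) {
            // Current block is used up (or none was fetched yet): reserve a new one.
            next = source.reserve(step);
            endExclusive = next + step;
        }
        return next++;
    }
}
```

Two generators sharing one source hand out disjoint blocks, which is why the IDs are trend-increasing overall rather than strictly increasing. A cached block also keeps a generator working through a short outage of the source.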
7. Snowflake Algorithm
Twitter’s Snowflake generates 64‑bit integers composed of:
1 sign bit (unused)
41 bits for timestamp (covers ~69 years)
10 bits for machine ID (5 bits data center, 5 bits machine)
12 bits for sequence within the same millisecond (values 0 to 4095, so up to 4096 IDs per millisecond per machine)
The bit allocation can be adjusted to match business needs, e.g., different numbers of data centers or machines.
Suitable for high‑throughput scenarios that need roughly time‑ordered but non‑consecutive IDs that are harder to guess than an auto‑increment sequence, such as order IDs.
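To make the bit layout concrete, the sketch below packs the four fields into a long and extracts them again, using the default 41/5/5/12 split described above. The class and method names are illustrative and do not come from Twitter's implementation.

```java
class SnowflakeLayout {
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 5;
    static final int DATACENTER_BITS = 5;
    static final int TIMESTAMP_BITS = 41;

    // Largest value each field can hold.
    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;     // 4095
    static final long MAX_WORKER = (1L << WORKER_BITS) - 1;         // 31
    static final long MAX_DATACENTER = (1L << DATACENTER_BITS) - 1; // 31

    // Bit offsets of each field, counted from the least significant bit.
    static final int WORKER_SHIFT = SEQUENCE_BITS;
    static final int DATACENTER_SHIFT = SEQUENCE_BITS + WORKER_BITS;
    static final int TIMESTAMP_SHIFT = SEQUENCE_BITS + WORKER_BITS + DATACENTER_BITS;

    /** Packs the fields; timestampOffsetMs is milliseconds since the chosen epoch. */
    static long compose(long timestampOffsetMs, long datacenterId, long workerId, long sequence) {
        return (timestampOffsetMs << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }

    static long timestampOffsetMs(long id) { return id >>> TIMESTAMP_SHIFT; }
    static long datacenterId(long id) { return (id >>> DATACENTER_SHIFT) & MAX_DATACENTER; }
    static long workerId(long id) { return (id >>> WORKER_SHIFT) & MAX_WORKER; }
    static long sequence(long id) { return id & MAX_SEQUENCE; }

    /** Years covered by a 41-bit millisecond counter (a little under 70). */
    static double yearsCovered() {
        return (1L << TIMESTAMP_BITS) / (365.25 * 24 * 60 * 60 * 1000);
    }
}
```

Because the timestamp occupies the highest bits, IDs from one machine sort by creation time. Changing the bit constants trades the number of machines against the per-millisecond throughput.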
7.1 Simple Snowflake Implementation
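    public class IdWorker {
        /** Custom epoch (2018-09-29). The 41-bit timestamp lasts about 69 years from here, until roughly 2088. */
        private final long twepoch = 1538211907857L;
        private final long workerIdBits = 5L;
        private final long datacenterIdBits = 5L;
        private final long sequenceBits = 12L;
        // Largest values that fit in the allotted bits (31 for 5 bits, 4095 for 12 bits).
        private final long maxWorkerId = -1L ^ (-1L << workerIdBits);
        private final long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);
        private final long sequenceMask = -1L ^ (-1L << sequenceBits);
        // Bit offsets of each field inside the 64-bit ID.
        private final long workerIdShift = sequenceBits;
        private final long datacenterIdShift = sequenceBits + workerIdBits;
        private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;

        private final long workerId;
        private final long datacenterId;
        private long sequence = 0L;
        private long lastTimestamp = -1L;

        public IdWorker(long workerId, long datacenterId) {
            // Out-of-range IDs would overflow into neighbouring fields and cause collisions.
            if (workerId < 0 || workerId > maxWorkerId) {
                throw new IllegalArgumentException(String.format("workerId must be between 0 and %d", maxWorkerId));
            }
            if (datacenterId < 0 || datacenterId > maxDatacenterId) {
                throw new IllegalArgumentException(String.format("datacenterId must be between 0 and %d", maxDatacenterId));
            }
            this.workerId = workerId;
            this.datacenterId = datacenterId;
        }

        public synchronized long nextId() {
            long timestamp = timeGen();
            if (timestamp < lastTimestamp) {
                throw new RuntimeException(String.format(
                        "Clock moved backwards. Refusing to generate id for %d milliseconds",
                        lastTimestamp - timestamp));
            }
            if (lastTimestamp == timestamp) {
                // Same millisecond: advance the sequence, wrapping at 4096.
                sequence = (sequence + 1) & sequenceMask;
                if (sequence == 0) {
                    // Sequence exhausted for this millisecond: wait for the next one.
                    timestamp = tilNextMillis(lastTimestamp);
                }
            } else {
                sequence = 0;
            }
            lastTimestamp = timestamp;
            return ((timestamp - twepoch) << timestampLeftShift)
                    | (datacenterId << datacenterIdShift)
                    | (workerId << workerIdShift)
                    | sequence;
        }

        /** Busy-waits until the clock advances past lastTimestamp. */
        private long tilNextMillis(long lastTimestamp) {
            long timestamp = timeGen();
            while (timestamp <= lastTimestamp) {
                timestamp = timeGen();
            }
            return timestamp;
        }

        private long timeGen() {
            return System.currentTimeMillis();
        }

        public static void main(String[] args) {
            IdWorker worker = new IdWorker(1, 1);
            for (int i = 0; i < 30; i++) {
                System.out.println(worker.nextId());
            }
        }
    }
7.2 Handling Clock Backward Movement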
If the clock moves backwards briefly (e.g., <5 ms), the algorithm can wait until the clock catches up. For longer regressions, either reject the request with an exception or use reserved extension bits to tolerate a limited number of backward jumps.
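The first option, waiting out a short regression and rejecting a long one, might look like the sketch below. ClockGuard is a hypothetical helper and not part of the implementation above. The clock is passed in as a LongSupplier so the behaviour can be exercised without changing the system time, and the tolerance is a parameter rather than a fixed 5 ms.

```java
import java.util.function.LongSupplier;

class ClockGuard {
    private final LongSupplier clock;   // e.g. System::currentTimeMillis
    private final long maxBackwardMs;   // largest regression that is waited out

    ClockGuard(LongSupplier clock, long maxBackwardMs) {
        this.clock = clock;
        this.maxBackwardMs = maxBackwardMs;
    }

    /** Returns a timestamp that is not earlier than lastTimestamp, or throws if the clock is too far behind. */
    long currentTimestamp(long lastTimestamp) {
        long now = clock.getAsLong();
        if (now >= lastTimestamp) {
            return now;
        }
        long drift = lastTimestamp - now;
        if (drift > maxBackwardMs) {
            throw new IllegalStateException("Clock moved backwards by " + drift + " ms");
        }
        // Small regression: spin until the clock catches up with the last issued timestamp.
        while (now < lastTimestamp) {
            Thread.yield();
            now = clock.getAsLong();
        }
        return now;
    }
}
```

A generator could call currentTimestamp(lastTimestamp) in place of reading the clock directly at the start of nextId(). The wait blocks the calling thread, so the tolerance should stay small.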
Conclusion
The article analyzes the principles and appropriate scenarios for various distributed ID generation algorithms, helping readers choose the most suitable strategy for their projects.
Java Backend Technology
