Mastering Distributed ID Generation: From UUID to Snowflake and Beyond
This article explores various distributed ID generation strategies—including UUID, database auto‑increment, segment mode, Redis INCR, Snowflake, Meituan Leaf, Baidu UidGenerator, and Didi TinyID—detailing their principles, advantages, drawbacks, and Java code examples to help developers choose the right solution for high‑scale systems.
Article Directory
Background
1. UUID
2. Database Auto‑Increment ID
3. Segment (Range) Mode
4. Redis INCR
5. Snowflake Algorithm
6. Meituan (Leaf)
7. Baidu (UidGenerator)
8. Didi (TinyID)
Comparison
Background
In complex distributed systems, large volumes of data need globally unique identifiers. For example, once orders are sharded across databases, a per-table auto-increment primary key can no longer guarantee uniqueness across shards. Beyond uniqueness, ID schemes are usually judged on three further properties:
Trend increasing: IDs generally grow over time, even if not strictly. Roughly ordered primary keys are appended near the end of a B-tree index, which improves write performance.
Monotonic increasing: Guarantees that each new ID is larger than the previous one, which is useful for sorting.
Information security: Sequential IDs expose data volume and are easy to enumerate; random or irregular IDs make this harder.
Various distributed ID solutions have emerged to meet these scenarios; this article introduces several, discussing their pros, cons, use cases, and code samples.
1. UUID
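A UUID (Universally Unique Identifier) is a 128-bit value written as 32 hexadecimal digits in five hyphen-separated groups (8-4-4-4-12), 36 characters in total. Version 1 UUIDs are built from the current time, a clock sequence, and a node identifier (often a MAC address), while version 4 UUIDs are random. Both give practical global uniqueness without any coordination, but with storage and index-performance trade-offs.
The JDK provides a UUID generator; UUID.randomUUID() returns a version 4 (random) UUID: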
import java.util.UUID;

public class Test {
    public static void main(String[] args) {
        System.out.println(UUID.randomUUID());
    }
}

Sample output:
b0378f6a-eeb7-4779-bffe-2a9f3bc76380
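Despite its uniqueness, UUID is rarely used as a database primary key, for the following reasons:
High storage cost: The value is 16 bytes (128 bits) but is usually stored as a 36-character string.
Security concerns: Version 1 (MAC-address-based) generation can expose hardware information; the random version 4 returned by UUID.randomUUID() does not.
MySQL primary key issues: Long, unordered keys cause scattered inserts and page splits in the InnoDB clustered index, which degrades performance.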
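To make the storage point concrete, the sketch below packs a UUID into the 16 raw bytes it actually represents, which can be compared with its 36-character text form. The class name UuidBytes and its two methods are illustrative names, not part of the JDK; only java.util.UUID and java.nio.ByteBuffer are standard.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Illustrative helper (hypothetical name): converts a UUID to and from its 16-byte form.
final class UuidBytes {
    private UuidBytes() {}

    // Pack the two 64-bit halves of the UUID into a 16-byte array.
    static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    // Rebuild the UUID from a 16-byte array produced by toBytes.
    static UUID fromBytes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        long mostSignificant = buffer.getLong();
        long leastSignificant = buffer.getLong();
        return new UUID(mostSignificant, leastSignificant);
    }
}
```

A binary column of 16 bytes could then hold the value instead of a 36-character string. This only addresses storage size: random version 4 values remain unordered, so the index locality problem is unchanged.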
2. Database Auto‑Increment ID
MySQL auto‑increment IDs work within a single table but cannot guarantee global uniqueness across sharded tables. Two common mitigation approaches are presented.
2.1 Primary Key Table
Create a dedicated table to generate unique IDs:
CREATE TABLE `unique_id` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `biz` char(1) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `biz` (`biz`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
Retrieve an ID with a REPLACE statement and LAST_INSERT_ID:
BEGIN;
REPLACE INTO unique_id (biz) VALUES ('o');
SELECT LAST_INSERT_ID();
COMMIT;
2.2 ID Auto‑Increment Step
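Give every MySQL instance the same auto-increment step, equal to the number of instances, and a different starting offset. With two instances, both set auto_increment_increment = 2; instance 1 sets auto_increment_offset = 1 and produces 1, 3, 5, ..., while instance 2 sets auto_increment_offset = 2 and produces 2, 4, 6, .... View the settings:
show variables like '%increment%';
Each ID still costs a database write, and adding instances later means changing the step on every node, so this approach scales poorly under high concurrency.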
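The arithmetic behind the step and offset settings can be modelled in a few lines. This is only a sketch of the sequence each instance would produce, not a MySQL client; the class SteppedSequence is a hypothetical name, and the constructor arguments stand in for auto_increment_offset and auto_increment_increment.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model (hypothetical class): the IDs one instance hands out
// for a given starting offset and a shared step.
final class SteppedSequence {
    private final long increment;
    private long next;

    SteppedSequence(long offset, long increment) {
        if (offset < 1 || increment < 1 || offset > increment) {
            throw new IllegalArgumentException("require 1 <= offset <= increment");
        }
        this.next = offset;
        this.increment = increment;
    }

    // Next ID of this instance: offset, offset + increment, offset + 2 * increment, ...
    synchronized long nextId() {
        long id = next;
        next += increment;
        return id;
    }

    // Convenience for demonstrations: take the first n IDs.
    List<Long> take(int n) {
        List<Long> ids = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            ids.add(nextId());
        }
        return ids;
    }
}
```

With a step of 2, new SteppedSequence(1, 2) and new SteppedSequence(2, 2) can never return the same value, because each ID keeps the remainder of its own offset modulo the step.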
3. Segment (Range) Mode
Segment mode allocates a range of IDs from the database to a service’s memory. The service consumes IDs locally until the range is exhausted, then requests a new range using an optimistic‑lock version field.
CREATE TABLE id_generator (
  id int NOT NULL,
  max_id bigint NOT NULL COMMENT 'current max id',
  step int NOT NULL COMMENT 'segment length',
  biz_type int NOT NULL COMMENT 'business type',
  version int NOT NULL COMMENT 'optimistic lock version',
  PRIMARY KEY (`id`)
);
This approach reduces database load, because the database is touched only once per segment. It can leave gaps in the ID sequence after a service restart, since the unused remainder of the in-memory segment is discarded.
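The allocation loop can be sketched as follows. To stay self-contained, the database row is replaced by an in-memory stand-in: SegmentRow and SegmentIdGenerator are hypothetical names, and the advance method only mimics what a conditional UPDATE on the version column would do. A real service would issue that UPDATE over JDBC and check the affected row count.

```java
// In-memory stand-in for one id_generator row (hypothetical; not a real DAO).
final class SegmentRow {
    private long maxId;
    private final int step;
    private int version;

    SegmentRow(long maxId, int step) {
        this.maxId = maxId;
        this.step = step;
    }

    // Mimics: SELECT max_id, step, version FROM id_generator WHERE biz_type = ?
    synchronized long[] read() {
        return new long[] {maxId, step, version};
    }

    // Mimics: UPDATE id_generator SET max_id = max_id + step, version = version + 1
    //         WHERE biz_type = ? AND version = ?
    // Returns true when the row was updated, false when the version had changed.
    synchronized boolean advance(long expectedVersion) {
        if (version != expectedVersion) {
            return false;
        }
        maxId += step;
        version++;
        return true;
    }
}

// Hands out IDs from a locally cached segment and refills it when exhausted.
final class SegmentIdGenerator {
    private final SegmentRow row;
    private long next = 1;   // next ID to return
    private long limit = 0;  // last ID of the current segment (inclusive)

    SegmentIdGenerator(SegmentRow row) {
        this.row = row;
    }

    synchronized long nextId() {
        if (next > limit) {
            loadSegment();
        }
        return next++;
    }

    private void loadSegment() {
        while (true) {
            long[] snapshot = row.read();
            long oldMax = snapshot[0];
            long step = snapshot[1];
            // Optimistic lock: retry if another node claimed a segment first.
            if (row.advance(snapshot[2])) {
                next = oldMax + 1;
                limit = oldMax + step;
                return;
            }
        }
    }
}
```

With a step of 3 and two generators sharing one row, the first generator receives 1 to 3 and the second 4 to 6. If the second generator restarts after returning only 4, the values 5 and 6 are never issued, which is the gap described above.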
4. Redis INCR
Redis’s atomic INCR command can serve as a global counter. A Java example:
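import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Component;

@Component
public class RedisDistributedId {
    @Autowired
    private StringRedisTemplate redisTemplate;

    // Custom epoch in seconds (2022-08-01T00:00:00Z).
    private static final long BEGIN_TIMESTAMP = 1659312000L;

    public long nextId(String item) {
        // Use UTC explicitly so the result does not depend on the server time zone.
        LocalDateTime now = LocalDateTime.now(ZoneOffset.UTC);
        long timestamp = now.toEpochSecond(ZoneOffset.UTC) - BEGIN_TIMESTAMP;
        // One counter key per business item per day keeps each counter small.
        String date = now.format(DateTimeFormatter.ofPattern("yyyy:MM:dd"));
        Long increment = redisTemplate.opsForValue().increment("id:" + item + ":" + date);
        if (increment == null) {
            // increment() returns null inside a pipeline or transaction.
            throw new IllegalStateException("Redis INCR returned no value");
        }
        // High 32 bits: seconds since the custom epoch; low 32 bits: daily counter.
        return (timestamp << 32) | increment;
    }
}
Because the counter lives in Redis, persistence and failover must be planned: a counter that is lost or rolled back after a crash can hand out the same IDs again.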
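The ID returned by nextId is a timestamp in the high 32 bits and the INCR result in the low 32 bits. The sketch below isolates that layout so it can be checked without a Redis server. RedisIdLayout is a hypothetical helper, and the explicit range check is an added assumption: the original code does not guard against a counter that outgrows 32 bits and spills into the timestamp.

```java
// Illustrative helper (hypothetical name) for the 32/32-bit layout used by nextId.
final class RedisIdLayout {
    private static final int COUNTER_BITS = 32;
    private static final long COUNTER_MASK = (1L << COUNTER_BITS) - 1;

    private RedisIdLayout() {}

    // Combine seconds since the custom epoch and the INCR result into one long.
    static long compose(long secondsSinceEpoch, long counter) {
        if (counter < 0 || counter > COUNTER_MASK) {
            throw new IllegalArgumentException("counter does not fit in 32 bits");
        }
        return (secondsSinceEpoch << COUNTER_BITS) | counter;
    }

    // Recover the timestamp part (seconds since the custom epoch).
    static long secondsOf(long id) {
        return id >>> COUNTER_BITS;
    }

    // Recover the per-day counter part.
    static long counterOf(long id) {
        return id & COUNTER_MASK;
    }
}
```

Decoding is mainly useful for debugging, for example to see when an ID was issued and how many IDs a business item consumed that day.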
5. Snowflake Algorithm
Twitter’s Snowflake packs an ID into a 64-bit integer: 1 unused sign bit, a 41-bit millisecond timestamp, a 5-bit datacenter ID, a 5-bit worker ID, and a 12-bit sequence, allowing up to 4096 IDs per millisecond per worker.
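public class SnowflakeIdWorker {
    // Custom epoch in milliseconds; timestamps are stored relative to this value.
    private final long twepoch = 1604374294980L;
    private final long workerIdBits = 5L;
    private final long datacenterIdBits = 5L;
    // Largest value that fits in the given number of bits (31 for 5 bits).
    private final long maxWorkerId = -1L ^ (-1L << workerIdBits);
    private final long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);
    private final long sequenceBits = 12L;
    private final long workerIdShift = sequenceBits;
    private final long datacenterIdShift = sequenceBits + workerIdBits;
    private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;
    private final long sequenceMask = -1L ^ (-1L << sequenceBits);

    // One shared generator per process. Worker 0 and datacenter 0 are placeholders;
    // every process needs its own pair in a real deployment.
    private static final SnowflakeIdWorker INSTANCE = new SnowflakeIdWorker(0L, 0L);

    private final long workerId;
    private final long datacenterId;
    private long sequence = 0L;
    private long lastTimestamp = -1L;

    public SnowflakeIdWorker(long workerId, long datacenterId) {
        if (workerId < 0 || workerId > maxWorkerId) {
            throw new IllegalArgumentException("workerId must be between 0 and " + maxWorkerId);
        }
        if (datacenterId < 0 || datacenterId > maxDatacenterId) {
            throw new IllegalArgumentException("datacenterId must be between 0 and " + maxDatacenterId);
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }

    public synchronized long nextId() {
        long timestamp = timeGen();
        if (timestamp < lastTimestamp) {
            throw new RuntimeException(String.format(
                    "Clock moved backwards. Refusing to generate id for %d ms", lastTimestamp - timestamp));
        }
        if (lastTimestamp == timestamp) {
            // Same millisecond: advance the sequence, and wait for the next
            // millisecond once all 4096 values are used.
            sequence = (sequence + 1) & sequenceMask;
            if (sequence == 0) {
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = timestamp;
        return ((timestamp - twepoch) << timestampLeftShift)
                | (datacenterId << datacenterIdShift)
                | (workerId << workerIdShift)
                | sequence;
    }

    // Busy-wait until the clock reaches the next millisecond.
    protected long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    protected long timeGen() {
        return System.currentTimeMillis();
    }

    // Reuses the shared instance; a new worker per call would restart the sequence
    // at 0 and could return duplicate IDs within the same millisecond.
    public static String getSnowId() {
        return String.valueOf(INSTANCE.nextId());
    }
}
If the system clock moves backwards, previously used timestamps could repeat and produce duplicate IDs. This implementation simply throws an exception in that case, so production deployments usually add further safeguards.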
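One common safeguard is to tolerate a small backwards jump by waiting for the clock to catch up, and to fail only on a large one. The sketch below shows that idea in isolation. ClockGuard, its maxBackwardMillis parameter, and the injected clock are assumptions made for illustration; they are not part of the SnowflakeIdWorker class above or of any particular library.

```java
import java.util.function.LongSupplier;

// Illustrative guard (hypothetical name): returns timestamps that never go backwards.
final class ClockGuard {
    private final LongSupplier clock;        // e.g. System::currentTimeMillis
    private final long maxBackwardMillis;    // largest rollback that is waited out
    private long lastTimestamp = -1L;

    ClockGuard(LongSupplier clock, long maxBackwardMillis) {
        this.clock = clock;
        this.maxBackwardMillis = maxBackwardMillis;
    }

    synchronized long nextTimestamp() {
        long now = clock.getAsLong();
        if (now < lastTimestamp) {
            long drift = lastTimestamp - now;
            if (drift > maxBackwardMillis) {
                throw new IllegalStateException("Clock moved backwards by " + drift + " ms");
            }
            sleepQuietly(drift);             // small rollback: wait for the clock to catch up
            now = clock.getAsLong();
            if (now < lastTimestamp) {
                throw new IllegalStateException("Clock still behind after waiting " + drift + " ms");
            }
        }
        lastTimestamp = now;
        return now;
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the clock", e);
        }
    }
}
```

A generator could obtain its timestamp from new ClockGuard(System::currentTimeMillis, 5).nextTimestamp() instead of reading the clock directly. The 5 ms threshold is an arbitrary example value; the right tolerance depends on how the hosts synchronize their clocks.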
6. Meituan (Leaf)
Leaf is an open‑source project from Meituan that supports both segment mode and Snowflake mode (the latter relies on ZooKeeper to assign worker IDs).
7. Baidu (UidGenerator)
Baidu’s UidGenerator is a Java implementation based on Snowflake. It reports a throughput above 6,000,000 QPS and uses MySQL to allocate worker IDs. Its default layout shrinks the timestamp to 28 bits counted in seconds, which limits the usable period to about 8.5 years unless the bit allocation is reconfigured.
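The 8.5-year figure follows directly from the field width, assuming the 28-bit timestamp counts seconds (which is what that figure implies). The sketch below computes the lifespan for any width and tick size, so the same calculation can be repeated for the 41-bit millisecond timestamp of classic Snowflake. TimestampLifespan is a hypothetical helper, and a year is taken as 365 days.

```java
// Illustrative arithmetic helper (hypothetical name).
final class TimestampLifespan {
    private static final double SECONDS_PER_YEAR = 365.0 * 24 * 60 * 60;

    private TimestampLifespan() {}

    // Years until a counter of the given bit width overflows,
    // when it advances once every unitMillis milliseconds.
    static double years(int bits, long unitMillis) {
        double ticks = Math.pow(2, bits);
        double seconds = ticks * unitMillis / 1000.0;
        return seconds / SECONDS_PER_YEAR;
    }
}
```

TimestampLifespan.years(28, 1000) gives roughly 8.5 years, and TimestampLifespan.years(41, 1) gives roughly 69.7 years. Moving bits from the worker ID or sequence fields into the timestamp extends the period at the cost of fewer workers or fewer IDs per tick.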
8. Didi (TinyID)
TinyID is based on the segment algorithm of Meituan’s Leaf and supports segment mode only. It serves IDs over HTTP and through a client SDK, and adds multi-master database support.
Comparison
(Figure: visual comparison of the discussed solutions.)
Java High-Performance Architecture