Distributed ID Concepts, Implementation Schemes, and Open‑Source Solutions
This article explains the need for globally unique identifiers in distributed systems, compares common ID generation schemes such as UUID, auto‑increment, Redis counters, and Snowflake, provides a Java implementation of the Snowflake algorithm, and reviews open‑source components like Meituan Leaf and Baidu UidGenerator.
1. Distributed ID Concept
In the human world an ID uniquely identifies a person, and in complex distributed systems a similar globally unique identifier is required for massive data and messages. Traditional auto‑increment primary keys work for monolithic databases, but after sharding a globally unique ID is needed, which must also satisfy high concurrency, high availability, and high performance.
2. Distributed ID Implementation Schemes
The table below compares several common solutions:
| Scheme | Description | Advantages | Disadvantages |
| --- | --- | --- | --- |
| UUID | Universally Unique Identifier that provides uniqueness without a central coordinator. | 1) Generated locally, so there is no pressure on a central node and primary-key generation is fast; 2) Globally unique; 3) Easy data merging across servers. | 1) Occupies 16 bytes in binary form (36 characters as text), a high space cost; 2) Not sequential, causing random I/O and lower index efficiency. |
| Database Auto-Increment | MySQL auto-increment primary key. | 1) Small INT/BIGINT footprint; 2) Sequential I/O; 3) Numeric queries are faster than string queries. | 1) Limited concurrency, bound by database performance; 2) Sharding requires redesign; 3) Sequential values may expose data volume. |
| Redis Auto-Increment | Atomic counter in Redis. | Runs in memory, excellent concurrency. | 1) Possible data loss, since the counter lives in memory; 2) Sequential values may expose data volume. |
| Snowflake Algorithm | Classic Snowflake algorithm for distributed IDs. | 1) No external dependencies; 2) High performance. | Clock rollback issues. |
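To make the UUID storage cost concrete: a UUID is 128 bits, which is 16 bytes in binary form and 36 characters in its canonical text form. The following is a minimal sketch using the standard `java.util.UUID` class; the class name `UuidSizeDemo` and the helper `toBytes` are illustrative names, not part of the original article.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

class UuidSizeDemo {
    /** Packs a UUID into its 16-byte binary form (two 64-bit halves). */
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID(); // random (version 4) UUID, no coordinator needed
        System.out.println(id + " : " + id.toString().length() + " characters as text");
        System.out.println(toBytes(id).length + " bytes in binary form");
    }
}
```

Storing the binary form instead of the text form cuts the space cost, but the values are still random, so the index-locality drawback remains.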
Two approaches currently dominate distributed ID generation:
Segment Mode – relies on a database but differs from simple auto‑increment: the service reserves a whole segment (e.g., 100 IDs) in one database round trip and hands out IDs from that segment until it is exhausted, which greatly improves performance.
Snowflake Algorithm – a 64‑bit ID composed of a 1‑bit sign, a 41‑bit timestamp, a 10‑bit worker ID (data‑center ID plus machine ID), and a 12‑bit sequence number.
The sign bit is always 0, so the ID is a positive number. The timestamp is the number of milliseconds elapsed since a chosen start time. The 10‑bit worker ID is usually split into 5 bits for the region (data center) and 5 bits for the server. The sequence number is a counter that increments for IDs generated within the same millisecond.
Snowflake capacity: the 41‑bit timestamp covers 2^41 / (365·24·60·60·1000) ≈ 69 years; the 10‑bit worker ID allows 2^10 = 1024 nodes; the 12‑bit sequence allows 2^12 = 4096 IDs per millisecond on each node.
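The layout and capacity figures can be checked with a few lines of bit arithmetic. This is a sketch under the assumptions stated above (1 + 41 + 10 + 12 bits); the class `SnowflakeLayout` and its method names are hypothetical and exist only for illustration.

```java
class SnowflakeLayout {
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 10;    // 5 data-center bits + 5 machine bits
    static final int TIMESTAMP_BITS = 41;

    static final int WORKER_SHIFT = SEQUENCE_BITS;
    static final int TIMESTAMP_SHIFT = SEQUENCE_BITS + WORKER_BITS;

    /** Packs the three fields into one long; the top (sign) bit stays 0. */
    static long compose(long elapsedMillis, long workerId, long sequence) {
        return (elapsedMillis << TIMESTAMP_SHIFT) | (workerId << WORKER_SHIFT) | sequence;
    }

    static long timestampOf(long id) { return id >>> TIMESTAMP_SHIFT; }

    static long workerOf(long id) { return (id >>> WORKER_SHIFT) & ((1L << WORKER_BITS) - 1); }

    static long sequenceOf(long id) { return id & ((1L << SEQUENCE_BITS) - 1); }

    /** Years covered by a 41-bit millisecond counter (365-day years). */
    static double lifetimeYears() {
        return (1L << TIMESTAMP_BITS) / (365.0 * 24 * 60 * 60 * 1000);
    }

    public static void main(String[] args) {
        // Worker 67 corresponds to data center 2, machine 3: (2 << 5) | 3.
        long id = compose(1000L, 67L, 5L);
        System.out.println("id        = " + id);
        System.out.println("timestamp = " + timestampOf(id));
        System.out.println("worker    = " + workerOf(id));
        System.out.println("sequence  = " + sequenceOf(id));
        System.out.println("years     = " + lifetimeYears());
    }
}
```

Because the timestamp occupies the highest bits, IDs from a single node sort by generation time, which is what makes Snowflake IDs friendlier to indexes than random UUIDs.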
The algorithm can be implemented as a simple Java utility, allowing each business service to obtain IDs directly as long as it has a unique machine ID.
public class SnowFlake {
    /** start timestamp */
    private static final long START_STMP = 1480166465631L;

    /** bits allocated to each part */
    private static final long SEQUENCE_BIT = 12;   // sequence bits
    private static final long MACHINE_BIT = 5;     // machine bits
    private static final long DATACENTER_BIT = 5;  // data-center bits

    /** max values */
    private static final long MAX_DATACENTER_NUM = -1L ^ (-1L << DATACENTER_BIT);
    private static final long MAX_MACHINE_NUM = -1L ^ (-1L << MACHINE_BIT);
    private static final long MAX_SEQUENCE = -1L ^ (-1L << SEQUENCE_BIT);

    /** left shift values */
    private static final long MACHINE_LEFT = SEQUENCE_BIT;
    private static final long DATACENTER_LEFT = SEQUENCE_BIT + MACHINE_BIT;
    private static final long TIMESTMP_LEFT = DATACENTER_LEFT + DATACENTER_BIT;

    private long datacenterId;  // data-center
    private long machineId;     // machine
    private long sequence = 0L;
    private long lastStmp = -1L;

    public SnowFlake(long datacenterId, long machineId) {
        if (datacenterId > MAX_DATACENTER_NUM || datacenterId < 0) {
            throw new IllegalArgumentException("datacenterId can't be greater than MAX_DATACENTER_NUM or less than 0");
        }
        if (machineId > MAX_MACHINE_NUM || machineId < 0) {
            throw new IllegalArgumentException("machineId can't be greater than MAX_MACHINE_NUM or less than 0");
        }
        this.datacenterId = datacenterId;
        this.machineId = machineId;
    }

    /** generate next ID */
    public synchronized long nextId() {
        long currStmp = getNewstmp();
        if (currStmp < lastStmp) {
            throw new RuntimeException("Clock moved backwards. Refusing to generate id");
        }
        if (currStmp == lastStmp) {
            // same millisecond, increment sequence
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0L) {
                // sequence overflow, wait for next millisecond
                currStmp = getNextMill();
            }
        } else {
            // different millisecond, reset sequence
            sequence = 0L;
        }
        lastStmp = currStmp;
        return (currStmp - START_STMP) << TIMESTMP_LEFT  // timestamp
                | datacenterId << DATACENTER_LEFT        // data-center
                | machineId << MACHINE_LEFT              // machine
                | sequence;                              // sequence
    }

    private long getNextMill() {
        long mill = getNewstmp();
        while (mill <= lastStmp) {
            mill = getNewstmp();
        }
        return mill;
    }

    private long getNewstmp() {
        return System.currentTimeMillis();
    }

    public static void main(String[] args) {
        SnowFlake snowFlake = new SnowFlake(2, 3);
        for (int i = 0; i < (1 << 12); i++) {
            System.out.println(snowFlake.nextId());
        }
    }
}
3. Open‑Source Distributed ID Components
3.1 How to Choose an Open‑Source Component
Select a component by first confirming that its features meet your requirements, focusing on compatibility and extensibility.
Second, consider your current technical capabilities and whether your team’s stack can integrate the component smoothly.
Third, evaluate the community: update frequency, maintenance status, availability of support, and industry adoption.
3.2 Meituan Leaf
Leaf is a distributed ID service launched by Meituan’s R&D platform, named after Leibniz’s quote “There are no two identical leaves in the world.” It offers high reliability, low latency, and global uniqueness, and is used across Meituan’s finance, food delivery, and travel divisions. The project is open‑source on GitHub.
Globally unique, with IDs that trend upward overall.
Highly available; tolerates MySQL being unavailable for a period of time.
High concurrency with QPS > 50,000 and 99th‑percentile latency < 1 ms on a 4C8G VM.
Simple integration via RPC or HTTP.
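The segment idea that Leaf builds on (reserve a range of IDs in one round trip, then serve from memory) can be sketched in a few lines. This is not Leaf's code: the class `SegmentIdGenerator` is hypothetical, and an `AtomicLong` stands in for the database row that stores the highest allocated ID. A real deployment would replace `fetchSegment` with an atomic update against the database.

```java
import java.util.concurrent.atomic.AtomicLong;

class SegmentIdGenerator {
    private final AtomicLong store; // stand-in for the shared database counter (assumption)
    private final int step;         // number of IDs reserved per round trip
    private long next = 0;          // next ID to hand out
    private long limit = 0;         // exclusive upper bound of the current segment
    private int roundTrips = 0;     // how often the store was contacted

    SegmentIdGenerator(AtomicLong store, int step) {
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive");
        }
        this.store = store;
        this.step = step;
    }

    /** Reserves the range (newMax - step, newMax] in a single atomic operation. */
    private void fetchSegment() {
        long newMax = store.addAndGet(step);
        next = newMax - step + 1;
        limit = newMax + 1;
        roundTrips++;
    }

    synchronized long nextId() {
        if (next >= limit) {
            fetchSegment(); // current segment exhausted (or never loaded)
        }
        return next++;
    }

    synchronized int storeRoundTrips() {
        return roundTrips;
    }

    public static void main(String[] args) {
        AtomicLong sharedStore = new AtomicLong(0);
        SegmentIdGenerator nodeA = new SegmentIdGenerator(sharedStore, 100);
        SegmentIdGenerator nodeB = new SegmentIdGenerator(sharedStore, 100);
        System.out.println("node A: " + nodeA.nextId() + ", " + nodeA.nextId());
        System.out.println("node B: " + nodeB.nextId() + ", " + nodeB.nextId());
    }
}
```

Two generators sharing one store receive disjoint ranges, so IDs are unique across nodes but only trend upward overall rather than being strictly ordered. A larger step means fewer round trips, at the cost of larger gaps when a node restarts with an unused segment.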
3.3 Baidu UidGenerator
UidGenerator is Baidu’s open‑source high‑performance ID generator based on the Snowflake algorithm. It supports customizable worker‑ID bits and initialization strategies, uses future timestamps to avoid sequence bottlenecks, employs a RingBuffer with cache‑line padding to eliminate false sharing, and can reach 6 million QPS on a single machine. The source code is available on GitHub.
3.4 Open‑Source Component Comparison
UidGenerator is Java‑based, was last updated about two years before this article was written, is minimally maintained, and supports only Snowflake.
Leaf is also Java‑based, last maintained in 2020, and supports both segment mode and Snowflake.
Overall, based on theory and feature comparison, Meituan Leaf is the preferable choice.
Do you know other common distributed ID solutions?
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.