How to Choose the Right Distributed Unique ID Strategy for Your System
This article explains why globally unique identifiers are essential in distributed systems, outlines the key characteristics of a good ID scheme, and compares several generation methods—including UUID, database auto‑increment, segmented DB ranges, Redis INCR, Zookeeper, Meituan Leaf, Snowflake, and Baidu uid‑generator—highlighting their advantages, drawbacks, and practical implementation details.
What Is a Distributed Unique ID?
In distributed systems a globally unique identifier is required for tasks such as order numbers, coupon codes, or log tracing; duplicate IDs can cause critical bugs. While a single‑machine application can rely on a simple atomic counter, a multi‑machine environment must coordinate ID generation to guarantee uniqueness.
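To make the single-machine case concrete, here is a minimal sketch of an in-process counter. The class name `LocalIdCounter` is hypothetical and used only for illustration. It shows why the approach stops working across machines: two independent instances both start at 1.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical single-JVM ID source: unique only within this one process.
final class LocalIdCounter {
    private final AtomicLong last = new AtomicLong(0);

    // Returns 1, 2, 3, ... atomically, even with concurrent callers.
    long nextId() {
        return last.incrementAndGet();
    }
}
```

If two servers each run their own `LocalIdCounter`, both hand out 1, 2, 3, and so on, which is exactly the duplication a distributed scheme has to prevent.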
Key Characteristics of a Good Global ID
Global uniqueness : No duplicates across machines or data centers.
Monotonic or roughly ordered : Helps with sorting, range queries and pagination.
High performance : Low latency generation.
High availability : The service must survive node failures.
Ease of use : Simple integration for callers.
Security considerations : Prevent predictable IDs that could be guessed.
Generation Schemes
UUID Direct Generation
UUID (Universally Unique Identifier), also called GUID (Globally Unique Identifier), is a 128‑bit value usually written as 32 hexadecimal digits (36 characters including the four hyphens of the canonical form). It is generated locally without network calls.
String uuid = UUID.randomUUID().toString();
Pros: Fast, no network, virtually no collisions.
Cons: Long string, consumes more storage, not naturally ordered, lacks business meaning.
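The storage cost depends on how the UUID is encoded. The sketch below, with a hypothetical helper class `UuidForms`, shows the same value as a 36-character canonical string, a 32-character compact string, and 16 raw bytes. It uses only `java.util.UUID` and `java.nio.ByteBuffer` from the JDK.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical helper showing three encodings of the same 128-bit UUID.
final class UuidForms {
    // Canonical text form: 32 hex digits plus 4 hyphens = 36 characters.
    static String canonical(UUID id) {
        return id.toString();
    }

    // Compact text form: hyphens removed, 32 characters.
    static String compact(UUID id) {
        return id.toString().replace("-", "");
    }

    // Binary form: the two 64-bit halves written back to back, 16 bytes.
    static byte[] binary(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }
}
```

Storing the 16-byte form rather than the string roughly halves the size, but a random UUID is still not ordered by creation time.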
Database Auto‑Increment
Using the auto‑increment primary key of a relational table provides a simple, ordered numeric ID.
CREATE DATABASE `test`;
USE `test`;
CREATE TABLE id_table (
id BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT,
value CHAR(10) NOT NULL DEFAULT '',
PRIMARY KEY (id)
) ENGINE=MyISAM;
INSERT INTO id_table(value) VALUES('v1');
SELECT LAST_INSERT_ID(); -- the ID generated by this connection's last insert
Pros: Simple, fast, naturally ordered, easy to index.
Cons: Single‑node bottleneck, not highly available; scaling requires sharding or manual offset/step configuration.
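The offset/step idea can be illustrated without a database. In MySQL it is configured through the `auto_increment_offset` and `auto_increment_increment` server variables. The class below is a hypothetical in-memory stand-in that only reproduces the arithmetic: node k of N starts at k and advances by N, so the nodes never overlap.

```java
// Hypothetical model of offset/step allocation: values are offset, offset+step, offset+2*step, ...
final class SteppedSequence {
    private final long step;
    private long next;

    SteppedSequence(long offset, long step) {
        if (step < 1 || offset < 1 || offset > step) {
            throw new IllegalArgumentException("need 1 <= offset <= step");
        }
        this.step = step;
        this.next = offset;
    }

    // Returns the current value and advances by the step.
    synchronized long nextId() {
        long id = next;
        next += step;
        return id;
    }
}
```

With two nodes and a step of 2, offset 1 yields the odd IDs and offset 2 the even ones. Adding a third node later means changing the step on every node, which is the manual reconfiguration cost mentioned above.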
Segmented (Batch) ID from DB
Instead of requesting an ID for every insert, a service fetches a block of IDs (e.g., 1000) and serves them from memory, reducing DB load.
Fetch 1000 IDs at a time; each machine advances by a step of 3000:
V1: 1-1000, 3001-4000, ...
V2: 1001-2000, 4001-5000, ...
V3: 2001-3000, 5001-6000, ...
Reduces database round‑trips.
Uses an optimistic lock to guarantee that only one allocator updates a given range.
Requires careful handling of step changes and machine additions.
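A minimal sketch of a block allocator follows. Everything in it is an assumption made for illustration: the class `SegmentIdGenerator`, the `reserveBlock` callback, and the in-memory store that stands in for the database row. Unlike the fixed-step layout above, this variant reserves the next free block on demand, so a real implementation would replace `inMemoryStore` with an atomic or optimistically locked update of the stored maximum ID.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongUnaryOperator;

// Hypothetical allocator: serves IDs from a reserved block and refills when it runs out.
final class SegmentIdGenerator {
    private final LongUnaryOperator reserveBlock; // block size -> first ID of the new block
    private final long blockSize;
    private long current = 0; // next ID to hand out
    private long limit = 0;   // exclusive end of the current block

    SegmentIdGenerator(LongUnaryOperator reserveBlock, long blockSize) {
        this.reserveBlock = reserveBlock;
        this.blockSize = blockSize;
    }

    synchronized long nextId() {
        if (current >= limit) {
            // One round-trip to the shared store per blockSize IDs.
            current = reserveBlock.applyAsLong(blockSize);
            limit = current + blockSize;
        }
        return current++;
    }

    // Stand-in for the database: atomically returns the old maximum and advances it.
    static LongUnaryOperator inMemoryStore(long firstId) {
        AtomicLong nextFree = new AtomicLong(firstId);
        return size -> nextFree.getAndAdd(size);
    }
}
```

Two generators sharing one store receive disjoint blocks. If a process restarts, the unused remainder of its block is simply skipped, so IDs stay unique but are not gap-free.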
Redis INCR
Redis provides an atomic INCR command that increments a key stored in memory, offering very low latency.
127.0.0.1:6379> set id 1
OK
127.0.0.1:6379> incr id
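(integer) 2
Pros: In‑memory speed, natural ordering.
Cons: Persistence trade‑offs (after a crash, restoring from an RDB snapshot can bring back an older counter value and reissue IDs; AOF narrows that window at some cost in write performance); changing the step is hard when several Redis instances share the ID space.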
Zookeeper
Zookeeper can generate sequential numbers from znode versions or sequential znodes, but every ID costs a coordinated write, so throughput is lower, and a distributed lock is often required as well. This makes it less attractive for high‑throughput ID generation.
Meituan Leaf
Leaf is an open‑source ID service from Meituan. It offers two versions:
V1 (pre‑allocation) : Allocates a range of IDs in advance; suffers from high update latency and potential unavailability during master failover.
V2 (Leaf‑Snowflake) : Implements a Snowflake‑like algorithm with weak dependency on Zookeeper; caches a worker ID locally to survive Zookeeper outages.
Snowflake (Twitter)
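Snowflake generates 64‑bit long IDs composed of a timestamp, datacenter ID, worker ID, and a per‑millisecond sequence. IDs from a single worker increase monotonically; across workers they are only roughly time‑ordered. Throughput is high because each ID is generated locally.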
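public class SnowFlake {
    private long datacenterId;
    private long workerId;
    private long sequence;
    // Timestamp of the most recently issued ID; -1 means none issued yet.
    private long lastTimestamp = -1L;
    // ... fields for bit lengths, masks, epoch, etc.

    public synchronized long nextId() {
        long timestamp = timeGen();
        // Refuse to issue IDs if the clock went backwards: they could repeat.
        if (timestamp < lastTimestamp) {
            throw new RuntimeException("Clock moved backwards");
        }
        if (lastTimestamp == timestamp) {
            // Same millisecond: advance the sequence, wrapping at the mask.
            sequence = (sequence + 1) & sequenceMask;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            // New millisecond: restart the sequence.
            sequence = 0L;
        }
        lastTimestamp = timestamp;
        // Pack the fields into one 64-bit value, timestamp in the high bits.
        return ((timestamp - twepoch) << timestampLeftShift)
                | (datacenterId << datacenterIdShift)
                | (workerId << workerIdShift)
                | sequence;
    }
    // helper methods omitted for brevity
}
Pros: High performance, ordered IDs, no external coordination.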
Cons: Requires careful handling of clock rollback; machine count limited by bit allocation.
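The limit imposed by bit allocation is easier to see with concrete numbers. The sketch below assumes the commonly cited layout of 41 timestamp bits, 5 datacenter bits, 5 worker bits, and 12 sequence bits; the article itself does not state the widths, and the class `SnowflakeLayout` is hypothetical. It packs the fields into a long and extracts them again.

```java
// Hypothetical helper for an assumed 41/5/5/12 Snowflake layout.
final class SnowflakeLayout {
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 5;
    static final int DATACENTER_BITS = 5;
    static final int WORKER_SHIFT = SEQUENCE_BITS;                         // 12
    static final int DATACENTER_SHIFT = WORKER_SHIFT + WORKER_BITS;        // 17
    static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS; // 22

    // Largest value that fits in the given number of bits.
    static long maxValue(int bits) {
        return (1L << bits) - 1;
    }

    // Packs the four fields; callers must keep each field within its bit width.
    static long compose(long millisSinceEpoch, long datacenterId, long workerId, long sequence) {
        return (millisSinceEpoch << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }

    static long millisOf(long id)     { return id >>> TIMESTAMP_SHIFT; }
    static long datacenterOf(long id) { return (id >>> DATACENTER_SHIFT) & maxValue(DATACENTER_BITS); }
    static long workerOf(long id)     { return (id >>> WORKER_SHIFT) & maxValue(WORKER_BITS); }
    static long sequenceOf(long id)   { return id & maxValue(SEQUENCE_BITS); }
}
```

Under these assumed widths, one worker can issue 4096 IDs per millisecond, the cluster can hold 32 x 32 = 1024 workers, and 41 bits of milliseconds last roughly 69 years from the chosen epoch. Giving more bits to one field takes them from another.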
Baidu uid‑generator
Baidu’s open‑source uid‑generator is a Java implementation based on Snowflake with customizable bit allocations, ring‑buffer caching, and optimizations for containerized environments. The project reports up to about 6 million QPS on a single node.
https://github.com/baidu/uid-generator
Choosing the Right Approach
The optimal ID solution depends on business requirements and system scale. Centralized methods (e.g., DB segment, Redis, Zookeeper) provide ordered IDs but add coordination complexity. Decentralized methods (UUID, Snowflake, Leaf‑Snowflake) are simple and fast but may produce longer or less predictable IDs. Evaluate trade‑offs such as uniqueness guarantees, ordering needs, latency, availability, and operational overhead before selecting a strategy.
ITPUB
