Distributed ID Generation: Principles, Requirements, and Common Solutions
The article explains why traditional auto‑increment primary keys are unsuitable for distributed systems, outlines the key requirements for a distributed identifier, and reviews several practical generation schemes—including UUID, database auto‑increment, segment mode, Redis, Snowflake, Baidu UidGenerator, Meituan Leaf, and Didi TinyID—along with their advantages, drawbacks, and sample implementations.
Why Distributed ID Is Needed
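In a monolithic application, an auto-increment primary key works well. Once the data is sharded across several databases or tables, however, each shard maintains its own counter, so the same value can be generated in different shards. The resulting duplicate IDs break business logic that assumes an ID identifies exactly one record.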
Business Requirements for Distributed IDs
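Global uniqueness: each ID must be unique across all nodes.
Trend-increasing: IDs should grow over time overall, even if they are not strictly ordered. Roughly ordered keys keep inserts near the end of an InnoDB clustered index, which improves write performance.
Monotonically increasing: each new ID should be larger than the previous one, which is useful for versioning and sorting.
Security: IDs should not be strictly sequential, so that attackers cannot easily enumerate them. This pulls against the monotonic requirement, so a scheme usually has to favor one or the other.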
Distributed ID Generation Schemes
1. UUID
A UUID (Universally Unique Identifier) is a 128-bit value, usually written as 36 characters: 32 hexadecimal digits and 4 hyphens, grouped 8-4-4-4-12. Generation is very fast because the value is created locally without network calls. The drawbacks are its size and its lack of order: as a primary key it takes more storage than an integer, and out-of-order values hurt index efficiency.
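As a small illustration of the format described above, the following sketch uses Python's standard uuid module to generate a random (version 4) UUID and inspect its size. The variable names are only for illustration.

```python
import uuid

# Generate a random (version 4) UUID locally; no network call is involved.
uid = uuid.uuid4()

# Canonical text form: 32 hex digits plus 4 hyphens, grouped 8-4-4-4-12.
text = str(uid)
group_lengths = [len(part) for part in text.split("-")]

print(text)
print(len(text))           # number of characters in the text form
print(group_lengths)       # lengths of the hyphen-separated groups
print(len(uid.bytes) * 8)  # size of the raw value in bits
```

Consecutive values produced this way have no ordering relationship, which is the property that hurts clustered-index inserts.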
2. Database Auto‑Increment
MySQL auto-increment can be used across multiple machines by configuring auto_increment_increment and auto_increment_offset. All servers share the same step, equal to the number of servers, and each server gets a distinct start value (e.g., with step = 2, server 1 starts at 1 and produces 1, 3, 5, ..., while server 2 starts at 2 and produces 2, 4, 6, ...). This method is simple, but adding servers requires re-configuring the step and offsets on every node, and every ID still costs a database write.
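To make the interleaving concrete, here is a minimal sketch that only simulates the arithmetic of the two settings in plain Python; it does not talk to MySQL. The helper name auto_increment_values is hypothetical.

```python
def auto_increment_values(offset, increment, count):
    """Return the first `count` values a server would hand out, assuming
    auto_increment_offset=offset and auto_increment_increment=increment."""
    return [offset + i * increment for i in range(count)]

# Two servers sharing step 2, with start values 1 and 2.
server1 = auto_increment_values(offset=1, increment=2, count=5)
server2 = auto_increment_values(offset=2, increment=2, count=5)

print(server1)  # [1, 3, 5, 7, 9]
print(server2)  # [2, 4, 6, 8, 10]
```

The two sequences never overlap as long as every server uses the same step and a different offset, which is why adding a third server forces a change to the step on the existing ones.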
3. Segment Mode
IDs are allocated in blocks (segments) from a central table and cached in memory. When a segment is exhausted, the service fetches the next range.
CREATE TABLE id_generator (
  id int(10) NOT NULL,
  max_id bigint(20) NOT NULL COMMENT 'current max id',
  step int(20) NOT NULL COMMENT 'segment step length',
  biz_type int(20) NOT NULL COMMENT 'business type',
  version int(20) NOT NULL COMMENT 'optimistic lock version',
  PRIMARY KEY (`id`)
);
When a segment is used up, the service claims the next one by advancing max_id. The version check acts as an optimistic lock: if another instance has updated the row in the meantime, the statement affects no rows and the service must re-read the row and retry.
UPDATE id_generator SET max_id = #{max_id+step}, version = version + 1
WHERE version = #{version} AND biz_type = XXX;
Advantages: the database is hit once per segment rather than once per ID, and mature implementations exist (e.g., Meituan Leaf, Didi TinyID). Drawbacks: still depends on the database.
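The refill step can be sketched as follows. This is a minimal illustration, not any particular product's implementation: it uses an in-memory SQLite database in place of MySQL, a simplified version of the id_generator table, and a hypothetical SegmentIdGenerator class.

```python
import sqlite3
import threading

class SegmentIdGenerator:
    """Hands out IDs from a cached segment and refills it from the table."""

    def __init__(self, conn, biz_type):
        self.conn = conn
        self.biz_type = biz_type
        self.next_id = 0  # next ID to hand out
        self.end_id = 0   # exclusive upper bound of the cached segment
        self.lock = threading.Lock()

    def _fetch_segment(self):
        # Retry until the optimistic-lock UPDATE affects exactly one row.
        while True:
            max_id, step, version = self.conn.execute(
                "SELECT max_id, step, version FROM id_generator "
                "WHERE biz_type = ?",
                (self.biz_type,),
            ).fetchone()
            cur = self.conn.execute(
                "UPDATE id_generator SET max_id = ?, version = version + 1 "
                "WHERE version = ? AND biz_type = ?",
                (max_id + step, version, self.biz_type),
            )
            self.conn.commit()
            if cur.rowcount == 1:
                # This instance now owns max_id + 1 .. max_id + step.
                self.next_id = max_id + 1
                self.end_id = max_id + step + 1
                return

    def get_id(self):
        with self.lock:
            if self.next_id >= self.end_id:
                self._fetch_segment()
            value = self.next_id
            self.next_id += 1
            return value

# Demo setup (assumed values): one business type with a step of 3.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE id_generator (id INTEGER PRIMARY KEY, "
    "max_id INTEGER NOT NULL, step INTEGER NOT NULL, "
    "biz_type INTEGER NOT NULL, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO id_generator VALUES (1, 0, 3, 101, 0)")
conn.commit()

gen = SegmentIdGenerator(conn, biz_type=101)
ids = [gen.get_id() for _ in range(7)]
print(ids)  # [1, 2, 3, 4, 5, 6, 7]
```

Seven IDs cost three database round trips here. A larger step lowers database load further, at the price of a larger gap in the sequence if the service restarts with an unused segment.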
4. Redis Implementation
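Redis provides the atomic INCR and INCRBY commands. Because Redis executes commands on a single thread, concurrent callers incrementing the same key never receive the same value, and values are issued in increasing order. Redis can be clustered for availability and throughput, but the approach adds a Redis dependency, and the counter must be persisted reliably: if it is lost or rolled back after a restart, previously issued IDs can be generated again.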
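One common way to use INCRBY is to reserve a batch of IDs per round trip instead of one ID per call. The sketch below shows only the arithmetic. It assumes a client object with an incrby(name, amount) method, as redis-py provides, and substitutes a hypothetical in-process FakeRedis so the example runs without a server. With a real Redis, the atomicity comes from the server, not from this stand-in.

```python
class FakeRedis:
    """In-process stand-in for a Redis client (an assumption of this sketch).
    It mimics only INCRBY: add `amount` to a counter, return the new value."""

    def __init__(self):
        self._data = {}

    def incrby(self, name, amount=1):
        self._data[name] = self._data.get(name, 0) + amount
        return self._data[name]

def reserve_ids(client, key, count):
    """Reserve `count` consecutive IDs with a single INCRBY call."""
    last = client.incrby(key, count)
    return range(last - count + 1, last + 1)

client = FakeRedis()
first = list(reserve_ids(client, "id:order", 3))
second = list(reserve_ids(client, "id:order", 2))
print(first, second)  # [1, 2, 3] [4, 5]
```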
5. Snowflake Algorithm
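Twitter's Snowflake splits a 64-bit integer into a sign bit (1 bit, always 0), a millisecond timestamp (41 bits), a machine ID (10 bits), and a sequence number (12 bits). That allows roughly 69 years of timestamps from the chosen epoch, 1,024 machines, and 4,096 IDs per millisecond per machine. It yields trend-increasing IDs without external services, but it relies heavily on a correct system clock: if the clock is rolled back, a node can reissue timestamps it has already used and produce duplicates.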
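The 1-41-10-12 layout can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Twitter's code: the class and function names are hypothetical, the epoch constant is the one commonly cited for Twitter's implementation (any fixed past instant would do), and clock rollback is handled simply by raising an error.

```python
import threading
import time

EPOCH_MS = 1288834974657  # custom epoch in milliseconds (assumed for the sketch)
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1  # 4095

class Snowflake:
    def __init__(self, worker_id):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id must fit in 10 bits")
        self.worker_id = worker_id
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    @staticmethod
    def _now_ms():
        return time.time_ns() // 1_000_000

    def next_id(self):
        with self.lock:
            now = self._now_ms()
            if now < self.last_ms:
                # Clock rollback: refuse rather than risk duplicates.
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & MAX_SEQUENCE
                if self.sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next.
                    while now <= self.last_ms:
                        now = self._now_ms()
            else:
                self.sequence = 0
            self.last_ms = now
            return (((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                    | (self.worker_id << SEQUENCE_BITS)
                    | self.sequence)

def decode(snowflake_id):
    """Split an ID back into (timestamp_ms, worker_id, sequence)."""
    sequence = snowflake_id & MAX_SEQUENCE
    worker_id = (snowflake_id >> SEQUENCE_BITS) & ((1 << WORKER_BITS) - 1)
    timestamp_ms = (snowflake_id >> (WORKER_BITS + SEQUENCE_BITS)) + EPOCH_MS
    return timestamp_ms, worker_id, sequence

gen = Snowflake(worker_id=5)
ids = [gen.next_id() for _ in range(10000)]
print(ids[0], decode(ids[0]))
```

Because the timestamp occupies the high bits, IDs from one worker sort by creation time. IDs from different workers are only trend-increasing, since their clocks are not perfectly aligned.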
6. Baidu UidGenerator
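Baidu's UidGenerator is based on Snowflake, with several changes: the bit widths of the fields (including the worker ID) are configurable, it borrows future timestamps so that the sequence limit does not stall generation, and it pre-generates IDs into a RingBuffer cache. It is reported to reach 6 million QPS per node.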
7. Meituan Leaf
Leaf offers two modes: segment (Leaf-segment) and Snowflake (Leaf-snowflake). The segment mode reduces database pressure by fetching IDs in blocks; the Snowflake mode follows the classic 1-41-10-12 bit layout and can obtain worker IDs automatically via ZooKeeper. The segment mode keeps its allocation state in a table with the following structure:
+-------------+--------------+------+-----+-------------------+-----------------------------+
| Field       | Type         | Null | Key | Default           | Extra                       |
+-------------+--------------+------+-----+-------------------+-----------------------------+
| biz_tag     | varchar(128) | NO   | PRI |                   |                             |
| max_id      | bigint(20)   | NO   |     | 1                 |                             |
| step        | int(11)      | NO   |     | NULL              |                             |
| desc        | varchar(256) | YES  |     | NULL              |                             |
| update_time | timestamp    | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+-------------+--------------+------+-----+-------------------+-----------------------------+
8. Didi TinyID
TinyID is a Java‑based distributed ID system built on the segment algorithm, supporting multiple databases and a client library. It offers easy integration but still depends on database availability.
Each solution has its own trade‑offs; the choice depends on factors such as performance requirements, operational complexity, and infrastructure constraints.
Top Architect