
Distributed ID Generation: Requirements, Common Solutions, and Implementation Details

This article explains what a distributed ID is, outlines its essential requirements such as global uniqueness, high performance, high availability, and ease of use, and then reviews common generation strategies—including database auto‑increment, segment mode, NoSQL approaches, UUID, Snowflake, and several open‑source frameworks—providing code examples and practical trade‑offs.

Architect

Distributed ID Introduction

In daily development we need unique identifiers (IDs) for various data, such as user ID, product ID, and order ID. An ID is a unique marker for a piece of data.

What is a Distributed ID?

A distributed ID is an identifier generated in a distributed system where a single auto‑increment primary key cannot guarantee uniqueness across multiple nodes.

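When a system grows and its data must be sharded across several databases or tables, the auto-increment primary key of a single MySQL instance no longer suffices: each shard would count independently and produce colliding values, so every record needs an ID that is unique across all data nodes.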

Requirements for Distributed IDs

Global uniqueness: IDs must be unique across the entire system.

High performance: Generation speed should be fast with minimal resource consumption.

High availability: The service generating IDs should be near 100% available.

Ease of use: IDs should be ready-to-use and easy to integrate.

Additional desirable properties include security (IDs reveal no sensitive information), ordering (monotonically increasing), business meaning, and independent deployment of the ID service.

Common Distributed ID Solutions

Database

Auto‑increment Primary Key

The simplest method is to use the auto‑increment primary key of a relational database.

CREATE TABLE `sequence_id` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `stub` char(10) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  UNIQUE KEY `stub` (`stub`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

The stub column is a placeholder used only to create a unique index.

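Insert data using REPLACE INTO to obtain the generated ID. Because stub has a unique index, each REPLACE deletes the previous row and inserts a new one, so the table stays at a single row while the auto-increment counter keeps advancing: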

BEGIN;
REPLACE INTO sequence_id (stub) VALUES ('stub');
SELECT LAST_INSERT_ID();
COMMIT;

Advantages: simple, ordered IDs, low storage overhead.

Disadvantages: limited concurrency, a single point of failure, no business meaning, security concerns (sequential IDs are easy to guess and reveal volumes such as daily order counts), and a database round-trip for every ID.

Database Segment Mode

To reduce database round‑trips, a segment of IDs is fetched in bulk and kept in memory.

Typical table schema:

CREATE TABLE `sequence_id_generator` (
  `id` int NOT NULL,
  `current_max_id` bigint NOT NULL COMMENT 'current max id',
  `step` int NOT NULL COMMENT 'segment length',
  `version` int NOT NULL COMMENT 'optimistic lock version',
  `biz_type` int NOT NULL COMMENT 'business type',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

Workflow:

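Insert a row for each business type, for example with current_max_id = 0 and step = 100.

SELECT current_max_id, step, version for the business type.

UPDATE the row to set current_max_id = current_max_id + step and version = version + 1, with the version just read in the WHERE clause. If no row is affected, another node claimed the segment first, so SELECT again and retry.

Once the UPDATE succeeds, serve IDs from memory in the range current_max_id + 1 to current_max_id + step (using the values read by the SELECT). When the range is exhausted, repeat from the SELECT step.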
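The workflow above can be sketched in Java as follows. This is a minimal illustration, not a production implementation: InMemorySegmentStore is a hypothetical stand-in for the sequence_id_generator table (a real store would issue the SELECT and UPDATE over JDBC), and the sketch assumes a node only uses a segment after its version-checked update succeeds, treating the segment as current_max_id + 1 through current_max_id + step.

```java
import java.util.concurrent.atomic.AtomicReference;

/** The columns of one sequence_id_generator row that the workflow reads. */
final class SegmentRow {
    final long currentMaxId;
    final int step;
    final int version;

    SegmentRow(long currentMaxId, int step, int version) {
        this.currentMaxId = currentMaxId;
        this.step = step;
        this.version = version;
    }
}

/** Hypothetical in-memory stand-in for the table; a real store would run SQL over JDBC. */
final class InMemorySegmentStore {
    private final AtomicReference<SegmentRow> row;

    InMemorySegmentStore(long currentMaxId, int step) {
        this.row = new AtomicReference<>(new SegmentRow(currentMaxId, step, 0));
    }

    /** Mimics: SELECT current_max_id, step, version ... */
    SegmentRow select() {
        return row.get();
    }

    /**
     * Mimics: UPDATE ... SET current_max_id = current_max_id + step,
     * version = version + 1 WHERE version = ?
     * Returns false when another node changed the row first.
     */
    boolean advanceIfUnchanged(SegmentRow seen) {
        SegmentRow next = new SegmentRow(seen.currentMaxId + seen.step, seen.step, seen.version + 1);
        return row.compareAndSet(seen, next);
    }
}

/** One generator per application node; IDs are served from memory between refills. */
final class SegmentIdGenerator {
    private final InMemorySegmentStore store;
    private long next = 1; // next ID to hand out
    private long max = 0;  // last ID of the current segment (inclusive); 0 means "no segment yet"

    SegmentIdGenerator(InMemorySegmentStore store) {
        this.store = store;
    }

    synchronized long nextId() {
        if (next > max) {
            claimSegment();
        }
        return next++;
    }

    /** Optimistic-lock loop: retry until this node's update wins. */
    private void claimSegment() {
        while (true) {
            SegmentRow seen = store.select();
            if (store.advanceIfUnchanged(seen)) {
                next = seen.currentMaxId + 1;
                max = seen.currentMaxId + seen.step;
                return;
            }
        }
    }
}
```

With a step of 100, two generators sharing one store receive disjoint segments (1 to 100 and 101 to 200), and each touches the store only once per 100 IDs.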

Advantages: fewer DB accesses, lower pressure, can be combined with master‑slave replication for high availability.

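Disadvantages: the database remains a single point of failure unless it is replicated, IDs carry no business meaning, and sequential IDs raise the same security concerns as auto-increment keys.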

NoSQL

Redis is commonly used. The INCR command provides atomic, sequential ID generation.

127.0.0.1:6379> set sequence_id_biz_type 1
OK
127.0.0.1:6379> incr sequence_id_biz_type
(integer) 2
127.0.0.1:6379> get sequence_id_biz_type
"2"

For higher availability and concurrency, Redis Cluster or Codis can be employed. Persistence matters here, because a counter lost on restart would reissue old IDs; it can be handled via RDB snapshots, AOF logs, or the hybrid RDB+AOF mode introduced in Redis 4.0.

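MongoDB ObjectId is another option. It is a 12-byte value that older drivers compose from a 4-byte timestamp (seconds), a 3-byte machine ID, a 2-byte process ID, and a 3-byte incrementing counter; newer versions replace the machine and process IDs with a 5-byte random value generated once per process.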
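To make the 12-byte layout concrete, here is a rough Java sketch that assembles an ObjectId-like value by hand. It is not MongoDB's driver code: the class name ObjectIdSketch and its methods are hypothetical, and it assumes the newer layout of a 4-byte timestamp, a 5-byte per-process random value, and a 3-byte counter in place of the machine and process IDs described above.

```java
import java.nio.ByteBuffer;
import java.security.SecureRandom;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of an ObjectId-style 12-byte identifier. */
final class ObjectIdSketch {
    private static final SecureRandom RANDOM = new SecureRandom();
    // 5 random bytes chosen once per process (assumed layout; older drivers used machine ID + PID).
    private static final byte[] PROCESS_UNIQUE = new byte[5];
    // Counter starts at a random value and wraps within 3 bytes.
    private static final AtomicInteger COUNTER = new AtomicInteger(RANDOM.nextInt());

    static {
        RANDOM.nextBytes(PROCESS_UNIQUE);
    }

    /** Builds a 12-byte ID: 4-byte big-endian seconds, 5-byte process value, 3-byte counter. */
    static byte[] generate(int epochSeconds) {
        int count = COUNTER.getAndIncrement() & 0xFFFFFF;
        ByteBuffer buf = ByteBuffer.allocate(12);
        buf.putInt(epochSeconds);
        buf.put(PROCESS_UNIQUE);
        buf.put((byte) (count >>> 16));
        buf.put((byte) (count >>> 8));
        buf.put((byte) count);
        return buf.array();
    }

    /** Renders the ID as 24 lowercase hexadecimal characters. */
    static String toHex(byte[] id) {
        StringBuilder sb = new StringBuilder(id.length * 2);
        for (byte b : id) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    /** Recovers the creation time (seconds since the Unix epoch) from the first 4 bytes. */
    static int timestampOf(byte[] id) {
        return ByteBuffer.wrap(id, 0, 4).getInt();
    }
}
```

Because the timestamp occupies the leading bytes, IDs created in later seconds sort after earlier ones, which is what makes this style of ID roughly time-ordered.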

Algorithms

UUID

Universally Unique Identifier (UUID) is a 128-bit value, usually written as 32 hexadecimal digits in five hyphen-separated groups (8-4-4-4-12, 36 characters in total). RFC 4122 defines several versions with different generation rules (time-based, name-based, random, etc.). Java's UUID.randomUUID() returns a Version 4 (random) UUID.

import java.util.UUID;

// Example output: cb4a9ede-fa5e-4585-b9bb-d60bce986eaa
UUID uuid = UUID.randomUUID();
// The version is encoded in the UUID itself; randomUUID() always yields 4.
int version = uuid.version(); // 4

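Pros: fast, simple, and generated locally without coordination. Cons: large storage (128 bits, or 36 characters as text), not ordered (random values scatter inserts across a B+tree primary-key index), no business meaning, and Version 1 UUIDs can expose the host's MAC address (a privacy concern).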
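The differences between versions can be seen with the JDK's own java.util.UUID class. The sketch below is only an illustration: the wrapper class UuidDemo and its method names are made up for this example, while the UUID calls themselves are standard JDK methods.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

/** Hypothetical helper showing random (v4) and name-based (v3) UUIDs from the JDK. */
final class UuidDemo {
    /** Version 4: 122 random bits, different on every call. */
    static UUID random() {
        return UUID.randomUUID();
    }

    /** Version 3: derived from an MD5 hash of the name, so the same name always maps to the same UUID. */
    static UUID nameBased(String name) {
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8));
    }

    /** Drops the hyphens, leaving the 32 hexadecimal digits. */
    static String compact(UUID uuid) {
        return uuid.toString().replace("-", "");
    }
}
```

Storing the compact 32-character form, or the raw 16 bytes, reduces the storage cost somewhat, but it does not make the values ordered.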

Snowflake (Twitter)

Snowflake generates 64‑bit IDs composed of:

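1-bit sign (always 0, so IDs are positive)

41-bit timestamp (ms since a custom epoch, enough for about 69 years)

10-bit machine ID (commonly split into a 5-bit datacenter ID and a 5-bit worker ID, 1,024 nodes in total)

12-bit sequence number (per-ms counter, 4,096 IDs per millisecond per node)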
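The bit layout above can be turned into a compact generator. The following Java sketch is one possible reading of it, not Twitter's original code: the class name SnowflakeSketch is hypothetical, the 10 machine bits are treated as a single worker ID, the epoch constant is just an example of a custom epoch, and a backwards clock is handled in the simplest way, by refusing to generate.

```java
/** Hypothetical minimal Snowflake-style generator: 41-bit time, 10-bit worker, 12-bit sequence. */
final class SnowflakeSketch {
    static final long EPOCH = 1288834974657L; // example custom epoch in ms (assumption)
    static final int WORKER_BITS = 10;
    static final int SEQ_BITS = 12;
    static final long MAX_WORKER = (1L << WORKER_BITS) - 1; // 1023
    static final long SEQ_MASK = (1L << SEQ_BITS) - 1;      // 4095

    private final long workerId;
    private long lastMs = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER) {
            throw new IllegalArgumentException("workerId must be in [0, " + MAX_WORKER + "]");
        }
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastMs) {
            // Simplest rollback policy: fail rather than risk a duplicate.
            throw new IllegalStateException("clock moved backwards by " + (lastMs - now) + " ms");
        }
        if (now == lastMs) {
            sequence = (sequence + 1) & SEQ_MASK;
            if (sequence == 0) {
                // All 4096 values used in this millisecond: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastMs) {
                    // busy-wait
                }
            }
        } else {
            sequence = 0;
        }
        lastMs = now;
        return ((now - EPOCH) << (WORKER_BITS + SEQ_BITS)) | (workerId << SEQ_BITS) | sequence;
    }

    /** Decodes the wall-clock time (ms since the Unix epoch) embedded in an ID. */
    static long timestampOf(long id) {
        return (id >> (WORKER_BITS + SEQ_BITS)) + EPOCH;
    }

    /** Decodes the worker ID embedded in an ID. */
    static long workerOf(long id) {
        return (id >> SEQ_BITS) & MAX_WORKER;
    }

    /** Decodes the per-millisecond sequence embedded in an ID. */
    static long sequenceOf(long id) {
        return id & SEQ_MASK;
    }
}
```

Because the timestamp sits in the high bits, IDs from one generator are strictly increasing, and IDs from different workers are ordered by time down to the millisecond.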

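Pros: fast, roughly time-ordered, flexible (the bit layout can be adapted, for example to embed a business type). Cons: clock rollback can cause duplicates, and every node needs a unique machine ID assigned in advance, which complicates elastic deployment.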

Open‑source implementations include Meituan’s Leaf, Baidu’s UidGenerator, and Yitter’s IdGenerator.

Open‑Source Frameworks

UidGenerator (Baidu)

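Based on Snowflake but with a different bit layout: a 1-bit sign, a 28-bit delta-seconds field (seconds since the epoch, 2016-05-20 by default), a 22-bit worker ID, and a 13-bit sequence. 28 bits of seconds last about 8.5 years (2^28 seconds; the figure is often quoted as ~8.7 years).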
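The capacity implied by these bit widths is easy to check with a few lines of arithmetic. The Java sketch below only restates the numbers given above; the class UidGeneratorLayout is a hypothetical helper, not part of Baidu's library, and a year is taken as 365 days.

```java
/** Hypothetical helper that derives capacity figures from the UidGenerator bit widths quoted above. */
final class UidGeneratorLayout {
    static final int SIGN_BITS = 1;
    static final int TIME_BITS = 28;   // delta seconds since the epoch
    static final int WORKER_BITS = 22;
    static final int SEQ_BITS = 13;

    /** The four fields must fill a 64-bit long exactly. */
    static int totalBits() {
        return SIGN_BITS + TIME_BITS + WORKER_BITS + SEQ_BITS;
    }

    /** Number of distinct second values: 2^28 = 268,435,456. */
    static long maxSeconds() {
        return 1L << TIME_BITS;
    }

    /** Lifetime in 365-day years before the time field overflows. */
    static double lifetimeYears() {
        return maxSeconds() / (365.0 * 24 * 3600);
    }

    /** Number of distinct worker IDs: 2^22 = 4,194,304. */
    static long maxWorkers() {
        return 1L << WORKER_BITS;
    }

    /** IDs available per second per worker: 2^13 = 8,192. */
    static long idsPerSecondPerWorker() {
        return 1L << SEQ_BITS;
    }
}
```

Compared with the classic Snowflake split, this layout trades timestamp range and per-node throughput for a much larger worker ID space.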

Leaf (Meituan)

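Provides both a segment mode and a Snowflake mode. The segment mode uses dual-segment caching (the next segment is loaded before the current one runs out), and the Snowflake mode relies on ZooKeeper to allocate worker IDs and to detect clock rollback.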
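The idea behind dual-segment caching can be sketched as follows. This is an illustrative simplification, not Leaf's actual code: the names DualSegmentSketch and InMemorySource are hypothetical, the preload threshold is an arbitrary parameter, and the standby segment is loaded synchronously here, whereas a real implementation would typically load it on a background thread so callers never wait on the database.

```java
import java.util.function.Supplier;

/** Hypothetical segment source; each call returns {firstId, lastId} (inclusive) of a fresh segment. */
final class InMemorySource implements Supplier<long[]> {
    private final int step;
    private long maxId = 0;
    int fetchCount = 0; // how many segments have been handed out

    InMemorySource(int step) {
        this.step = step;
    }

    @Override
    public synchronized long[] get() {
        long first = maxId + 1;
        maxId += step;
        fetchCount++;
        return new long[] {first, maxId};
    }
}

/** Serves IDs from a current segment and preloads a standby segment before the current one runs out. */
final class DualSegmentSketch {
    private final Supplier<long[]> source;
    private final double preloadRatio; // fraction of a segment consumed before preloading (assumption)
    private long[] current;
    private long[] standby; // null until preloaded
    private long next;

    DualSegmentSketch(Supplier<long[]> source, double preloadRatio) {
        this.source = source;
        this.preloadRatio = preloadRatio;
        this.current = source.get();
        this.next = current[0];
    }

    synchronized long nextId() {
        if (next > current[1]) {
            // Current segment exhausted: switch to the standby, or fetch one if none was preloaded.
            current = (standby != null) ? standby : source.get();
            standby = null;
            next = current[0];
        }
        long id = next++;
        long size = current[1] - current[0] + 1;
        long used = next - current[0];
        if (standby == null && used >= size * preloadRatio) {
            standby = source.get(); // synchronous here; real systems would do this asynchronously
        }
        return id;
    }
}
```

The benefit is that switching segments becomes a cheap in-memory swap, so a slow or briefly unavailable database does not stall ID generation as long as the standby segment is already loaded.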

Tinyid (Didi)

Implements database segment mode with a dual-segment cache and multi-DB support. Its client library (tinyid-client) caches segments and generates IDs locally, so most requests avoid an HTTP round-trip to the server.

IdGenerator (Yitter)

Snowflake‑based, supports multiple languages (C#, Java, Go, Rust, Python, Node.js, PHP, SQL), solves clock rollback, allows manual ID insertion, and works without external storage.

Conclusion

The article summarizes the most common distributed ID generation schemes, their advantages and drawbacks, and emphasizes that there is no one‑size‑fits‑all solution; the choice must be based on actual project requirements.

References

Tinyid – https://github.com/didi/tinyid/wiki/tinyid%E5%8E%9F%E7%90%86%E4%BB%8B%E7%BB%8D

Codis – https://github.com/CodisLabs/codis

Redis persistence – https://javaguide.cn/database/redis/redis-persistence.html

RFC 4122 – https://tools.ietf.org/html/rfc4122

Wikipedia UUID – https://zh.wikipedia.org/wiki/通用唯一识别码

Seata improved Snowflake – https://seata.io/zh-cn/blog/seata-analysis-UUID-generator.html

UidGenerator – https://github.com/baidu/uid-generator

Leaf – https://github.com/Meituan-Dianping/Leaf

Tinyid – https://github.com/didi/tinyid

IdGenerator – https://github.com/yitter/IdGenerator

Distributed ID design guide – https://javaguide.cn/distributed-system/distributed-id-design.html
