Backend Development 13 min read

Mastering Distributed ID Generation: From MySQL Auto‑Increment to Snowflake

This article reviews various distributed ID generation strategies—including MySQL auto‑increment tables, multi‑master configurations, segment allocation, Snowflake algorithm, and open‑source solutions like Baidu uid‑generator and Meituan Leaf—highlighting their designs, trade‑offs, and implementation details for high‑performance backend systems.

macrozheng

Sep 19, 2019

Mastering Distributed ID Generation: From MySQL Auto‑Increment to Snowflake

ID is the unique identifier of data. Traditional approaches use UUIDs or database auto‑increment IDs. UUIDs are too long and unordered for InnoDB, while auto‑increment works but causes conflicts after sharding, so a dedicated distributed ID mechanism is needed.

Database Auto‑Increment ID

The first solution relies on a separate MySQL instance with a dedicated table to generate IDs.

CREATE DATABASE `SEQID`;

CREATE TABLE SEQID.SEQUENCE_ID (
    id bigint(20) unsigned NOT NULL auto_increment,
    stub char(10) NOT NULL default '',
    PRIMARY KEY (id),
    UNIQUE KEY stub (stub)
) ENGINE=MyISAM;

Generate an ID with:

begin;
replace into SEQUENCE_ID (stub) VALUES ('anyword');
select last_insert_id();
commit;

The stub column is only a placeholder to trigger the auto‑increment. This method requires a dedicated MySQL instance; each request to obtain an ID hits the database, leading to low performance and a single point of failure.

Database Multi‑Master Mode

Using a master‑master cluster of two MySQL instances can improve reliability, but each instance must be configured with different auto‑increment offsets and steps to avoid duplicate IDs.

set @@auto_increment_offset = 1;  -- start value for instance 1
set @@auto_increment_increment = 2; -- step for instance 1

set @@auto_increment_offset = 2;  -- start value for instance 2
set @@auto_increment_increment = 2; -- step for instance 2

After configuration, instance 1 generates IDs 1,3,5,… and instance 2 generates 2,4,6,… A service such as DistributIdService can request IDs from either instance via RPC. If one instance goes down, the service can still obtain IDs from the other, but scaling beyond two instances becomes cumbersome.

Segment Mode

Instead of requesting a single ID each time, the service can allocate a range (segment) of IDs from the database and serve them locally, reducing database load.

CREATE TABLE id_generator (
  id int(10) NOT NULL,
  current_max_id bigint(20) NOT NULL COMMENT '当前最大id',
  increment_step int(10) NOT NULL COMMENT '号段的长度',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

The service updates the table with an optimistic‑lock pattern to claim a new segment:

update id_generator set current_max_id = #{newMaxId}, version = version + 1 where version = #{version};

If the update succeeds, the segment is reserved. This reduces database dependency, but a service restart may create ID gaps.

Snowflake Algorithm

Snowflake, originally from Twitter, generates 64‑bit IDs without a database. The bit layout is:

The highest bit is a sign bit, always 0 for positive IDs.

41 bits store a millisecond timestamp offset from a custom epoch (≈69 years).

10 bits represent a worker/machine ID (e.g., 5 bits for data center, 5 bits for machine).

12 bits are a sequence number, allowing 4096 IDs per millisecond per node.

Implementing Snowflake in Java yields a lightweight ID generator that only requires each node to have a unique worker ID.

Baidu (uid‑generator)

uid‑generator is a Snowflake‑based library where the worker ID is automatically assigned from a database table at startup, supporting Docker deployments. It uses 22 bits for the worker ID, 28 bits for a second‑level timestamp, and 13 bits for a sequence.

Meituan (Leaf)

Leaf supports both segment and Snowflake modes. In its Snowflake mode, the worker ID is obtained from ZooKeeper's sequential nodes, ensuring each instance gets a unique ID without manual configuration.

Summary

All the discussed methods aim to generate globally unique IDs with high availability; choosing the right approach depends on performance requirements, operational complexity, and infrastructure constraints.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

MySQL Snowflake Distributed ID segment allocation

Written by

macrozheng

Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.