How to Build a High‑Availability, High‑Performance Distributed ID Generator
Distributed systems need globally unique, and often monotonically increasing, IDs. This article examines common ID generation strategies (Snowflake, database auto-increment, segment allocation, multi-master databases, and Raft-based consensus), evaluates each for high availability and high performance, and highlights the trade-offs and implementation details.
Background
Distributed systems need globally unique IDs in many places. For example, once a database is sharded, a single machine's auto-increment column can no longer serve as the key, and a globally unique ID is needed instead. The basic requirements for an ID generator are:
Globally unique: the same ID is never issued twice
Monotonically increasing, in scenarios that depend on it, such as sorting by ID
Many articles already cover this topic, e.g., Meituan's "Leaf" and Youzan's "How to Build a Reliable ID Generator". This article focuses on two properties, high availability and high performance:
High availability: the service remains usable and no duplicate IDs are generated despite failures
High performance: the generator must handle very high concurrency and be horizontally scalable
Given these basic requirements, what common solutions exist, and are they truly high‑availability and high‑performance?
Snowflake solution
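Snowflake uses a 41-bit timestamp, a 10-bit machine ID, and a 12-bit sequence; together with one unused sign bit, these fill a 64-bit integer. The sequence can be generated with an AtomicLong and allows 4096 values per timestamp tick on each machine, and the 10-bit machine ID supports up to 1024 machines.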
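The bit layout described above can be sketched as a pair of pack and unpack functions. This is a minimal illustration, not any library's API: the names `compose_id` and `decompose_id` are hypothetical, and the timestamp is assumed to be a count of milliseconds since some custom epoch.

```python
from typing import Tuple

# Field widths come from the text; everything else is an assumption of this sketch.
TIMESTAMP_BITS = 41
MACHINE_BITS = 10
SEQUENCE_BITS = 12

MACHINE_SHIFT = SEQUENCE_BITS                    # machine ID sits above the sequence
TIMESTAMP_SHIFT = SEQUENCE_BITS + MACHINE_BITS   # timestamp sits above both

MAX_TIMESTAMP = (1 << TIMESTAMP_BITS) - 1
MAX_MACHINE_ID = (1 << MACHINE_BITS) - 1         # 1023, i.e. 1024 machines
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1          # 4095, i.e. 4096 values per tick


def compose_id(elapsed_ms: int, machine_id: int, sequence: int) -> int:
    """Pack (timestamp, machine ID, sequence) into a single integer."""
    if not 0 <= elapsed_ms <= MAX_TIMESTAMP:
        raise ValueError("timestamp does not fit in 41 bits")
    if not 0 <= machine_id <= MAX_MACHINE_ID:
        raise ValueError("machine ID does not fit in 10 bits")
    if not 0 <= sequence <= MAX_SEQUENCE:
        raise ValueError("sequence does not fit in 12 bits")
    return (elapsed_ms << TIMESTAMP_SHIFT) | (machine_id << MACHINE_SHIFT) | sequence


def decompose_id(snowflake_id: int) -> Tuple[int, int, int]:
    """Split an ID back into (elapsed_ms, machine_id, sequence)."""
    sequence = snowflake_id & MAX_SEQUENCE
    machine_id = (snowflake_id >> MACHINE_SHIFT) & MAX_MACHINE_ID
    elapsed_ms = snowflake_id >> TIMESTAMP_SHIFT
    return elapsed_ms, machine_id, sequence


example = compose_id(elapsed_ms=1, machine_id=2, sequence=3)
print(example)                 # 4202499
print(decompose_id(example))   # (1, 2, 3)
```

Because the timestamp occupies the highest bits, comparing two IDs compares their timestamps first. With millisecond ticks, 41 bits last roughly 69 years from the chosen epoch.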
Advantages:
Simple algorithm, easy to implement, no third‑party dependencies, very high performance
Stateless cluster, easy to scale, considered highly available
Disadvantages:
The timestamp keeps IDs increasing on a single machine, but IDs from different machines are only roughly time-ordered; strict order across machines is not guaranteed
Relies on the clock; if the clock moves backward, duplicate IDs may be generated
Overall, Snowflake meets the basic requirements and offers very high performance, but due to clock‑rollback issues it is not a high‑availability solution.
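The clock-rollback risk can be made concrete with a small generator that remembers the last timestamp it used and refuses to issue an ID when the clock reads earlier than that. This is only a sketch under stated assumptions: the class name, the injectable `clock_ms` function, and the default epoch are hypothetical, and raising an error is just one possible reaction to a rollback.

```python
import threading
import time


class ClockMovedBackwardsError(RuntimeError):
    """Raised instead of risking a duplicate ID."""


class SnowflakeGenerator:
    def __init__(self, machine_id, clock_ms=None, epoch_ms=1_600_000_000_000):
        if not 0 <= machine_id < 1024:
            raise ValueError("machine ID must fit in 10 bits")
        self._machine_id = machine_id
        # clock_ms is injectable so that a rollback can be simulated.
        self._clock_ms = clock_ms or (lambda: int(time.time() * 1000))
        self._epoch_ms = epoch_ms   # hypothetical custom epoch
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock_ms()
            if now < self._last_ms:
                # Reusing an old timestamp could repeat an ID already issued.
                raise ClockMovedBackwardsError(
                    f"clock moved back by {self._last_ms - now} ms")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:
                    # 4096 IDs used in this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock_ms()
            else:
                self._sequence = 0
            self._last_ms = now
            elapsed = now - self._epoch_ms
            return (elapsed << 22) | (self._machine_id << 12) | self._sequence


gen = SnowflakeGenerator(machine_id=7)
first, second = gen.next_id(), gen.next_id()
print(first < second)   # True
```

Note that the last timestamp lives only in memory here, so a rollback that happens while the process is down would go undetected. This is part of why the text above does not treat Snowflake as highly available.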
Database‑based solution
This solution uses the database's auto-increment feature.
Advantages:
Simple implementation, only depends on the database
No clock‑rollback problem
Generated IDs are monotonic
Drawbacks:
Performance limited by the database’s single‑node write capacity; cannot scale horizontally
Single point of failure. In master-slave setups, consistency depends on the replication mode (asynchronous, semi-synchronous, or fully synchronous). Only fully synchronous replication guarantees that a promoted slave has seen every ID the master issued; with the other modes, a failover may cause duplicate IDs.
The same idea can be implemented with Redis INCR, but Redis only offers asynchronous replication, so its consistency guarantees are weaker still.
In summary, without full synchronous replication the database approach is not highly available, and even with it performance suffers.
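The failover problem can be illustrated with a toy simulation. It is not a real database: `Node` and `ReplicatedCounter` are hypothetical stand-ins that keep only an auto-increment counter, and "synchronous" here simply means the replica applies each write before the ID is returned to the caller.

```python
class Node:
    """A toy database node that stores only the auto-increment counter."""

    def __init__(self):
        self.counter = 0


class ReplicatedCounter:
    def __init__(self, synchronous):
        self.primary = Node()
        self.replica = Node()
        self.synchronous = synchronous

    def next_id(self):
        self.primary.counter += 1
        if self.synchronous:
            # Fully synchronous: the replica has the write before we reply.
            self.replica.counter = self.primary.counter
        # Asynchronous: the replica would catch up later, if the primary survives.
        return self.primary.counter

    def fail_over(self):
        """The primary is lost; promote the replica with whatever it has."""
        self.primary = self.replica
        self.replica = Node()


def issue_with_failover(synchronous):
    db = ReplicatedCounter(synchronous)
    issued = [db.next_id() for _ in range(3)]
    db.fail_over()   # crash happens before asynchronous replication catches up
    issued += [db.next_id() for _ in range(2)]
    return issued


print(issue_with_failover(synchronous=False))   # [1, 2, 3, 1, 2]
print(issue_with_failover(synchronous=True))    # [1, 2, 3, 4, 5]
```

In the asynchronous run, the promoted replica never saw IDs 1 to 3 and hands them out again. The synchronous run avoids this, at the cost of waiting for the replica on every write.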
Database segment allocation solution
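This solution optimizes performance: each service instance fetches a range (segment) of IDs from the database in one request and then allocates IDs from it locally, so the database is hit once per segment rather than once per ID. It greatly improves on the simple database approach, but it loses global monotonicity, because different instances hand out IDs from different segments at the same time. With fully synchronous replication it can be both highly available and high performance.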
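A minimal sketch of segment allocation, assuming an in-memory `SegmentStore` in place of the database row that records the highest ID handed out. Both class names and the `step` parameter are hypothetical; in a real deployment `reserve` would be a single atomic update against the database.

```python
import threading


class SegmentStore:
    """Stand-in for the database row that tracks the highest reserved ID."""

    def __init__(self):
        self._max_id = 0
        self._lock = threading.Lock()
        self.round_trips = 0   # how many times the "database" was hit

    def reserve(self, step):
        """Atomically reserve `step` IDs; return the inclusive (start, end)."""
        with self._lock:
            self.round_trips += 1
            start = self._max_id + 1
            self._max_id += step
            return start, self._max_id


class SegmentAllocator:
    """One per service instance; serves IDs from its current local segment."""

    def __init__(self, store, step=1000):
        self._store = store
        self._step = step
        self._lock = threading.Lock()
        self._next = 1
        self._end = 0   # empty segment, forces a fetch on first use

    def next_id(self):
        with self._lock:
            if self._next > self._end:
                self._next, self._end = self._store.reserve(self._step)
            value = self._next
            self._next += 1
            return value


store = SegmentStore()
a = SegmentAllocator(store, step=3)
b = SegmentAllocator(store, step=3)
print([a.next_id(), b.next_id(), a.next_id(), b.next_id()])   # [1, 4, 2, 5]
print(store.round_trips)                                      # 2
```

The interleaved output shows both effects from the text: four IDs cost only two trips to the store, and the IDs are unique but not issued in increasing order across instances.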
Multi‑master database solution
Similar to the segment approach, but uses multiple databases with distinct auto‑increment offsets (e.g., three masters with start values 1, 2, 3 and step 3). IDs from each master never collide. A round‑robin strategy fetches segments; if one master fails, others continue. This provides high availability; high performance is achieved via segment allocation, though horizontal scaling of databases is difficult.
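The offset-and-step arithmetic and the round-robin failover can be sketched as follows. `Master` and `RoundRobinClient` are hypothetical toy classes, not a database driver; each value a master returns could in practice identify a whole segment rather than a single ID.

```python
class Master:
    """Toy master that issues start, start + step, start + 2 * step, ..."""

    def __init__(self, start, step):
        self._next = start
        self._step = step
        self.alive = True

    def next_value(self):
        if not self.alive:
            raise ConnectionError("master is down")
        value = self._next
        self._next += self._step
        return value


class RoundRobinClient:
    """Tries the masters in turn and skips any that are unreachable."""

    def __init__(self, masters):
        self._masters = masters
        self._cursor = 0

    def next_value(self):
        for _ in range(len(self._masters)):
            master = self._masters[self._cursor]
            self._cursor = (self._cursor + 1) % len(self._masters)
            try:
                return master.next_value()
            except ConnectionError:
                continue   # this master is down, try the next one
        raise RuntimeError("no master available")


# Three masters with start values 1, 2, 3 and step 3, as in the text.
masters = [Master(start, step=3) for start in (1, 2, 3)]
client = RoundRobinClient(masters)
print([client.next_value() for _ in range(6)])   # [1, 2, 3, 4, 5, 6]

masters[1].alive = False                         # the second master fails
print([client.next_value() for _ in range(3)])   # [7, 9, 10]
```

Since each master only produces values congruent to its own start value modulo the step, the masters cannot collide. The fixed step is also why adding a fourth master later is awkward: the existing sequences already assume a step of 3.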
Consensus‑based solution
The high-availability issue stems from master-slave inconsistency. Using a consensus protocol like Raft ensures data is replicated to a majority of nodes. After each segment is allocated, it is persisted to a majority of nodes and eventually to all of them; if the leader fails, Raft elects a new one. Youzan's reliable ID generator uses etcd, which is built on Raft; open-source Raft libraries such as Ant Financial's SOFAJRaft can also be used.
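The majority rule can be illustrated with a deliberately simplified model. This is not Raft and not the etcd or SOFAJRaft API: it has no terms, logs, or networking, and `Replica` and `Cluster` are hypothetical names. It only shows why a segment acknowledged by a majority survives a leader failure.

```python
class Replica:
    def __init__(self):
        self.alive = True
        self.max_id = 0   # highest segment end this replica has persisted

    def persist(self, max_id):
        if not self.alive:
            return False
        self.max_id = max(self.max_id, max_id)
        return True


class Cluster:
    def __init__(self, size=3):
        self.replicas = [Replica() for _ in range(size)]
        self.leader = self.replicas[0]
        self.majority = size // 2 + 1

    def reserve(self, step):
        """Hand out a segment only after a majority has persisted it."""
        if not self.leader.alive:
            raise RuntimeError("leader is down; run an election first")
        new_max = self.leader.max_id + step
        acks = sum(r.persist(new_max) for r in self.replicas)
        if acks < self.majority:
            raise RuntimeError("no quorum; segment not handed out")
        return new_max - step + 1, new_max

    def elect_leader(self):
        """Pick the most up-to-date replica among a live majority."""
        alive = [r for r in self.replicas if r.alive]
        if len(alive) < self.majority:
            raise RuntimeError("no quorum; cannot elect a leader")
        self.leader = max(alive, key=lambda r: r.max_id)


cluster = Cluster(size=3)
print(cluster.reserve(100))        # (1, 100)
cluster.replicas[0].alive = False  # the leader fails
cluster.elect_leader()
print(cluster.reserve(100))        # (101, 200)
```

Any two majorities of the same cluster share at least one node, so the newly elected leader in this model always knows the highest segment that was acknowledged. That is the property the asynchronous master-slave setups lack.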
Summary
High performance of ID generators mainly relies on segment allocation.
High availability can be achieved through database high‑availability, multi‑master setups, or consensus protocols.
Xiao Lou's Tech Notes
Backend technology sharing, architecture design, performance optimization, source code reading, troubleshooting, and pitfall practices
