
Distributed Global Unique ID Generation: Requirements, Common Solutions, and Snowflake Implementation

The article explains why distributed systems need globally unique identifiers, outlines strict generation rules and availability requirements, compares common approaches such as UUID, database auto‑increment and Redis, and provides a detailed overview and Java implementation of Twitter's Snowflake algorithm.

Top Architect

1. Problem

In complex distributed systems, large volumes of data and messages must be uniquely identified. Finance, payment, restaurant, hotel, and ticketing platforms are typical examples: once data is sharded, high‑volume entities such as orders, riders, and coupons each need a globally unique ID.

ID Generation Rules (hard requirements)

Global uniqueness : no duplicate IDs.

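Trend increasing : IDs should be roughly ordered over time. MySQL's InnoDB engine stores rows in a clustered index keyed on the primary key, so mostly ascending keys keep inserts near the end of the index and avoid frequent page splits.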

Monotonic increasing : each subsequent ID must be greater than the previous one.

Security : IDs should not be easily guessable. Strictly sequential IDs make it easy to scrape records by enumeration or to infer business volume.

Timestamp inclusion : the generation time should be embedded for easy debugging.

Availability requirements

High availability : 99.999% success rate for ID requests.

Low latency : fast response time.

High QPS : ability to generate hundreds of thousands of IDs per second.

2. Common General Solutions

(1) UUID

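In Java, the JDK generates one with java.util.UUID.randomUUID(). Its string form is 36 characters long: 32 hexadecimal digits plus four hyphens, grouped 8‑4‑4‑4‑12. Advantages: high performance, local generation, no network cost. Disadvantages: values are unordered and long, so they bloat database indexes and reduce insert performance.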
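As a minimal sketch of the UUID approach, the snippet below calls the JDK's `java.util.UUID.randomUUID()` and shows the 36‑character form alongside the 32‑digit form without hyphens. The class name `UuidDemo` and the helpers `newId` and `compact` are hypothetical names chosen for illustration.

```java
import java.util.UUID;

// Hypothetical demo class; only java.util.UUID is a real JDK API here.
final class UuidDemo {
    // Returns a random (version 4) UUID in its canonical 36-character form.
    static String newId() {
        return UUID.randomUUID().toString();
    }

    // Strips the four hyphens, leaving the 32 hexadecimal digits.
    static String compact(String uuid) {
        return uuid.replace("-", "");
    }

    public static void main(String[] args) {
        String a = newId();
        String b = newId();
        System.out.println(a + " (length " + a.length() + ")");
        System.out.println(compact(a) + " (length " + compact(a).length() + ")");
        // Consecutive values carry no ordering: the comparison result is arbitrary.
        System.out.println("a sorts before b? " + (a.compareTo(b) < 0));
    }
}
```

Because two consecutive values have no predictable order, inserting them as primary keys scatters writes across the index, which is the insert‑performance cost described above.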

(2) Database auto‑increment primary key

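Typically implemented in MySQL with REPLACE INTO on a table that has a unique key: each REPLACE deletes the conflicting row and inserts a new one, which advances the auto‑increment counter and yields a fresh ID. This is not suitable as a distributed ID source, because horizontal scaling is difficult and the database becomes a bottleneck under high QPS.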

(3) Redis‑based ID generation

Uses atomic INCR / INCRBY . In a Redis cluster each node can be initialized with a different start value and step (e.g., 5 nodes with start values 1‑5 and step 5) to produce non‑overlapping sequences.
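The start‑value‑and‑step scheme can be sketched without a Redis server. In the snippet below, an `AtomicLong` stands in for one Redis node and `addAndGet` plays the role of `INCRBY`; the class `SteppedCounter` is a hypothetical name and no Redis client API is used or implied.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

// In-memory stand-in for one Redis node; addAndGet plays the role of INCRBY.
final class SteppedCounter {
    private final AtomicLong value;
    private final long step;

    SteppedCounter(long start, long step) {
        this.step = step;
        // Begin one step below `start` so the first call returns exactly `start`.
        this.value = new AtomicLong(start - step);
    }

    // Atomically advances the counter by `step` and returns the new value.
    long next() {
        return value.addAndGet(step);
    }

    public static void main(String[] args) {
        int nodes = 5;
        SteppedCounter[] cluster = new SteppedCounter[nodes];
        for (int i = 0; i < nodes; i++) {
            cluster[i] = new SteppedCounter(i + 1, nodes); // start values 1..5, step 5
        }
        Set<Long> seen = new HashSet<>();
        for (int round = 0; round < 3; round++) {
            for (SteppedCounter node : cluster) {
                long id = node.next();
                if (!seen.add(id)) {
                    throw new IllegalStateException("duplicate " + id);
                }
                System.out.print(id + " ");
            }
        }
        System.out.println();
    }
}
```

Node 1 yields 1, 6, 11, node 2 yields 2, 7, 12, and so on, so the sequences never overlap. The step is tied to the node count, so adding a node later means re‑planning the start values and step on every node.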

3. Snowflake

(1) Overview

Snowflake is Twitter's distributed ID generation algorithm. It produces 64‑bit IDs that increase over time. Official repository: https://github.com/twitter-archive/snowflake.

(2) Structure

64‑bit ID layout (from high to low bits):

• 1 sign bit (always 0)

• 41 bits timestamp (millisecond precision, offset from a custom epoch, supports ~69 years)

• 5 bits datacenter ID

• 5 bits worker ID

• 12 bits sequence number (supports 4096 IDs per millisecond per node)
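The capacity figures in this layout follow directly from the bit widths. The sketch below derives them and packs four fields into one `long`; `SnowflakeLayout` and its methods are hypothetical names for illustration, and a 365‑day year is assumed for the range estimate.

```java
// Hypothetical helper illustrating the 1 + 41 + 5 + 5 + 12 bit layout.
final class SnowflakeLayout {
    static final int TIMESTAMP_BITS = 41;
    static final int DATACENTER_BITS = 5;
    static final int WORKER_BITS = 5;
    static final int SEQUENCE_BITS = 12;

    // Largest value that fits in the given number of bits.
    static long maxValue(int bits) {
        return (1L << bits) - 1;
    }

    // Years covered by the timestamp field at millisecond precision (365-day years).
    static double timestampYears() {
        double msPerYear = 1000.0 * 60 * 60 * 24 * 365;
        return (1L << TIMESTAMP_BITS) / msPerYear;
    }

    // Packs the four fields into one long, highest field first.
    static long compose(long timestampOffset, long datacenterId, long workerId, long sequence) {
        return (timestampOffset << (DATACENTER_BITS + WORKER_BITS + SEQUENCE_BITS))
                | (datacenterId << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }

    public static void main(String[] args) {
        System.out.printf("timestamp range: %.1f years%n", timestampYears());
        System.out.println("addressable nodes: " + (1L << (DATACENTER_BITS + WORKER_BITS)));
        System.out.println("IDs per ms per node: " + (1L << SEQUENCE_BITS));
        System.out.println("IDs per second per node: " + (1L << SEQUENCE_BITS) * 1000);
        System.out.println("compose(1, 1, 1, 1) = " + compose(1, 1, 1, 1));
    }
}
```

With 10 bits of node identity the layout addresses 1024 nodes, and 4096 IDs per millisecond gives roughly 4 million IDs per second per node, well above the "hundreds of thousands per second" requirement.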

(3) Java Implementation

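public class SnowflakeIdWorker {
    private final long twepoch = 1598598185157L; // custom epoch in ms; timestamps are stored as an offset from it
    private final long workerIdBits = 5L;
    private final long datacenterIdBits = 5L;
    private final long maxWorkerId = -1L ^ (-1L << workerIdBits);         // 31
    private final long maxDatacenterId = -1L ^ (-1L << datacenterIdBits); // 31
    private final long sequenceBits = 12L;
    private final long workerIdShift = sequenceBits;
    private final long datacenterIdShift = sequenceBits + workerIdBits;
    private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;
    private final long sequenceMask = -1L ^ (-1L << sequenceBits);        // 4095
    private final long workerId;
    private final long datacenterId;
    private long sequence = 0L;
    private long lastTimestamp = -1L;
    public SnowflakeIdWorker(long workerId, long datacenterId) {
        if (workerId > maxWorkerId || workerId < 0)
            throw new IllegalArgumentException("workerId must be between 0 and " + maxWorkerId);
        if (datacenterId > maxDatacenterId || datacenterId < 0)
            throw new IllegalArgumentException("datacenterId must be between 0 and " + maxDatacenterId);
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }
    public synchronized long nextId() {
        long timestamp = timeGen();
        if (timestamp < lastTimestamp) // clock moved backwards: refuse rather than risk duplicate IDs
            throw new IllegalStateException("Clock moved backwards by " + (lastTimestamp - timestamp) + " ms");
        boolean sameMillis = timestamp == lastTimestamp;
        sequence = sameMillis ? (sequence + 1) & sequenceMask : 0L;
        if (sameMillis && sequence == 0) timestamp = tilNextMillis(lastTimestamp); // 4096 IDs used up in this ms
        lastTimestamp = timestamp;
        return ((timestamp - twepoch) << timestampLeftShift) | (datacenterId << datacenterIdShift) | (workerId << workerIdShift) | sequence;
    }
    protected long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) timestamp = timeGen(); // spin until the clock advances
        return timestamp;
    }
    protected long timeGen() { return System.currentTimeMillis(); }
    public static void main(String[] args) {
        SnowflakeIdWorker idWorker = new SnowflakeIdWorker(0, 0);
        for (int i = 0; i < 10; i++) System.out.println(idWorker.nextId());
    }
}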
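Because the timestamp is embedded in the ID, an ID can be taken apart again when debugging. The sketch below reverses the shifts using the same bit widths and `twepoch` value as the `SnowflakeIdWorker` listing; the class `SnowflakeIdDecoder` and its method names are hypothetical, and the sample ID is constructed by hand rather than taken from a real system.

```java
import java.time.Instant;

// Hypothetical decoder; bit widths and epoch mirror the SnowflakeIdWorker listing.
final class SnowflakeIdDecoder {
    static final long TWEPOCH = 1598598185157L; // same custom epoch as the listing
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 5;
    static final int DATACENTER_BITS = 5;

    // Absolute generation time in milliseconds since the Unix epoch.
    static long timestampMillis(long id) {
        return (id >> (SEQUENCE_BITS + WORKER_BITS + DATACENTER_BITS)) + TWEPOCH;
    }

    static long datacenterId(long id) {
        return (id >> (SEQUENCE_BITS + WORKER_BITS)) & ((1L << DATACENTER_BITS) - 1);
    }

    static long workerId(long id) {
        return (id >> SEQUENCE_BITS) & ((1L << WORKER_BITS) - 1);
    }

    static long sequence(long id) {
        return id & ((1L << SEQUENCE_BITS) - 1);
    }

    public static void main(String[] args) {
        // Hand-built sample: 1000 ms after the epoch, datacenter 3, worker 7, sequence 42.
        long sample = (1000L << 22) | (3L << 17) | (7L << 12) | 42L;
        System.out.println("generated at: " + Instant.ofEpochMilli(timestampMillis(sample)));
        System.out.println("datacenter:   " + datacenterId(sample));
        System.out.println("worker:       " + workerId(sample));
        System.out.println("sequence:     " + sequence(sample));
    }
}
```

The decoder only works if it uses the same epoch and bit allocation as the generator that produced the ID.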

(4) Pros and Cons

Advantages : IDs are trend‑increasing, no external dependencies, high performance, flexible bit allocation.

Disadvantages : Relies on synchronized system clocks; clock rollback can cause duplicate IDs, and global monotonicity is not guaranteed across nodes (generally acceptable).
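One possible way to soften the clock rollback problem is to wait out small rollbacks and reject large ones. The sketch below illustrates that idea only; it is not the mechanism of any particular library. The class `RollbackAwareClock`, the `maxRollbackMs` parameter, and the injectable `LongSupplier` clock are assumptions introduced for illustration and testability.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch: tolerate small clock rollbacks, reject large ones.
final class RollbackAwareClock {
    private final LongSupplier clock;  // injectable time source, e.g. System::currentTimeMillis
    private final long maxRollbackMs;  // largest rollback we are willing to wait out
    private long lastTimestamp = -1L;

    RollbackAwareClock(LongSupplier clock, long maxRollbackMs) {
        this.clock = clock;
        this.maxRollbackMs = maxRollbackMs;
    }

    // Returns a timestamp that is never smaller than the one returned before it.
    synchronized long nextTimestamp() {
        long now = clock.getAsLong();
        if (now < lastTimestamp) {
            long drift = lastTimestamp - now;
            if (drift > maxRollbackMs) {
                throw new IllegalStateException("Clock moved backwards by " + drift + " ms");
            }
            // Small rollback: spin until the clock catches up with the last issued timestamp.
            while (now < lastTimestamp) {
                now = clock.getAsLong();
            }
        }
        lastTimestamp = now;
        return now;
    }

    public static void main(String[] args) {
        RollbackAwareClock guarded = new RollbackAwareClock(System::currentTimeMillis, 5);
        System.out.println(guarded.nextTimestamp());
        System.out.println(guarded.nextTimestamp());
    }
}
```

A generator built on this would still need the sequence counter to separate IDs issued in the same millisecond, since the method may return the same timestamp twice. The tolerance is a trade‑off: a larger value blocks callers for longer, a smaller one fails more requests.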

4. Solutions to Clock Issues

Use Baidu's open‑source UID generator.

Use Meituan‑Dianping's Leaf distributed ID system.

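Overall, Snowflake provides a high‑throughput method for generating globally unique IDs in distributed backend services without external dependencies, provided each node has a unique datacenter and worker ID and clock rollback is handled.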
