Why UUID Falls Short and How Snowflake Solves Distributed ID Generation

The article examines the limitations of using UUIDs for distributed unique identifiers, compares common alternatives such as database auto‑increment and Redis, and then details the Snowflake algorithm’s structure, implementation, advantages, and drawbacks for high‑performance ID generation.

IoT Full-Stack Technology

Problem

In complex distributed systems, large volumes of data and messages need globally unique identifiers. Typical scenarios include finance, payment, restaurant, hotel, and movie platforms, along with their order, rider, and coupon systems. A service that can generate globally unique IDs is therefore essential.

ID Generation Hard Requirements

Globally unique : No duplicate IDs.

Trend‑increasing : Use ordered primary keys to keep InnoDB B‑tree insert performance high.

Monotonically increasing : The next ID must be larger than the previous one for versioning, sorting, etc.

Security : Random‑looking IDs make it harder for attackers to guess order volumes.

Timestamp embedded : Allows developers to quickly infer when an ID was generated.

ID Service Availability Requirements

High availability : 99.999% of requests must return an ID.

Low latency : ID generation must be fast.

High QPS : The service should sustain a very high request rate, on the order of 100 k IDs per second.

Common Solutions

(1) UUID

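In Java, java.util.UUID in the JDK generates UUIDs. The standard text form is a 36-character string (32 hexadecimal digits plus 4 hyphens) grouped as 8-4-4-4-12. If uniqueness is the only requirement, a UUID is sufficient.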

Pros: high performance, generated locally, no network cost.

Cons: unordered and long. The string form increases database storage and degrades insert performance, and MySQL recommends keeping primary keys short.

Because UUIDs are unordered, each insert causes large B+‑tree modifications, node splits, and many under‑filled nodes, dramatically reducing database insert throughput.
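
The lack of ordering is easy to observe. The following is a minimal sketch, assuming the random (version 4) UUIDs returned by java.util.UUID.randomUUID(); the class and method names are hypothetical and exist only for this illustration. It generates a few UUIDs and checks whether creation order matches sorted order, which is the order a B+-tree index would keep them in.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.UUID;

// Hypothetical demo class, not part of the original article.
final class UuidOrderDemo {
    /** Generates n random (version 4) UUID strings in creation order. */
    static List<String> generate(int n) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            ids.add(UUID.randomUUID().toString());
        }
        return ids;
    }

    /** Returns true if the list is already in ascending string order. */
    static boolean isSorted(List<String> ids) {
        List<String> sorted = new ArrayList<>(ids);
        Collections.sort(sorted);
        return sorted.equals(ids);
    }

    public static void main(String[] args) {
        List<String> ids = generate(5);
        ids.forEach(System.out::println);
        System.out.println("text length = " + ids.get(0).length());
        System.out.println("creation order equals sorted order? " + isSorted(ids));
    }
}
```

With more than a handful of values the last line will almost always print false, meaning each new key lands at an arbitrary position in the index rather than at the end.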

(2) Database Auto‑Increment Primary Key

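In MySQL this is typically implemented with an AUTO_INCREMENT column and REPLACE INTO, which deletes the conflicting row and inserts a new one, so each call yields a fresh auto-increment value. This approach is not suitable for distributed ID generation because: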

Horizontal scaling is difficult; adding a new machine requires redefining step sizes and initial values, which becomes a nightmare with dozens or hundreds of nodes.

The database becomes a bottleneck: every ID request incurs a read‑write round‑trip, violating low‑latency and high‑QPS requirements.

(3) Redis Global ID Strategy

Redis executes commands on a single thread, so each command is atomic; INCR and INCRBY can therefore be used as ID counters.

In a Redis cluster, each node must be configured with a different initial value and a common step size equal to the number of nodes, and keys should have an expiration.

Using a 5‑node Redis cluster, initialize each node with values 1, 2, 3, 4, 5 and a step size of 5. The generated IDs are:

A: 1, 6, 11, 16, 21

B: 2, 7, 12, 17, 22

C: 3, 8, 13, 18, 23

D: 4, 9, 14, 19, 24

E: 5, 10, 15, 20, 25
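
The sequences above can be reproduced without a Redis server. The sketch below is only a simulation: it uses an in-process AtomicLong as a stand-in for a Redis counter, with addAndGet playing the role of INCRBY. The class name SteppedCounter is hypothetical, and a real deployment would call a Redis client instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for one Redis node's counter key.
final class SteppedCounter {
    private final AtomicLong counter;
    private final long step;

    /**
     * @param start first ID this node hands out
     * @param step  shared step size (the number of nodes)
     */
    SteppedCounter(long start, long step) {
        // Pre-subtract the step so the first call to next() returns `start`.
        this.counter = new AtomicLong(start - step);
        this.step = step;
    }

    /** Atomic add that returns the new value, like INCRBY key step. */
    long next() {
        return counter.addAndGet(step);
    }

    public static void main(String[] args) {
        int nodes = 5;
        for (int n = 1; n <= nodes; n++) {
            SteppedCounter node = new SteppedCounter(n, nodes);
            List<Long> ids = new ArrayList<>();
            for (int i = 0; i < 5; i++) {
                ids.add(node.next());
            }
            System.out.println((char) ('A' + n - 1) + ": " + ids);
        }
    }
}
```

Because every node starts at a different value modulo the step, the sequences never overlap. The cost is that adding a sixth node changes the step for all existing nodes.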

Snowflake

(1) Overview

Twitter’s distributed auto‑increment ID algorithm. Repository: https://github.com/twitter-archive/snowflake

Generates time‑ordered IDs.

Result is a 64‑bit integer (max 19‑digit decimal string).

No collisions across the distributed system (datacenter and worker IDs differentiate nodes) and high efficiency.

(2) Structure

Snowflake bit layout

1 sign bit (always 0 for positive IDs).

41 bits timestamp (millisecond offset from a custom epoch, supports ~69 years).

5 bits datacenter ID (max 31).

5 bits worker ID (max 31).

12 bits sequence number (max 4095) for IDs generated within the same millisecond.
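
The limits quoted in this list follow directly from the bit widths. The short sketch below recomputes them; the class name SnowflakeLayout is hypothetical, and the year figure assumes 365-day years.

```java
// Hypothetical helper that recomputes the limits implied by the bit layout.
final class SnowflakeLayout {
    static final int TIMESTAMP_BITS = 41;
    static final int DATACENTER_BITS = 5;
    static final int WORKER_BITS = 5;
    static final int SEQUENCE_BITS = 12;

    /** Largest unsigned value that fits in the given number of bits. */
    static long maxValue(int bits) {
        return (1L << bits) - 1;
    }

    /** Years covered by the timestamp field, assuming 365-day years. */
    static double timestampYears() {
        double msPerYear = 1000.0 * 60 * 60 * 24 * 365;
        return (1L << TIMESTAMP_BITS) / msPerYear;
    }

    public static void main(String[] args) {
        int total = 1 + TIMESTAMP_BITS + DATACENTER_BITS + WORKER_BITS + SEQUENCE_BITS;
        System.out.println("total bits        = " + total);
        System.out.println("max datacenter id = " + maxValue(DATACENTER_BITS));
        System.out.println("max worker id     = " + maxValue(WORKER_BITS));
        System.out.println("max sequence      = " + maxValue(SEQUENCE_BITS));
        System.out.println("distinct nodes    = " + (1L << (DATACENTER_BITS + WORKER_BITS)));
        System.out.printf("timestamp range   = %.1f years%n", timestampYears());
    }
}
```

Narrowing one field to widen another, for example fewer node bits in exchange for more sequence bits, changes these limits in the same way.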

(3) Code

/**
 * Twitter_Snowflake
 * SnowFlake structure (each part separated by '-'): 
 * 0 - 0000000000 0000000000 0000000000 0000000000 0 - 00000 - 00000 - 000000000000 
 * 1 sign bit (0 for positive numbers).
 * 41‑bit timestamp (millisecond offset from a custom epoch).
 * 10‑bit machine identifier (5‑bit datacenter + 5‑bit worker).
 * 12‑bit sequence within the same millisecond (supports 4096 IDs per ms).
 * Total 64‑bit Long value.
 * SnowFlake can generate ~260k IDs per second in tests.
 */
public class SnowflakeIdWorker {
    // ==============================Fields===========================================
    /** start epoch (2020‑08‑28) */
    private final long twepoch = 1598598185157L;
    /** number of bits for worker id */
    private final long workerIdBits = 5L;
    /** number of bits for datacenter id */
    private final long datacenterIdBits = 5L;
    /** max worker id (31) */
    private final long maxWorkerId = -1L ^ (-1L << workerIdBits);
    /** max datacenter id (31) */
    private final long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);
    /** bits for sequence */
    private final long sequenceBits = 12L;
    /** left shift for worker id */
    private final long workerIdShift = sequenceBits;
    /** left shift for datacenter id */
    private final long datacenterIdShift = sequenceBits + workerIdBits;
    /** left shift for timestamp */
    private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;
    /** mask for sequence (4095) */
    private final long sequenceMask = -1L ^ (-1L << sequenceBits);
    /** worker id (0~31) */
    private long workerId;
    /** datacenter id (0~31) */
    private long datacenterId;
    /** current sequence (0~4095) */
    private long sequence = 0L;
    /** last timestamp */
    private long lastTimestamp = -1L;
    //==============================Constructors=====================================
    public SnowflakeIdWorker(long workerId, long datacenterId) {
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId));
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }
    // ==============================Methods==========================================
    /**
     * Core method – get next ID (thread‑safe)
     * @return Snowflake ID
     */
    public synchronized long nextId() {
        // 1. get current timestamp
        long timestamp = timeGen();
        // clock moved backwards?
        if (timestamp < lastTimestamp) {
            throw new RuntimeException(String.format("Clock moved backwards. Refusing to generate id for %d milliseconds", lastTimestamp - timestamp));
        }
        // same millisecond?
        if (lastTimestamp == timestamp) {
            // increment sequence and mask overflow
            sequence = (sequence + 1) & sequenceMask;
            // sequence overflow – wait for next millisecond
            if (sequence == 0) {
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            // new millisecond – reset sequence
            sequence = 0L;
        }
        // update last timestamp
        lastTimestamp = timestamp;
        // assemble 64‑bit ID
        long id = ((timestamp - twepoch) << timestampLeftShift)
                | (datacenterId << datacenterIdShift)
                | (workerId << workerIdShift)
                | sequence;
        return id;
    }
    /** Block until next millisecond */
    protected long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }
    /** Current time in milliseconds */
    protected long timeGen() {
        return System.currentTimeMillis();
    }
    //==============================Test=============================================
    public static void main(String[] args) {
        SnowflakeIdWorker idWorker = new SnowflakeIdWorker(0, 0);
        for (int i = 0; i < 1000; i++) {
            long id = idWorker.nextId();
            System.out.println(id);
        }
    }
}
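
Because every field sits at a fixed bit position, an ID can also be taken apart again, which is what makes the embedded timestamp useful for debugging. The following is a minimal sketch, assuming the same epoch and bit widths as SnowflakeIdWorker above; the class SnowflakeIdParser and its method names are hypothetical.

```java
import java.time.Instant;

// Hypothetical companion to SnowflakeIdWorker; the constants must match the generator.
final class SnowflakeIdParser {
    static final long TWEPOCH = 1598598185157L;
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 5;
    static final int DATACENTER_BITS = 5;
    static final int WORKER_SHIFT = SEQUENCE_BITS;
    static final int DATACENTER_SHIFT = SEQUENCE_BITS + WORKER_BITS;
    static final int TIMESTAMP_SHIFT = SEQUENCE_BITS + WORKER_BITS + DATACENTER_BITS;

    /** Packs the four fields using the same expression as nextId(). */
    static long compose(long timestampMillis, long datacenterId, long workerId, long sequence) {
        return ((timestampMillis - TWEPOCH) << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }

    /** Wall-clock time (epoch milliseconds) at which the ID was generated. */
    static long timestampMillis(long id) {
        return (id >> TIMESTAMP_SHIFT) + TWEPOCH;
    }

    static long datacenterId(long id) {
        return (id >> DATACENTER_SHIFT) & ((1L << DATACENTER_BITS) - 1);
    }

    static long workerId(long id) {
        return (id >> WORKER_SHIFT) & ((1L << WORKER_BITS) - 1);
    }

    static long sequence(long id) {
        return id & ((1L << SEQUENCE_BITS) - 1);
    }

    public static void main(String[] args) {
        long id = compose(System.currentTimeMillis(), 3, 7, 42);
        System.out.println("id         = " + id);
        System.out.println("created at = " + Instant.ofEpochMilli(timestampMillis(id)));
        System.out.println("datacenter = " + datacenterId(id));
        System.out.println("worker     = " + workerId(id));
        System.out.println("sequence   = " + sequence(id));
    }
}
```

The same decoding also shows why these IDs reveal their creation time and originating node to anyone who knows the layout.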

(4) Pros and Cons

Advantages : Timestamp occupies high bits and sequence occupies low bits, making the ID trend‑increasing. No reliance on databases or third‑party systems; can be deployed as a service with high stability and performance. Bit allocation is flexible to match business needs.

Disadvantages : Depends on machine clocks. If a clock moves backward, duplicate IDs may appear unless the generator refuses to issue IDs, as nextId() above does by throwing an exception. In distributed environments clocks are not perfectly synchronized, so global monotonicity is not guaranteed, though most use cases only require trend-increasing IDs.

Mitigation : Use an open-source generator designed to cope with clock rollback, such as Baidu's UidGenerator or Meituan-Dianping's Leaf distributed ID generator.
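
For small backward jumps, one possible approach is to wait until the clock catches up instead of failing immediately. The sketch below illustrates that idea only; it is not taken from the article or from either project named above. The class ClockGuard, the injected clock, and the tolerance parameter are all assumptions made for this example.

```java
import java.util.function.LongSupplier;

// Hypothetical guard that hides small clock rollbacks from an ID generator.
final class ClockGuard {
    private final LongSupplier clock;   // time source in milliseconds
    private final long maxBackwardMs;   // largest rollback that is waited out
    private long lastTimestamp = -1L;

    ClockGuard(LongSupplier clock, long maxBackwardMs) {
        this.clock = clock;
        this.maxBackwardMs = maxBackwardMs;
    }

    /** Returns a timestamp that is never smaller than the previous one. */
    synchronized long nextTimestamp() {
        long now = clock.getAsLong();
        if (now < lastTimestamp) {
            long drift = lastTimestamp - now;
            if (drift > maxBackwardMs) {
                // Large rollback: refuse, as the original nextId() does.
                throw new IllegalStateException(
                        "Clock moved backwards by " + drift + " ms");
            }
            // Small rollback: busy-wait until the clock catches up.
            while (now < lastTimestamp) {
                now = clock.getAsLong();
            }
        }
        lastTimestamp = now;
        return now;
    }

    public static void main(String[] args) {
        ClockGuard guard = new ClockGuard(System::currentTimeMillis, 5);
        System.out.println(guard.nextTimestamp());
    }
}
```

A generator could call nextTimestamp() in place of its raw clock read. The tolerance trades a short stall for availability, and larger rollbacks still need operator attention.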
