Why UUID Falls Short and How Snowflake Solves Distributed ID Generation

The article examines the limitations of UUIDs for distributed systems, outlines the strict requirements for global unique IDs, compares common approaches such as database auto‑increment and Redis, and provides a detailed analysis of Twitter's Snowflake algorithm with its structure, Java implementation, advantages, drawbacks, and mitigation strategies.

Architect's Guide

Problem

In large distributed systems, massive amounts of data and messages must be uniquely identified. Examples include the finance, payment, catering and hotel services at Meituan-Dianping, the data sets of movie platforms such as Maoyan, which keep growing after sharding, and entities like orders, riders and coupons, each of which requires a unique identifier. A system that can generate globally unique IDs is therefore essential.

Hard requirements for ID generation

Global uniqueness – no duplicate IDs may appear.

Trend‑increasing – ordered primary keys (e.g., B‑Tree indexes in MySQL InnoDB) improve write performance.

Monotonic increase – the next ID must be larger than the previous one for use cases such as transaction version numbers.

Information security – unpredictable IDs prevent competitors from inferring order volumes.

Timestamp inclusion – embedding a timestamp enables quick identification of when an ID was generated.

Availability requirements

High availability – 99.999% of ID requests must return a result.

Low latency – responses must be fast.

High QPS – the system should handle at least 100,000 ID requests per second.

Common solutions

UUID

Available directly in the JDK (java.util.UUID): a 36-character string of 32 hexadecimal digits and 4 hyphens, formatted as 8-4-4-4-12.

Pros: high performance, generated locally, no network cost.

Cons: unordered, long string, increases database index pressure; MySQL recommends short primary keys.

Because UUIDs are unordered, each insertion modifies the B‑tree index heavily, causing node splits and many under‑filled nodes, which degrades insert performance.
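
The properties above can be checked with a few lines against the JDK's java.util.UUID. This is only a minimal sketch; the class name UuidDemo and the helper newId are illustrative and not part of the original article.

```java
import java.util.UUID;

public class UuidDemo {
    /** Returns a random (version 4) UUID in its canonical text form. */
    static String newId() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        String a = newId();
        String b = newId();
        // Each string is 36 characters: 32 hex digits plus 4 hyphens (8-4-4-4-12).
        System.out.println(a + " length=" + a.length());
        System.out.println(b + " length=" + b.length());
        // Random UUIDs carry no ordering: b may sort before or after a,
        // so consecutive inserts land at arbitrary positions in a B-tree index.
        System.out.println("b sorts after a? " + (b.compareTo(a) > 0));
    }
}
```

Generation is purely local, which is where the performance advantage comes from; the cost shows up later, in the index.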

Database auto‑increment primary key

Implemented via REPLACE INTO, which inserts a row or replaces the existing one on unique‑key conflict.

Horizontal scaling is difficult; adding a new machine requires manually setting a large initial offset, which becomes impractical with dozens or hundreds of nodes.

Each ID request incurs a read‑write round‑trip to the database, violating low‑latency and high‑QPS requirements.

Redis global ID strategy

Redis guarantees atomicity with single‑threaded INCR / INCRBY operations.

In a 5‑node Redis cluster, using a step size of 5 and initial values 1‑5 yields the following sequences:

A: 1, 6, 11, 16, 21

B: 2, 7, 12, 17, 22

C: 3, 8, 13, 18, 23

D: 4, 9, 14, 19, 24

E: 5, 10, 15, 20, 25
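
The sequences above can be reproduced with a small in-memory simulation. This sketch does not talk to Redis: an AtomicLong stands in for each node's counter and addAndGet plays the role of INCRBY. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class SteppedCounterDemo {
    /** In-memory stand-in for one Redis node's counter. */
    static final class Node {
        private final AtomicLong counter;
        private final long step;

        Node(long initialValue, long step) {
            // Start one step below the initial value so the first call returns initialValue.
            this.counter = new AtomicLong(initialValue - step);
            this.step = step;
        }

        /** Atomically advances the counter by the step, like INCRBY key step. */
        long nextId() {
            return counter.addAndGet(step);
        }
    }

    /** Returns the first `count` IDs produced by a node with the given initial value and step. */
    static List<Long> firstIds(long initialValue, long step, int count) {
        Node node = new Node(initialValue, step);
        List<Long> ids = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            ids.add(node.nextId());
        }
        return ids;
    }

    public static void main(String[] args) {
        String names = "ABCDE";
        for (int n = 0; n < names.length(); n++) {
            // Node n starts at n + 1 and every node uses a step equal to the cluster size.
            System.out.println(names.charAt(n) + ": " + firstIds(n + 1, 5, 5));
        }
    }
}
```

Because the step equals the number of nodes and the initial values are distinct, the sequences never overlap. The step is fixed when the cluster is sized, so adding a sixth node later means re-planning the step and the initial values.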

Snowflake

Overview

Twitter’s distributed incremental ID algorithm. Repository: https://github.com/twitter-archive/snowflake

Time‑ordered generation.

64‑bit integer (max 19‑digit string).

IDs do not collide across nodes as long as every node has a unique combination of datacenter ID and worker ID; generation is highly efficient.

Structure

Bit layout (sign bit = 0):

41-bit timestamp in milliseconds (0 to 2^41 - 1, about 69 years counted from the chosen start epoch, not necessarily 1970).

5‑bit datacenter ID and 5‑bit worker ID (total 10 bits, up to 1024 nodes).

12‑bit sequence number (0‑4095) for IDs generated within the same millisecond.
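
The limits implied by this layout follow from simple bit arithmetic. The sketch below only restates the numbers above; the class name SnowflakeLayout and its helpers are illustrative.

```java
public class SnowflakeLayout {
    static final int SIGN_BITS = 1;
    static final int TIMESTAMP_BITS = 41;
    static final int DATACENTER_BITS = 5;
    static final int WORKER_BITS = 5;
    static final int SEQUENCE_BITS = 12;

    /** Largest value that fits in the given number of bits. */
    static long maxValue(int bits) {
        return (1L << bits) - 1;
    }

    /** Sum of all fields; must be 64 to fill a Java long exactly. */
    static int totalBits() {
        return SIGN_BITS + TIMESTAMP_BITS + DATACENTER_BITS + WORKER_BITS + SEQUENCE_BITS;
    }

    /** Years covered by the millisecond timestamp, counted from the start epoch. */
    static double timestampYears() {
        double millisPerYear = 1000.0 * 60 * 60 * 24 * 365;
        return (maxValue(TIMESTAMP_BITS) + 1) / millisPerYear;
    }

    /** Number of distinct (datacenter, worker) combinations. */
    static long nodeCount() {
        return 1L << (DATACENTER_BITS + WORKER_BITS); // 1024
    }

    /** Upper bound on IDs one node can issue per second: 4096 per millisecond. */
    static long idsPerSecondCeiling() {
        return (1L << SEQUENCE_BITS) * 1000L; // 4096000
    }

    public static void main(String[] args) {
        System.out.println("total bits          = " + totalBits());
        System.out.println("max timestamp (ms)  = " + maxValue(TIMESTAMP_BITS));
        System.out.printf("timestamp range     = %.2f years%n", timestampYears());
        System.out.println("nodes               = " + nodeCount());
        System.out.println("max sequence        = " + maxValue(SEQUENCE_BITS));
        System.out.println("ids/second ceiling  = " + idsPerSecondCeiling());
    }
}
```

The per-second figure is a ceiling set by the layout, not a benchmark; real throughput depends on the hardware and on lock contention.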

Reference implementation (Java)

public class SnowflakeIdWorker {
    private final long twepoch = 1598598185157L; // custom start epoch in ms; timestamps are stored relative to it
    private final long workerIdBits = 5L; // bits reserved for the worker ID
    private final long datacenterIdBits = 5L; // bits reserved for the datacenter ID
    private final long maxWorkerId = -1L ^ (-1L << workerIdBits); // 31
    private final long maxDatacenterId = -1L ^ (-1L << datacenterIdBits); // 31
    private final long sequenceBits = 12L; // bits reserved for the per-millisecond sequence
    private final long workerIdShift = sequenceBits; // 12
    private final long datacenterIdShift = sequenceBits + workerIdBits; // 17
    private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits; // 22
    private final long sequenceMask = -1L ^ (-1L << sequenceBits); // 4095
    private final long workerId;
    private final long datacenterId;
    private long sequence = 0L; // counter within the current millisecond
    private long lastTimestamp = -1L; // timestamp of the most recently issued ID

    public SnowflakeIdWorker(long workerId, long datacenterId) {
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId));
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }

    public synchronized long nextId() {
        long timestamp = timeGen();
        if (timestamp < lastTimestamp) {
            throw new RuntimeException(String.format("Clock moved backwards. Refusing to generate id for %d milliseconds", lastTimestamp - timestamp));
        }
        if (lastTimestamp == timestamp) {
            sequence = (sequence + 1) & sequenceMask;
            if (sequence == 0) {
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = timestamp;
        long id = ((timestamp - twepoch) << timestampLeftShift)
                | (datacenterId << datacenterIdShift)
                | (workerId << workerIdShift)
                | sequence;
        return id;
    }

    protected long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    protected long timeGen() {
        return System.currentTimeMillis();
    }

    public static void main(String[] args) {
        SnowflakeIdWorker idWorker = new SnowflakeIdWorker(0, 0);
        for (int i = 0; i < 1000; i++) {
            long id = idWorker.nextId();
            System.out.println(id);
        }
    }
}
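
To see what an ID produced by this layout contains, the fields can be extracted again with shifts and masks. The following is a sketch that assumes the same epoch and bit widths as the reference implementation; the class SnowflakeIdParts and its methods are illustrative additions, not part of the original code.

```java
public class SnowflakeIdParts {
    // Same epoch and bit widths as the reference implementation.
    static final long TWEPOCH = 1598598185157L;
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 5;
    static final int DATACENTER_BITS = 5;
    static final int WORKER_SHIFT = SEQUENCE_BITS;
    static final int DATACENTER_SHIFT = SEQUENCE_BITS + WORKER_BITS;
    static final int TIMESTAMP_SHIFT = SEQUENCE_BITS + WORKER_BITS + DATACENTER_BITS;

    /** Packs the four fields into one long, mirroring the expression in nextId(). */
    static long compose(long timestampMillis, long datacenterId, long workerId, long sequence) {
        return ((timestampMillis - TWEPOCH) << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }

    /** Wall-clock time in ms at which the ID was generated. */
    static long timestampMillis(long id) {
        return (id >>> TIMESTAMP_SHIFT) + TWEPOCH;
    }

    static long datacenterId(long id) {
        return (id >>> DATACENTER_SHIFT) & ((1L << DATACENTER_BITS) - 1);
    }

    static long workerId(long id) {
        return (id >>> WORKER_SHIFT) & ((1L << WORKER_BITS) - 1);
    }

    static long sequence(long id) {
        return id & ((1L << SEQUENCE_BITS) - 1);
    }

    public static void main(String[] args) {
        long id = compose(System.currentTimeMillis(), 3, 7, 42);
        System.out.println("id         = " + id);
        System.out.println("timestamp  = " + new java.util.Date(timestampMillis(id)));
        System.out.println("datacenter = " + datacenterId(id));
        System.out.println("worker     = " + workerId(id));
        System.out.println("sequence   = " + sequence(id));
    }
}
```

Because the timestamp occupies the highest bits, any ID from a later millisecond compares greater than any ID from an earlier one, whatever the node and sequence values are.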

Advantages

Timestamp in high bits makes the ID trend‑increasing.

Low‑order sequence provides uniqueness within the same millisecond.

No reliance on databases or third‑party services; high stability and generation performance (~260,000 IDs per second).

Flexible bit allocation allows adaptation to business needs.

Disadvantages

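Depends on machine clocks; if a clock moves backwards, duplicate IDs become possible. The reference implementation above guards against this by throwing an exception, which keeps IDs unique within a running process but makes the generator unavailable until the clock catches up, and it offers no protection if the process restarts after a rollback.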

In a distributed environment clocks are not perfectly synchronized, so global monotonicity is not guaranteed, though trend‑increasing satisfies most scenarios.

Mitigation

The clock dependency can be reduced with open-source generators such as Baidu's UidGenerator or Meituan-Dianping's Leaf, which build on Snowflake and add their own handling for worker ID assignment and clock rollback.
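
One simple way to soften the clock-rollback problem is to wait out small backward jumps instead of failing immediately. The sketch below illustrates that idea only; it is not how UidGenerator or Leaf are implemented. The class RollbackTolerantWorker, the maxBackwardMs threshold, the combined 10-bit nodeId and the injectable clock are all assumptions made for the example.

```java
import java.util.function.LongSupplier;

public class RollbackTolerantWorker {
    private static final long TWEPOCH = 1598598185157L;
    private static final int SEQUENCE_BITS = 12;
    private static final int NODE_BITS = 10; // datacenter and worker bits combined (assumption)
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    private final long nodeId;
    private final long maxBackwardMs; // largest rollback that is waited out (hypothetical knob)
    private final LongSupplier clock; // injectable so the behavior can be tested
    private long sequence = 0L;
    private long lastTimestamp = -1L;

    public RollbackTolerantWorker(long nodeId, long maxBackwardMs, LongSupplier clock) {
        if (nodeId < 0 || nodeId >= (1L << NODE_BITS)) {
            throw new IllegalArgumentException("nodeId out of range: " + nodeId);
        }
        this.nodeId = nodeId;
        this.maxBackwardMs = maxBackwardMs;
        this.clock = clock;
    }

    public synchronized long nextId() {
        long timestamp = clock.getAsLong();
        if (timestamp < lastTimestamp) {
            long drift = lastTimestamp - timestamp;
            if (drift > maxBackwardMs) {
                // Large rollback: refuse, exactly like the reference implementation.
                throw new IllegalStateException("Clock moved backwards by " + drift + " ms");
            }
            // Small rollback: spin until the clock reaches the last issued timestamp again.
            timestamp = waitUntil(lastTimestamp);
        }
        if (timestamp == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted in this millisecond: move to the next one.
                timestamp = waitUntil(lastTimestamp + 1);
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = timestamp;
        return ((timestamp - TWEPOCH) << (NODE_BITS + SEQUENCE_BITS))
                | (nodeId << SEQUENCE_BITS)
                | sequence;
    }

    /** Busy-waits until the clock reports at least the target time. */
    private long waitUntil(long target) {
        long now = clock.getAsLong();
        while (now < target) {
            now = clock.getAsLong();
        }
        return now;
    }

    public static void main(String[] args) {
        RollbackTolerantWorker worker =
                new RollbackTolerantWorker(1, 5, System::currentTimeMillis);
        for (int i = 0; i < 5; i++) {
            System.out.println(worker.nextId());
        }
    }
}
```

Waiting keeps IDs unique and increasing on a single node at the cost of a short pause. Like the reference implementation, it does not survive a restart, because lastTimestamp lives only in memory.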
