
How Snowflake Generates 64‑Bit Unique IDs: Deep Dive & Java Implementation

This article explains the SnowFlake distributed ID algorithm, detailing its 64‑bit structure, bit allocation for timestamps, datacenter and worker IDs, sequence numbers, and provides a complete Java implementation along with its advantages and limitations.

Java High-Performance Architecture

Mar 8, 2023


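SnowFlake is a distributed ID generation algorithm open-sourced by Twitter. It produces 64-bit long values that are unique across a cluster without any coordination between nodes at generation time. It is widely used in distributed systems because each ID embeds a timestamp, so IDs sort roughly by creation time and increase monotonically on any single node.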

The 64 bits are divided as follows: the most significant bit is unused, 41 bits store the millisecond‑precision timestamp, 5 bits represent the datacenter ID, another 5 bits represent the worker (machine) ID, and the remaining 12 bits serve as a per‑millisecond sequence number.

Because the highest bit is always 0, all generated IDs are positive. The 41‑bit timestamp can represent up to 2^41‑1 milliseconds, roughly 69 years. The 10‑bit machine identifier allows up to 1024 machines, typically split into 5 bits for datacenter (up to 32 datacenters) and 5 bits for worker (up to 32 workers per datacenter). The 12‑bit sequence permits up to 4096 IDs per millisecond on a single machine.
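The capacity figures above follow directly from the bit widths. The sketch below recomputes them; the class name `SnowflakeCapacity` and its helper methods are illustrative names, not part of the original implementation.

```java
// Illustrative helper (hypothetical names): derives the limits quoted above
// from the bit widths of each field.
class SnowflakeCapacity {
    static final int SIGN_BITS = 1;
    static final int TIMESTAMP_BITS = 41;
    static final int DATACENTER_BITS = 5;
    static final int WORKER_BITS = 5;
    static final int SEQUENCE_BITS = 12;

    // Largest value that fits in the given number of bits: 2^bits - 1.
    static long maxValue(int bits) {
        return (1L << bits) - 1;
    }

    // How many years the 41-bit millisecond counter can cover.
    static double timestampRangeYears() {
        double millisPerYear = 365.25 * 24 * 60 * 60 * 1000;
        return maxValue(TIMESTAMP_BITS) / millisPerYear;
    }

    public static void main(String[] args) {
        System.out.println(maxValue(TIMESTAMP_BITS));      // 2199023255551
        System.out.println(maxValue(DATACENTER_BITS) + 1); // 32 datacenters
        System.out.println(maxValue(WORKER_BITS) + 1);     // 32 workers per datacenter
        System.out.println(maxValue(SEQUENCE_BITS) + 1);   // 4096 IDs per millisecond
        System.out.println(timestampRangeYears());         // a little under 70
    }
}
```

At 4096 IDs per millisecond, a single node tops out at roughly 4.1 million IDs per second.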

When a service needs a unique ID, it contacts a SnowFlake node that knows its datacenter and worker IDs (e.g., datacenter = 17, worker = 12). The node combines the current timestamp, datacenter ID, worker ID, and sequence number using bit‑shifts to produce the final 64‑bit ID.
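To make the bit-shift step concrete, here is a minimal sketch that packs the example node (datacenter = 17, worker = 12) into an ID. The class name `SnowflakeCompose`, the timestamp offset of 1 ms, and the sequence value 5 are made-up inputs chosen so the result is easy to read in binary.

```java
// Illustrative sketch (hypothetical names): packs the four fields into one long.
class SnowflakeCompose {
    static final int WORKER_SHIFT = 12;      // sequence occupies bits 0..11
    static final int DATACENTER_SHIFT = 17;  // 12 sequence bits + 5 worker bits
    static final int TIMESTAMP_SHIFT = 22;   // 12 + 5 + 5

    // millisSinceEpoch is the current time minus the custom epoch.
    static long compose(long millisSinceEpoch, long datacenterId, long workerId, long sequence) {
        return (millisSinceEpoch << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }

    public static void main(String[] args) {
        long id = compose(1L, 17L, 12L, 5L);
        System.out.println(id);                      // 6471685
        // Fields from left to right: 1 | 10001 | 01100 | 000000000101
        System.out.println(Long.toBinaryString(id)); // 11000101100000000000101
    }
}
```

Because each field occupies its own bit range, the bitwise OR never mixes them, provided every input stays within its maximum value.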

SnowFlake algorithm implementation in Java

public class IdWorker {
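    // ID layout: 1 unused sign bit (always 0, so IDs stay positive)
    // | 41-bit timestamp | 5-bit datacenter ID | 5-bit worker ID | 12-bit sequence
    private final long workerId;          // 5-bit worker ID (0..31)
    private final long datacenterId;      // 5-bit datacenter ID (0..31)
    private long sequence;                // 12-bit sequence within the same millisecond
    // Custom epoch in milliseconds (2020-03-31 UTC); the 41-bit timestamp covers about 69 years from here
    private final long twepoch = 1585644268888L;
    private final long workerIdBits = 5L;
    private final long datacenterIdBits = 5L;
    private final long sequenceBits = 12L;
    // -1L ^ (-1L << n) sets the lowest n bits, i.e. the largest n-bit value
    private final long maxWorkerId = -1L ^ (-1L << workerIdBits);         // 31
    private final long maxDatacenterId = -1L ^ (-1L << datacenterIdBits); // 31
    private final long workerIdShift = sequenceBits;                                        // 12
    private final long datacenterIdShift = sequenceBits + workerIdBits;                     // 17
    private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits; // 22
    private final long sequenceMask = -1L ^ (-1L << sequenceBits);        // 4095
    private long lastTimestamp = -1L;     // timestamp of the most recently issued ID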

    public IdWorker(long workerId, long datacenterId, long sequence) {
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId));
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
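        // Note: the first nextId() call sees a new millisecond and resets the sequence to 0,
        // so this initial value has no effect on the generated IDs.
        this.sequence = sequence;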
    }

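    public synchronized long nextId() {
        long timestamp = timeGen();
        // Refuse to issue IDs while the clock is behind the last timestamp used.
        if (timestamp < lastTimestamp) {
            throw new RuntimeException(String.format("Clock moved backwards. Refusing to generate id for %d milliseconds", lastTimestamp - timestamp));
        }
        if (lastTimestamp == timestamp) {
            // Same millisecond: advance the sequence, wrapping to 0 after 4095.
            sequence = (sequence + 1) & sequenceMask;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            // New millisecond: restart the sequence.
            sequence = 0L;
        }
        lastTimestamp = timestamp;
        // Pack the fields: timestamp offset | datacenter ID | worker ID | sequence.
        return ((timestamp - twepoch) << timestampLeftShift) |
               (datacenterId << datacenterIdShift) |
               (workerId << workerIdShift) |
               sequence;
    }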

    // Busy-waits until the clock reports a millisecond later than lastTimestamp.
    private long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    private long timeGen() {
        return System.currentTimeMillis();
    }
}
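Because the fields sit at fixed bit positions, an ID can also be taken apart again, which is handy when debugging. The following sketch reverses the shifts used by `nextId()`; the class `SnowflakeDecode` and its method names are hypothetical, and it assumes the same `twepoch` and bit widths as the listing above.

```java
import java.time.Instant;

// Illustrative decoder (hypothetical names), assuming the 41/5/5/12 layout
// and the custom epoch used by IdWorker.
class SnowflakeDecode {
    static final long TWEPOCH = 1585644268888L;

    // Wall-clock time in epoch milliseconds at which the ID was generated.
    static long timestampMillis(long id) {
        return (id >>> 22) + TWEPOCH;
    }

    static long datacenterId(long id) {
        return (id >>> 17) & 0x1F;   // 5 bits
    }

    static long workerId(long id) {
        return (id >>> 12) & 0x1F;   // 5 bits
    }

    static long sequence(long id) {
        return id & 0xFFF;           // 12 bits
    }

    public static void main(String[] args) {
        long id = 6471685L; // timestamp offset 1, datacenter 17, worker 12, sequence 5
        System.out.println(Instant.ofEpochMilli(timestampMillis(id)));
        System.out.println(datacenterId(id)); // 17
        System.out.println(workerId(id));     // 12
        System.out.println(sequence(id));     // 5
    }
}
```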

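The algorithm's main advantages are high performance (IDs are generated in memory, with no database round-trip), large capacity (up to 4096 IDs per millisecond per node, roughly 4.1 million per second), and roughly time-ordered IDs, which tend to be friendlier to database indexes than random values. Its main drawback is reliance on the system clock. The implementation above refuses to generate IDs while the clock is behind the last timestamp it issued, and because lastTimestamp is held only in memory, a clock rollback that coincides with a restart can produce duplicate IDs. In addition, many deployments do not need the full 10-bit machine space, so the bit allocation can be adjusted for specific business needs.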
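One common way to soften the clock dependency is to wait out small backward jumps instead of failing immediately. The sketch below is one possible variant, not part of the original implementation: the class `DriftTolerantIdWorker`, the 5 ms tolerance `MAX_BACKWARD_MS`, and the injectable `LongSupplier` clock are all assumptions made for illustration.

```java
import java.util.function.LongSupplier;

// Illustrative variant (hypothetical names): tolerates small clock rollbacks
// by waiting until the clock catches up, and still fails on large ones.
class DriftTolerantIdWorker {
    static final long EPOCH = 1585644268888L;  // same custom epoch as IdWorker
    static final long MAX_BACKWARD_MS = 5L;    // assumed tolerance, tune per deployment
    private static final long SEQUENCE_MASK = (1L << 12) - 1;

    private final long datacenterId;
    private final long workerId;
    private final LongSupplier clock;          // injectable so the logic can be tested
    private long sequence = 0L;
    private long lastTimestamp = -1L;

    DriftTolerantIdWorker(long datacenterId, long workerId, LongSupplier clock) {
        if (datacenterId < 0 || datacenterId > 31 || workerId < 0 || workerId > 31) {
            throw new IllegalArgumentException("datacenterId and workerId must be in 0..31");
        }
        this.datacenterId = datacenterId;
        this.workerId = workerId;
        this.clock = clock;
    }

    synchronized long nextId() {
        long now = clock.getAsLong();
        if (now < lastTimestamp) {
            long drift = lastTimestamp - now;
            if (drift > MAX_BACKWARD_MS) {
                throw new IllegalStateException("Clock moved backwards by " + drift + " ms");
            }
            // Small rollback: wait until the clock reaches the last timestamp again.
            now = waitUntil(lastTimestamp);
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted: wait for the next millisecond.
                now = waitUntil(lastTimestamp + 1);
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << 22) | (datacenterId << 17) | (workerId << 12) | sequence;
    }

    // Busy-waits until the clock reports a value of at least target.
    private long waitUntil(long target) {
        long now = clock.getAsLong();
        while (now < target) {
            now = clock.getAsLong();
        }
        return now;
    }
}
```

In production the clock would typically be `System::currentTimeMillis`. This variant only covers rollbacks while the process is running; guarding against a rollback across a restart would additionally require persisting the last timestamp somewhere durable.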
