
Mastering Snowflake: Build High‑Performance Distributed IDs in Java

This article explains the Snowflake distributed ID algorithm, compares it with other ID generation methods, details its 64‑bit structure, provides a complete Java implementation with thread‑safe code, and discusses practical limits such as the 69‑year timestamp and front‑end numeric handling.

Senior Brother's Insights

Feb 22, 2022

Introduction

Generating unique identifiers is a common requirement in distributed systems and business applications. IDs must be globally unique, roughly ordered, highly available, and performant. The Snowflake algorithm, originally open‑sourced by Twitter, satisfies these constraints.

Common Distributed ID Generation Techniques

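UUID : Generated in Java with the built-in java.util.UUID class, typically as a 36-character random string. Collisions are practically impossible, but the values are hard to read, carry no ordering, and take more space than a 64-bit integer.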

Snowflake : Twitter’s 64‑bit integer ID algorithm. High performance and monotonic on a single node. Repository: https://github.com/twitter-archive/snowflake/tree/snowflake-2010

UidGenerator : Baidu’s open‑source generator based on Snowflake. Documentation: https://github.com/baidu/uid-generator/blob/master/README.zh_cn.md

Leaf : Meituan’s open‑source generator that ensures global uniqueness and trend‑increasing IDs, but depends on relational databases and Zookeeper. Implementation reference: https://tech.meituan.com/2017/04/21/mt-leaf.html
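
To make the contrast between the first two options concrete, the following minimal sketch prints a random UUID next to the largest value a 64-bit long can hold. The class name UuidVersusLong and its helper are illustrative only and are not part of any of the libraries listed above.

```java
import java.util.UUID;

final class UuidVersusLong {

    /** Returns a random (version 4) UUID in its canonical text form. */
    static String randomUuidText() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        String uuid = randomUuidText();
        // Canonical form: 32 hex digits plus 4 hyphens = 36 characters.
        System.out.println(uuid + " (" + uuid.length() + " chars)");
        // A Snowflake-style ID fits in one long, i.e. at most 19 decimal digits.
        String maxLong = Long.toString(Long.MAX_VALUE);
        System.out.println(maxLong + " (" + maxLong.length() + " digits)");
    }
}
```

Two UUIDs generated one after the other have no predictable order, which is the main reason they fit poorly as primary keys in ordered indexes.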

Snowflake Algorithm Overview

The algorithm encodes a timestamp, datacenter identifier, worker identifier, and a sequence number into a single 64‑bit signed integer, providing natural ordering while preserving uniqueness across nodes.

Bit Allocation

The 64‑bit ID is divided as follows:

1 bit: Unused sign bit (always 0).

41 bits: Millisecond‑precision timestamp (supports ~69 years).

10 bits: Machine identifier (5‑bit datacenter ID + 5‑bit worker ID), allowing up to 1024 nodes.

12 bits: Sequence number within the same millisecond (values 0 to 4095, so up to 4096 IDs per node per millisecond).

In Java the 64‑bit value is stored in a long.
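
The layout above can be sketched as plain shift-and-mask arithmetic. The class SnowflakeLayout and its method names are assumptions made for this illustration; the field widths are the ones listed above, and the timestamp is expressed as a delta from whatever epoch the generator uses.

```java
final class SnowflakeLayout {
    // Field widths taken from the bit allocation above (1 + 41 + 5 + 5 + 12 = 64).
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 5;
    static final int DATACENTER_BITS = 5;
    static final int TIMESTAMP_BITS = 41;

    // Each field starts where the fields to its right end.
    static final int WORKER_SHIFT = SEQUENCE_BITS;
    static final int DATACENTER_SHIFT = WORKER_SHIFT + WORKER_BITS;
    static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS;

    /** Bit mask with the lowest {@code bits} bits set, e.g. mask(12) == 4095. */
    static long mask(int bits) {
        return (1L << bits) - 1;
    }

    /** Packs the four fields into one long; callers must pass values that fit their widths. */
    static long compose(long timestampDelta, long datacenterId, long workerId, long sequence) {
        return (timestampDelta << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }

    static long timestampDelta(long id) { return (id >>> TIMESTAMP_SHIFT) & mask(TIMESTAMP_BITS); }
    static long datacenterId(long id)   { return (id >>> DATACENTER_SHIFT) & mask(DATACENTER_BITS); }
    static long workerId(long id)       { return (id >>> WORKER_SHIFT) & mask(WORKER_BITS); }
    static long sequence(long id)       { return id & mask(SEQUENCE_BITS); }

    public static void main(String[] args) {
        long id = compose(123_456_789L, 11, 11, 42);
        System.out.println("id         = " + id);
        System.out.println("timestamp  = " + timestampDelta(id));
        System.out.println("datacenter = " + datacenterId(id));
        System.out.println("worker     = " + workerId(id));
        System.out.println("sequence   = " + sequence(id));
    }
}
```

Because the timestamp occupies the highest bits, comparing two IDs as plain numbers compares their timestamps first, which is where the rough time ordering comes from.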

Java Implementation

The following Java class implements the Snowflake algorithm with configurable datacenter and worker IDs. The nextId method is synchronized to guarantee thread safety.

public class SnowFlake {

    /** Starting timestamp (can be set to a recent past time) */
    private static final long START_STAMP = 1480166465631L;

    /** Number of bits allocated to each part */
    private static final long SEQUENCE_BIT = 12;
    private static final long MACHINE_BIT = 5;
    private static final long DATA_CENTER_BIT = 5;

    /** Maximum values for each part */
    private static final long MAX_DATA_CENTER_NUM = ~(-1L << DATA_CENTER_BIT);
    private static final long MAX_MACHINE_NUM = ~(-1L << MACHINE_BIT);
    private static final long MAX_SEQUENCE = ~(-1L << SEQUENCE_BIT);

    /** Left shift values */
    private static final long MACHINE_LEFT = SEQUENCE_BIT;
    private static final long DATA_CENTER_LEFT = SEQUENCE_BIT + MACHINE_BIT;
    private static final long TIMESTAMP_LEFT = DATA_CENTER_LEFT + DATA_CENTER_BIT;

    private final long dataCenterId;
    private final long machineId;
    private long sequence = 0L;
    private long lastStamp = -1L;

    public SnowFlake(long dataCenterId, long machineId) {
        if (dataCenterId > MAX_DATA_CENTER_NUM || dataCenterId < 0) {
            throw new IllegalArgumentException("dataCenterId can't be greater than MAX_DATA_CENTER_NUM or less than 0");
        }
        if (machineId > MAX_MACHINE_NUM || machineId < 0) {
            throw new IllegalArgumentException("machineId can't be greater than MAX_MACHINE_NUM or less than 0");
        }
        this.dataCenterId = dataCenterId;
        this.machineId = machineId;
    }

    /** Generate the next ID */
    public synchronized long nextId() {
        long currStamp = getNewStamp();
        if (currStamp < lastStamp) {
            throw new RuntimeException("Clock moved backwards. Refusing to generate id");
        }
        if (currStamp == lastStamp) {
            // Same millisecond: increment sequence
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0L) {
                // Sequence overflow, wait for next millisecond
                currStamp = getNextMill();
            }
        } else {
            // New millisecond: reset sequence
            sequence = 0L;
        }
        lastStamp = currStamp;
        return (currStamp - START_STAMP) << TIMESTAMP_LEFT // timestamp part
                | dataCenterId << DATA_CENTER_LEFT   // datacenter part
                | machineId << MACHINE_LEFT         // machine part
                | sequence;                         // sequence part
    }

    private long getNextMill() {
        long mill = getNewStamp();
        while (mill <= lastStamp) {
            mill = getNewStamp();
        }
        return mill;
    }

    private long getNewStamp() {
        return System.currentTimeMillis();
    }

    public static void main(String[] args) {
        SnowFlake snowFlake = new SnowFlake(11, 11);
        long start = System.currentTimeMillis();
        for (int i = 0; i < 10; i++) {
            System.out.println(snowFlake.nextId());
        }
        System.out.println(System.currentTimeMillis() - start);
    }
}

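Because nextId is a synchronized instance method, threads that share one SnowFlake instance cannot generate duplicate IDs. The lock belongs to the instance, so two instances created with the same datacenter and worker IDs, even inside one JVM, are not protected from each other; create a single generator per node and share it.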
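
One way to check this behaviour is to hammer a single generator from several threads and count the distinct results. The sketch below is self-contained, so it uses MiniSnowflake, a condensed stand-in written for this check that follows the same logic as the class above; the thread count and batch size are arbitrary assumptions.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class SharedGeneratorCheck {

    /** Condensed stand-in for the SnowFlake class above (hypothetical, for this check only). */
    static final class MiniSnowflake {
        private static final long EPOCH = 1480166465631L;
        private static final long MAX_SEQUENCE = 4095L;
        private final long nodeBits;
        private long sequence = 0L;
        private long lastStamp = -1L;

        MiniSnowflake(long dataCenterId, long machineId) {
            // 5-bit datacenter ID at bit 17, 5-bit machine ID at bit 12.
            this.nodeBits = (dataCenterId << 17) | (machineId << 12);
        }

        synchronized long nextId() {
            long now = System.currentTimeMillis();
            if (now < lastStamp) {
                throw new IllegalStateException("Clock moved backwards");
            }
            if (now == lastStamp) {
                sequence = (sequence + 1) & MAX_SEQUENCE;
                if (sequence == 0L) {
                    // Sequence exhausted: spin until the clock reaches the next millisecond.
                    while (now <= lastStamp) {
                        now = System.currentTimeMillis();
                    }
                }
            } else {
                sequence = 0L;
            }
            lastStamp = now;
            return ((now - EPOCH) << 22) | nodeBits | sequence;
        }
    }

    /** Generates threads * perThread IDs from one shared instance and returns how many were distinct. */
    static int countDistinct(int threads, int perThread) {
        MiniSnowflake generator = new MiniSnowflake(1, 1);
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    seen.add(generator.nextId());
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("Workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int threads = 8;
        int perThread = 10_000;
        System.out.println(countDistinct(threads, perThread) + " distinct IDs out of " + threads * perThread);
    }
}
```

If the synchronized keyword is removed from nextId, the same check may report fewer distinct IDs than requested, because two threads can read and update sequence and lastStamp at the same time.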

Timestamp Limitation (41‑bit)

A 41‑bit timestamp can represent up to 2⁴¹‑1 milliseconds, which is roughly 69 years. The following snippet calculates the year span:

import java.math.BigInteger;

public class TimestampSpan {
    public static void main(String[] args) {
        String minTimeStampStr = "00000000000000000000000000000000000000000"; // 41 zero bits
        String maxTimeStampStr = "11111111111111111111111111111111111111111"; // 41 one bits
        // Parse the binary strings (radix 2) into long values
        long minTimeStamp = new BigInteger(minTimeStampStr, 2).longValue();
        long maxTimeStamp = new BigInteger(maxTimeStampStr, 2).longValue();
        // Milliseconds in a 365-day year
        long oneYearMills = 1L * 1000 * 60 * 60 * 24 * 365;
        System.out.println((maxTimeStamp - minTimeStamp) / oneYearMills); // prints 69
    }
}

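The window is counted from START_STAMP rather than from 1970, which is why the starting timestamp should be set to a recent date. Once the window is used up, the timestamp delta no longer fits in 41 bits; in the class above the shifted value would spill into the sign bit and the IDs would turn negative. One way to continue after that point is to reset the starting timestamp and move to datacenter or worker IDs that have never been used, so that new IDs cannot collide with old ones.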
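
Since the 41-bit field stores the time elapsed since the generator's starting timestamp, the end of the window depends on that constant. The sketch below, with the illustrative class name SnowflakeWindow, converts the START_STAMP value from the implementation above into calendar dates using java.time.

```java
import java.time.Instant;

final class SnowflakeWindow {
    /** Starting timestamp used by the SnowFlake class above. */
    static final long START_STAMP = 1480166465631L;
    /** Largest delta that fits in 41 bits. */
    static final long MAX_DELTA = (1L << 41) - 1;

    /** Last instant whose delta from the given epoch still fits in 41 bits. */
    static Instant lastUsableInstant(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis + MAX_DELTA);
    }

    public static void main(String[] args) {
        System.out.println("start:  " + Instant.ofEpochMilli(START_STAMP));
        System.out.println("expiry: " + lastUsableInstant(START_STAMP));
    }
}
```

With this starting timestamp, which falls in late 2016, the window ends in 2086. A starting timestamp of 0 (the Unix epoch) would have moved that limit back to 2039, so nearly 47 years of the range would already have been spent when the generator was written.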

Front‑End Numeric Considerations

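Snowflake IDs are 64-bit integers. JavaScript's Number type is a double-precision floating-point value and represents integers exactly only up to 2⁵³‑1 (Number.MAX_SAFE_INTEGER), so a larger ID is silently rounded rather than rejected. Transmit IDs to the front end as strings to prevent this loss of precision, and convert them back to numeric types only on the server side when necessary.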
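
The rounding can be reproduced on the server side, because a Java double uses the same 64-bit floating-point format as a JavaScript Number. The following sketch, with the hypothetical class IdAsString and helper names chosen for this example, shows the loss and a string round trip that avoids it.

```java
final class IdAsString {
    /** Largest integer below which every integer is exactly representable as a double: 2^53 - 1. */
    static final long MAX_SAFE_INTEGER = (1L << 53) - 1;

    /** True if the value lies in the range a JavaScript Number can represent exactly. */
    static boolean isSafeForJavaScript(long id) {
        return id >= -MAX_SAFE_INTEGER && id <= MAX_SAFE_INTEGER;
    }

    /** Text form to place in an API response instead of the raw number. */
    static String toWire(long id) {
        return Long.toString(id);
    }

    /** Parses the text form back into a long on the server. */
    static long fromWire(String text) {
        return Long.parseLong(text);
    }

    public static void main(String[] args) {
        long id = 9007199254740993L; // 2^53 + 1, just outside the safe range

        // Passing through a double rounds the value to the nearest representable number.
        long viaDouble = (long) (double) id;
        System.out.println("original:   " + id);
        System.out.println("via double: " + viaDouble);
        System.out.println("safe:       " + isSafeForJavaScript(id));

        // The string form survives unchanged.
        System.out.println("via string: " + fromWire(toWire(id)));
    }
}
```

How the string conversion is wired into the JSON layer depends on the serialization library in use, so it is left out here.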

Conclusion

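Snowflake provides a robust solution for generating globally unique, roughly ordered IDs with high throughput, suitable for most distributed applications. Developers should be aware of its reliance on the system clock: the implementation above refuses to generate IDs while the clock is behind the last timestamp it used, but a rollback that goes unnoticed, for example across a process restart, can still cause duplicates. They should also keep in mind the 69-year timestamp ceiling and the need to handle 64-bit values correctly in front-end environments.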


