How Snowflake Generates Globally Unique IDs: Deep Dive and Java Implementation
This article explains Twitter's Snowflake algorithm, detailing its 64‑bit structure, the role of each bit segment, a step‑by‑step Java implementation, and the algorithm's performance advantages and practical limitations in distributed backend systems.
Overview of Snowflake
Snowflake is a distributed ID generation algorithm originally open‑sourced by Twitter. It creates a 64‑bit long integer that is globally unique across a distributed system by embedding a timestamp, datacenter ID, machine ID, and a per‑millisecond sequence number.
Bit Allocation
The 64 bits are divided as follows:
1 unused bit (always 0) to keep the ID positive.
41 bits for the timestamp, stored as milliseconds elapsed since a custom epoch, which covers roughly 69 years (2^41 milliseconds) before the field overflows.
5 bits for the datacenter (or “machine room”) ID, supporting up to 32 datacenters.
5 bits for the machine ID within a datacenter, supporting up to 32 machines per datacenter.
12 bits for a sequence number, enabling up to 4096 IDs to be generated within the same millisecond on a single machine.
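Because the timestamp occupies the highest bits, IDs produced by a single generator increase strictly over time, and IDs from different machines sort by creation time to within the accuracy of clock synchronization between those machines.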
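The layout above can be sketched as a pair of pack and unpack helpers. This is an illustration only: the class name `SnowflakeLayout` and its method names are hypothetical, and the timestamp passed in is assumed to already be relative to the custom epoch.

```java
// A minimal sketch of the 1 + 41 + 5 + 5 + 12 bit layout.
// SnowflakeLayout and its method names are hypothetical, chosen for illustration.
class SnowflakeLayout {
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 5;
    static final int DATACENTER_BITS = 5;

    static final int WORKER_SHIFT = SEQUENCE_BITS;                      // 12
    static final int DATACENTER_SHIFT = SEQUENCE_BITS + WORKER_BITS;    // 17
    static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS; // 22

    static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;     // 4095
    static final long WORKER_MASK = (1L << WORKER_BITS) - 1;         // 31
    static final long DATACENTER_MASK = (1L << DATACENTER_BITS) - 1; // 31

    // Packs the four fields into one long. The timestamp is assumed to be
    // milliseconds since the custom epoch and to fit in 41 bits.
    static long compose(long timestamp, long datacenterId, long workerId, long sequence) {
        return (timestamp << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }

    static long timestampOf(long id)  { return id >>> TIMESTAMP_SHIFT; }
    static long datacenterOf(long id) { return (id >>> DATACENTER_SHIFT) & DATACENTER_MASK; }
    static long workerOf(long id)     { return (id >>> WORKER_SHIFT) & WORKER_MASK; }
    static long sequenceOf(long id)   { return id & SEQUENCE_MASK; }

    public static void main(String[] args) {
        long id = compose(1000L, 3L, 7L, 42L);
        System.out.printf("id=%d timestamp=%d datacenter=%d worker=%d sequence=%d%n",
                id, timestampOf(id), datacenterOf(id), workerOf(id), sequenceOf(id));
    }
}
```

Since the timestamp sits in the highest used bits, an ID with a later timestamp always compares greater than one with an earlier timestamp, whatever the other three fields contain.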
Java Implementation
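The following Java class implements the Snowflake algorithm. It validates the datacenter and machine IDs, refuses to generate IDs if the system clock has moved backward, and keeps IDs unique within a millisecond by incrementing the sequence number and waiting for the next millisecond once the sequence is exhausted.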
public class IdWorker {
    // ID layout, high to low: 1 unused sign bit (always 0), 41-bit timestamp,
    // 5-bit datacenter ID, 5-bit worker ID, 12-bit sequence.
    private final long workerId;     // 5 bits
    private final long datacenterId; // 5 bits
    private long sequence;           // 12 bits
    private final long twepoch = 1585644268888L; // custom epoch, in milliseconds
    private final long workerIdBits = 5L;
    private final long datacenterIdBits = 5L;
    private final long sequenceBits = 12L;
    // -1L ^ (-1L << n) yields n one-bits, i.e. the largest value that fits in n bits.
    private final long maxWorkerId = -1L ^ (-1L << workerIdBits);         // 31
    private final long maxDatacenterId = -1L ^ (-1L << datacenterIdBits); // 31
    private final long workerIdShift = sequenceBits;                      // 12
    private final long datacenterIdShift = sequenceBits + workerIdBits;   // 17
    private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits; // 22
    private final long sequenceMask = -1L ^ (-1L << sequenceBits);        // 4095
    private long lastTimestamp = -1L;

    public IdWorker(long workerId, long datacenterId, long sequence) {
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId));
        }
        this.workerId = workerId;
        this.datacenterId = datacenterId;
        // Starting sequence; nextId() resets it to 0 whenever the millisecond
        // changes, which includes the first call.
        this.sequence = sequence;
    }

    // synchronized so that concurrent callers never observe the same
    // (timestamp, sequence) pair.
    public synchronized long nextId() {
        long timestamp = timeGen();
        // The clock moved backward: refuse to generate rather than risk a duplicate.
        if (timestamp < lastTimestamp) {
            System.err.printf("clock is moving backwards. Rejecting requests until %d.%n", lastTimestamp);
            throw new RuntimeException(String.format("Clock moved backwards. Refusing to generate id for %d milliseconds", lastTimestamp - timestamp));
        }
        if (lastTimestamp == timestamp) {
            // Same millisecond: advance the sequence, wrapping at 4096.
            sequence = (sequence + 1) & sequenceMask;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            // New millisecond: restart the sequence.
            sequence = 0;
        }
        lastTimestamp = timestamp;
        // Pack the fields into their bit positions.
        return ((timestamp - twepoch) << timestampLeftShift) |
                (datacenterId << datacenterIdShift) |
                (workerId << workerIdShift) | sequence;
    }

    // Busy-waits until the clock advances past lastTimestamp.
    private long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    private long timeGen() {
        return System.currentTimeMillis();
    }

    public static void main(String[] args) {
        // Example usage: worker 1 in datacenter 1, starting sequence 0.
        IdWorker worker = new IdWorker(1, 1, 0);
        System.out.println(worker.nextId());
    }
}
Advantages
High performance and availability: IDs are generated entirely in memory without database calls.
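Large capacity: With a 12‑bit sequence, each machine can produce up to 4096 IDs per millisecond, or about 4 million IDs per second.
Time‑ordered and sortable: IDs from a single generator increase over time, and IDs from different machines are ordered by creation time to within clock accuracy, which keeps inserts into database indexes mostly sequential.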
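The capacity figures follow directly from the field widths. The arithmetic can be sketched as below; the class name `SnowflakeCapacity` and its method names are hypothetical, and the year calculation assumes 365-day years.

```java
// A minimal sketch of the capacity arithmetic implied by the bit allocation.
// SnowflakeCapacity is a hypothetical name used only for this illustration.
class SnowflakeCapacity {
    // IDs one generator can issue per second: 2^12 per millisecond * 1000 ms.
    static long idsPerSecondPerNode() {
        return (1L << 12) * 1000L;
    }

    // Distinct generators addressable by 5 datacenter bits and 5 machine bits.
    static long maxNodes() {
        return (1L << 5) * (1L << 5);
    }

    // Span covered by a 41-bit millisecond timestamp, in 365-day years.
    static double timestampYears() {
        double msPerYear = 1000.0 * 60 * 60 * 24 * 365;
        return (1L << 41) / msPerYear;
    }

    public static void main(String[] args) {
        System.out.println("IDs per second per node: " + idsPerSecondPerNode());
        System.out.println("Addressable nodes: " + maxNodes());
        System.out.printf("Timestamp span in years: %.1f%n", timestampYears());
    }
}
```

These are theoretical ceilings. Real throughput also depends on lock contention in the `synchronized` method and on how often the generator has to wait for the next millisecond.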
Limitations
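The algorithm depends on the system clock. If a server's clock moves backward, the generator could reuse timestamps it has already issued and produce duplicate IDs; the implementation above guards against this by throwing an exception, which means no IDs can be generated on that machine until the clock catches up. In addition, the number of datacenters and machines in practice is often far below the theoretical limits, so the bit allocation can be adjusted to better fit specific business needs.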
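One common way to soften the clock limitation is to tolerate small backward jumps by waiting for the clock to catch up, and to fail only on large ones. The following is a sketch of that idea, not part of the original implementation: the class `ClockGuard`, its method `currentAtLeast`, and the `maxBackwardMs` threshold are all assumptions introduced here for illustration.

```java
import java.util.function.LongSupplier;

// A minimal sketch of tolerating small clock rollbacks.
// ClockGuard, currentAtLeast and maxBackwardMs are hypothetical names.
class ClockGuard {
    private final LongSupplier clock;   // time source in milliseconds
    private final long maxBackwardMs;   // largest rollback we are willing to wait out

    ClockGuard(LongSupplier clock, long maxBackwardMs) {
        this.clock = clock;
        this.maxBackwardMs = maxBackwardMs;
    }

    // Returns a timestamp that is not earlier than lastTimestamp.
    // Small rollbacks are absorbed by busy-waiting; larger ones throw.
    long currentAtLeast(long lastTimestamp) {
        long now = clock.getAsLong();
        if (now >= lastTimestamp) {
            return now;
        }
        long drift = lastTimestamp - now;
        if (drift > maxBackwardMs) {
            throw new IllegalStateException(
                    "Clock moved backwards by " + drift + " ms; refusing to generate IDs");
        }
        // Rollback is within tolerance: poll until the clock catches up.
        while (now < lastTimestamp) {
            now = clock.getAsLong();
        }
        return now;
    }

    public static void main(String[] args) {
        ClockGuard guard = new ClockGuard(System::currentTimeMillis, 5L);
        long last = System.currentTimeMillis();
        System.out.println(guard.currentAtLeast(last) >= last);
    }
}
```

A generator could call `currentAtLeast(lastTimestamp)` in place of reading the clock directly. The trade-off is that callers block for up to `maxBackwardMs` milliseconds during a small rollback instead of receiving an immediate error.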
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
