Why Our Custom Snowflake ID Collided and How to Build a Reliable Generator
A severe production incident produced duplicate order IDs, and the root cause was a flawed custom Snowflake implementation. This article reviews the standard algorithm, identifies the critical design mistakes in the custom variant, and gives practical recommendations for safe, scalable ID generation in distributed back‑end systems.
Standard Snowflake Algorithm
The original Snowflake ID is a 64‑bit signed integer composed of:
+-------+-------------------+-------------------+----------------+------------------+
| 1 Bit | 41 Bits Timestamp | 5 Bits DataCenter | 5 Bits Machine | 12 Bits Sequence |
+-------+-------------------+-------------------+----------------+------------------+
Sign bit: always 0 to ensure a positive number.
41‑bit timestamp: milliseconds elapsed since a custom epoch, supporting about 69 years.
10‑bit machine identifier: 5 bits for data‑center ID and 5 bits for worker ID.
12‑bit sequence: allows up to 4096 IDs per millisecond on the same node.
Advantages: high performance, time‑ordered, and well‑suited for distributed systems.
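The bit layout above maps directly onto shifts and masks. The following is a minimal sketch of how the four fields could be packed into and read back from a 64‑bit value. The class name, method names, and the 2020 epoch are illustrative assumptions, not taken from any particular library.

```java
// Minimal sketch of the standard 1/41/5/5/12 layout.
// Class name, method names, and EPOCH_MS are illustrative assumptions.
final class StandardSnowflakeLayout {
    static final long EPOCH_MS = 1_577_836_800_000L; // assumed custom epoch: 2020-01-01T00:00:00Z

    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 5;
    static final int DATACENTER_BITS = 5;

    static final int WORKER_SHIFT = SEQUENCE_BITS;                          // 12
    static final int DATACENTER_SHIFT = WORKER_SHIFT + WORKER_BITS;         // 17
    static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS;  // 22

    // Packs the fields; callers are expected to pass values that fit their bit widths.
    static long compose(long timestampMs, long dataCenterId, long workerId, long sequence) {
        return ((timestampMs - EPOCH_MS) << TIMESTAMP_SHIFT)
                | (dataCenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }

    static long timestampOf(long id)  { return (id >>> TIMESTAMP_SHIFT) + EPOCH_MS; }
    static long dataCenterOf(long id) { return (id >>> DATACENTER_SHIFT) & 0x1F; }
    static long workerOf(long id)     { return (id >>> WORKER_SHIFT) & 0x1F; }
    static long sequenceOf(long id)   { return id & 0xFFF; }

    public static void main(String[] args) {
        long id = compose(EPOCH_MS + 1, 3, 7, 42);
        System.out.println("id=" + id
                + " dataCenter=" + dataCenterOf(id)
                + " worker=" + workerOf(id)
                + " sequence=" + sequenceOf(id));
    }
}
```

Because the timestamp occupies the highest bits below the sign bit, IDs generated later compare greater than IDs generated earlier, which is where the time ordering comes from.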
Custom Snowflake Variant and Its Problems
In an internal (second‑party) package maintained by another team, the ID layout was observed as:
+------------------------+----------------------+---------------+-------------------+-----------------+
| 31 Bits TimestampDelta | 13 Bits DataCenterID | 4 Bits WorkID | 8 Bits BusinessID | 8 Bits Sequence |
+------------------------+----------------------+---------------+-------------------+-----------------+
Problem 1 – Timestamp limited to 31 bits
Supports only ~24.85 days before the timestamp wraps.
After 2³¹ milliseconds the timestamp rolls over, causing collisions.
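With a custom epoch starting in 2018, the field had already wrapped many times by 2025 (on the order of a hundred wraps), so every ~24.85‑day window reuses the same timestamp values.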
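The arithmetic behind Problem 1 can be checked in a few lines. This sketch assumes the wrap behaves like truncation to the low 31 bits of the elapsed milliseconds; the class and method names are hypothetical.

```java
import java.time.Duration;

// Sketch of the 31-bit timestamp wrap, assuming the field keeps only the low 31 bits.
final class TimestampWrapDemo {
    static final int TIMESTAMP_BITS = 31;
    static final long TIMESTAMP_MASK = (1L << TIMESTAMP_BITS) - 1;

    // Value that actually ends up in the 31-bit field for a given elapsed time.
    static long storedDelta(long elapsedMs) {
        return elapsedMs & TIMESTAMP_MASK;
    }

    public static void main(String[] args) {
        long rangeMs = 1L << TIMESTAMP_BITS; // 2,147,483,648 ms
        System.out.println("Whole days before wrap: " + Duration.ofMillis(rangeMs).toDays());
        System.out.println("Days (fractional): " + rangeMs / 86_400_000.0);

        // Two moments exactly one range apart store the same timestamp value.
        long elapsed = 123_456L;
        System.out.println("Same stored value after one wrap: "
                + (storedDelta(elapsed) == storedDelta(elapsed + rangeMs)));
    }
}
```

Once two moments map to the same stored timestamp, uniqueness depends entirely on the remaining fields, which is why Problems 2 and 3 turn the wrap into real collisions.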
Problem 2 – Business ID derived from IP suffix
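The business ID is taken from the last octet of the machine's IPv4 address (e.g., "1" from 192.168.0.1). Any two machines whose addresses end in the same octet, such as 192.168.0.1 and 192.168.1.1, receive the same value, so duplicate IDs across machines are likely.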
Problem 3 – WorkId and DataCenterId left at zero
All instances share the same node identifiers, effectively nullifying the uniqueness guarantees of the algorithm.
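To make Problems 2 and 3 concrete, the sketch below packs the observed 31/13/4/8/8 layout and shows two hosts producing the same ID. The field order inside the 64‑bit value follows the diagram from left (high bits) to right (low bits), which is an assumption, as are the class name, helper names, and sample addresses.

```java
// Sketch of the custom 31/13/4/8/8 layout; names and bit order are assumptions.
final class CustomLayoutCollisionDemo {
    // Hypothetical helper: last octet of a dotted IPv4 string, used as the business ID.
    static int lastOctet(String ipv4) {
        return Integer.parseInt(ipv4.substring(ipv4.lastIndexOf('.') + 1));
    }

    static long pack(long timestampDelta, long dataCenterId, long workId,
                     long businessId, long sequence) {
        return ((timestampDelta & 0x7FFF_FFFFL) << 33) // 31 bits
                | ((dataCenterId & 0x1FFF) << 20)      // 13 bits
                | ((workId & 0xF) << 16)               // 4 bits
                | ((businessId & 0xFF) << 8)           // 8 bits
                | (sequence & 0xFF);                   // 8 bits
    }

    static long idFor(String ip, long dataCenterId, long workId,
                      long timestampDelta, long sequence) {
        return pack(timestampDelta, dataCenterId, workId, lastOctet(ip), sequence);
    }

    public static void main(String[] args) {
        // Both nodes leave dataCenterId and workId at zero, as in the incident.
        long nodeA = idFor("192.168.0.1", 0, 0, 1_000L, 0);
        long nodeB = idFor("192.168.1.1", 0, 0, 1_000L, 0);
        System.out.println("Same ID on two hosts: " + (nodeA == nodeB));

        // A distinct workId is enough to separate them again.
        long nodeC = idFor("192.168.1.1", 0, 1, 1_000L, 0);
        System.out.println("Distinct with workId set: " + (nodeA != nodeC));
    }
}
```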
Lessons Learned
Avoid reinventing common components: Snowflake involves clock‑backward handling, bit manipulation, and distributed coordination; mature libraries are far safer.
Never trust third‑party packages blindly: Review the implementation and understand how uniqueness is guaranteed.
Configure machine IDs properly: Allocate worker and data‑center IDs centrally instead of deriving them from fragile IP suffixes.
Test edge cases: Simulate long‑running scenarios, sequence overflow, and clock rollback to ensure robustness.
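Edge cases such as clock rollback and sequence overflow are much easier to test when the generator reads time from an injectable clock. The following is a minimal sketch of that idea, not the implementation of any library mentioned here. The class name, the combined 10‑bit node ID, and the choice to throw on rollback are assumptions, and epoch subtraction is omitted for brevity.

```java
import java.util.function.LongSupplier;

// Sketch of a generator with an injectable clock so tests can simulate time.
// Names and the throw-on-rollback policy are illustrative assumptions.
final class TestableSnowflake {
    private static final long SEQUENCE_MASK = 0xFFF; // 12-bit sequence
    private static final long MAX_NODE_ID = 1023;    // 10-bit node identifier

    private final LongSupplier clock; // returns milliseconds
    private final long nodeId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    TestableSnowflake(LongSupplier clock, long nodeId) {
        if (nodeId < 0 || nodeId > MAX_NODE_ID) {
            throw new IllegalArgumentException("nodeId out of range: " + nodeId);
        }
        this.clock = clock;
        this.nodeId = nodeId;
    }

    synchronized long nextId() {
        long now = clock.getAsLong();
        if (now < lastTimestamp) {
            // Clock rollback: refuse to generate rather than risk a duplicate.
            throw new IllegalStateException(
                    "Clock moved backwards by " + (lastTimestamp - now) + " ms");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted within this millisecond: wait for the next one.
                while ((now = clock.getAsLong()) <= lastTimestamp) {
                    Thread.yield();
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return (now << 22) | (nodeId << 12) | sequence;
    }

    public static void main(String[] args) {
        long[] fakeNow = {1_000L};
        TestableSnowflake generator = new TestableSnowflake(() -> fakeNow[0], 1);
        generator.nextId();
        fakeNow[0] = 999L; // simulate the clock stepping back by 1 ms
        try {
            generator.nextId();
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

With a fake clock, a test can hold time still to force 4096 IDs into one millisecond, or step time backwards, without waiting for either condition to occur naturally.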
Recommended Practices
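Use proven open‑source implementations such as Hutool or Baomidou (MyBatis‑Plus):
// Hutool example
import cn.hutool.core.lang.Snowflake;
import cn.hutool.core.util.IdUtil;
Snowflake snowflake = IdUtil.getSnowflake(1, 1); // workerId=1, dataCenterId=1
long orderId = snowflake.nextId();
// Baomidou example (supports automatic IP/MAC derivation or manual configuration)
import com.baomidou.mybatisplus.core.incrementer.DefaultIdentifierGenerator;
DefaultIdentifierGenerator generator = new DefaultIdentifierGenerator(1, 1); // workerId=1, dataCenterId=1
long userId = generator.nextId("user");
For medium to large systems, treat DataCenterId as the identifier of a data center or availability zone, and assign worker IDs using one of the following approaches: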
Simple approach: Manually set IDs via configuration files. Suitable for development or single‑node deployments.
Standard approach: Hash the concatenation of IP and port (or process ID) and take the modulus of the total worker count. This works without external services, but two instances can still hash to the same worker ID.
Intermediate approach: Use a service registry (e.g., Eureka, Nacos) to assign IDs during registration, coupling the ID with the service instance.
Advanced approach: Employ a coordination service like Redis or Zookeeper to allocate and recycle worker IDs dynamically, supporting scaling and conflict avoidance.
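The hash‑based approach in the list above can be sketched as follows. CRC32 is used here only because it ships with the JDK and is stable across processes; the hash function, the class name, and the 32‑slot worker count (a 5‑bit worker field) are assumptions rather than a prescribed method.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Sketch: derive a worker ID by hashing "ip:port" and reducing it to the worker range.
// CRC32 and MAX_WORKERS = 32 are illustrative choices.
final class HashedWorkerId {
    static final int MAX_WORKERS = 32; // 5-bit worker field

    static int workerIdFor(String ip, int port) {
        CRC32 crc = new CRC32();
        crc.update((ip + ":" + port).getBytes(StandardCharsets.UTF_8));
        // getValue() is a non-negative 32-bit value held in a long, so the modulus is in range.
        return (int) (crc.getValue() % MAX_WORKERS);
    }

    public static void main(String[] args) {
        System.out.println(workerIdFor("10.0.0.5", 8080));
        System.out.println(workerIdFor("10.0.0.6", 8080));
    }
}
```

With only 32 slots, distinct instances can easily land on the same worker ID, so this approach reduces the chance of collisions but does not remove it; the registry and coordination‑service approaches exist to close that gap.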
Additional Advice: Keep Business Data Separate from IDs
Embedding business semantics (prefixes, module codes) into IDs introduces several problems:
IDs become non‑numeric, losing time‑ordered sorting and hurting index performance.
Variable or excessive length increases storage cost and degrades log readability.
Changing business meanings can break compatibility.
Store business attributes in separate columns and let the ID remain a pure, monotonic identifier.
Conclusion
Do not reinvent well‑tested components like Snowflake; rely on battle‑tested libraries and understand their configuration to prevent ID collisions in production systems.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
