Why Our Custom Snowflake ID Generator Failed and How to Build a Reliable One
A recent production outage caused by duplicate order IDs revealed critical flaws in a home‑grown Snowflake‑style ID generator, prompting a detailed review of the standard algorithm, an analysis of the custom implementation’s mistakes, and a set of best‑practice recommendations for safe ID generation in distributed systems.
1. Standard Snowflake Algorithm
The classic Snowflake ID is a 64‑bit integer composed of:
+----------------------------------------------------------------------------------------------------+
| 1 Bit | 41 Bits Timestamp | 5 Bits DataCenter ID | 5 Bits Machine ID | 12 Bits Sequence |
+----------------------------------------------------------------------------------------------------+
1 Bit Sign: Always 0 to ensure a positive number.
41 Bits Timestamp: Millisecond offset from a fixed epoch, supporting about 69 years.
5 Bits DataCenter ID: Distinguishes different data centers.
5 Bits Machine ID: Identifies individual nodes.
12 Bits Sequence: Allows up to 4096 IDs per millisecond on the same node.
Advantages: high‑performance, time‑ordered, globally unique IDs suitable for distributed environments.
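The layout above maps directly onto shifts and masks. The following is a minimal sketch of packing and unpacking the fields, not a full generator. The class name `SnowflakeLayout`, its method names, and the 2020-01-01 UTC epoch are assumptions made for illustration only.

```java
public final class SnowflakeLayout {
    // Field widths taken from the layout table.
    public static final int TIMESTAMP_BITS = 41;
    public static final int DATACENTER_BITS = 5;
    public static final int MACHINE_BITS = 5;
    public static final int SEQUENCE_BITS = 12;

    // Each field is shifted past every field to its right.
    public static final int MACHINE_SHIFT = SEQUENCE_BITS;                         // 12
    public static final int DATACENTER_SHIFT = MACHINE_SHIFT + MACHINE_BITS;       // 17
    public static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS;  // 22

    // Assumption for illustration: epoch = 2020-01-01T00:00:00Z.
    public static final long EPOCH_MS = 1577836800000L;

    private static long maxValue(int bits) {
        return (1L << bits) - 1;
    }

    private static void requireFits(String name, long value, int bits) {
        if (value < 0 || value > maxValue(bits)) {
            throw new IllegalArgumentException(name + " does not fit in " + bits + " bits: " + value);
        }
    }

    /** Packs the four fields into one 64-bit ID; the sign bit stays 0. */
    public static long compose(long timestampMs, long datacenterId, long machineId, long sequence) {
        long delta = timestampMs - EPOCH_MS;
        requireFits("timestamp delta", delta, TIMESTAMP_BITS);
        requireFits("datacenterId", datacenterId, DATACENTER_BITS);
        requireFits("machineId", machineId, MACHINE_BITS);
        requireFits("sequence", sequence, SEQUENCE_BITS);
        return (delta << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (machineId << MACHINE_SHIFT)
                | sequence;
    }

    public static long timestampOf(long id) {
        return (id >>> TIMESTAMP_SHIFT) + EPOCH_MS;
    }

    public static long datacenterOf(long id) {
        return (id >>> DATACENTER_SHIFT) & maxValue(DATACENTER_BITS);
    }

    public static long machineOf(long id) {
        return (id >>> MACHINE_SHIFT) & maxValue(MACHINE_BITS);
    }

    public static long sequenceOf(long id) {
        return id & maxValue(SEQUENCE_BITS);
    }

    public static void main(String[] args) {
        long id = compose(EPOCH_MS + 1000, 3, 7, 42);
        System.out.println(id + " -> dc=" + datacenterOf(id)
                + " machine=" + machineOf(id) + " seq=" + sequenceOf(id));
    }
}
```

Rejecting values that do not fit their field, rather than silently truncating them, is a deliberate choice in this sketch: silent truncation is the kind of failure described in the next section.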
2. Our "Custom" Snowflake Implementation: What Went Wrong?
The in‑house second‑party package (an internal shared library) used the following layout (inferred from debugging):
+----------------------------------------------------------------------------------------------------+
| 31 Bits TimestampDelta | 13 Bits DataCenter ID | 4 Bits Work ID | 8 Bits Business ID | 8 Bits Sequence |
+----------------------------------------------------------------------------------------------------+
2.1 Timestamp Only 31 Bits – Supports Just 24.85 Days
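The timestamp delta is shifted left by 33 bits to make room for the other fields (13 + 4 + 8 + 8 = 33), so only its low 31 bits survive in the 64‑bit ID.
Once the millisecond count exceeds 2^31 (2,147,483,648 ms, about 24.85 days), the timestamp field wraps around and starts repeating earlier values.
With a custom epoch starting in 2018, the field had already wrapped on the order of a hundred times by 2025.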
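The arithmetic behind the 24.85‑day figure can be checked with a short sketch. The exact epoch is an assumption: the incident description only says "2018", so 2018-01-01 UTC is used here for illustration, and the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

public final class TimestampWrapDemo {
    public static final int TIMESTAMP_BITS = 31;
    public static final int SHIFT = 64 - TIMESTAMP_BITS;            // 33
    public static final long WRAP_PERIOD_MS = 1L << TIMESTAMP_BITS; // 2,147,483,648 ms

    // Assumed epoch for illustration; the source only says "2018".
    public static final Instant EPOCH = Instant.parse("2018-01-01T00:00:00Z");

    /** Length of one wrap period in days. */
    public static double wrapPeriodDays() {
        return WRAP_PERIOD_MS / (double) Duration.ofDays(1).toMillis();
    }

    /** The bits that survive a 33-bit left shift: the low 31 bits of the delta. */
    public static long truncatedDelta(Instant now) {
        long delta = Duration.between(EPOCH, now).toMillis();
        return (delta << SHIFT) >>> SHIFT;
    }

    /** How many times the 31-bit field has wrapped since the epoch. */
    public static long wrapCount(Instant now) {
        return Duration.between(EPOCH, now).toMillis() / WRAP_PERIOD_MS;
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2025-01-01T00:00:00Z");
        System.out.printf("wrap period: %.3f days%n", wrapPeriodDays());
        System.out.println("wraps by 2025-01-01: " + wrapCount(t));
        // Two instants exactly one wrap period apart produce the same timestamp field.
        Instant later = t.plusMillis(WRAP_PERIOD_MS);
        System.out.println("same field: " + (truncatedDelta(t) == truncatedDelta(later)));
    }
}
```

Under the assumed epoch, the field has wrapped 102 times by the start of 2025. Any two moments that are a multiple of 2^31 ms apart produce identical timestamp bits.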
2.2 Business ID Derived from the Last Octet of the IP
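The 8‑bit Business ID was taken from the final segment of the machine's IPv4 address (e.g., the "1" in 192.168.0.1). Hosts in different subnets routinely share a last octet (192.168.0.1 and 192.168.1.1 both yield 1), so the field does not reliably distinguish machines.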
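A small sketch makes the collision concrete. The parsing helper below is hypothetical and is not taken from the in‑house package; it only mirrors the "last octet" rule described above.

```java
public final class LastOctetCollisionDemo {
    /** Returns the final octet of a dotted-quad IPv4 string. */
    public static int lastOctet(String ipv4) {
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not a dotted-quad IPv4 address: " + ipv4);
        }
        int value = Integer.parseInt(parts[3]);
        if (value < 0 || value > 255) {
            throw new IllegalArgumentException("octet out of range: " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        // Three distinct hosts, one identical "Business ID".
        String[] hosts = {"192.168.0.1", "192.168.1.1", "10.0.5.1"};
        for (String host : hosts) {
            System.out.println(host + " -> " + lastOctet(host));
        }
    }
}
```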
2.3 Work ID and DataCenter ID Left at Zero
All instances shared the same node identifiers, effectively nullifying the uniqueness guarantees.
Result: timestamp wrap‑around, colliding IP‑derived Business IDs, and reuse of the 8‑bit sequence combined to produce fully identical IDs.
3. Lessons Learned
Avoid reinventing well‑tested components: Clock rollback, bit‑wise operations, and distributed coordination are error‑prone; mature libraries are safer.
Never trust third‑party packages blindly: Review the implementation and understand its uniqueness guarantees.
Configure machine IDs properly: Do not rely on fragile IP suffixes; allocate Worker ID and DataCenter ID centrally.
Test edge cases early: Simulate long‑running operation, sequence overflow, and clock rollback to ensure robustness.
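One way to make these edge cases testable is to inject the clock, so a test can freeze it or move it backwards. The sketch below is an illustration under stated assumptions, not the implementation used by any particular library: it merges the datacenter and machine fields into one 10‑bit node ID, and it throws on clock rollback (other generators choose to wait instead). All names are hypothetical.

```java
import java.util.function.LongSupplier;

public final class TestableSnowflake {
    private static final int SEQUENCE_BITS = 12;
    private static final int NODE_BITS = 10; // datacenter (5) + machine (5), merged for brevity
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    private final LongSupplier clock; // injected so tests can control time
    private final long epochMs;
    private final long nodeId;
    private long lastMs = -1L;
    private long sequence = 0L;

    public TestableSnowflake(LongSupplier clock, long epochMs, long nodeId) {
        if (nodeId < 0 || nodeId >= (1L << NODE_BITS)) {
            throw new IllegalArgumentException("nodeId out of range: " + nodeId);
        }
        this.clock = clock;
        this.epochMs = epochMs;
        this.nodeId = nodeId;
    }

    public synchronized long nextId() {
        long now = clock.getAsLong();
        if (now < lastMs) {
            // Clock rollback: refuse to issue IDs that could repeat earlier ones.
            throw new IllegalStateException("clock moved backwards by " + (lastMs - now) + " ms");
        }
        if (now == lastMs) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // All 4096 values used in this millisecond: spin until the next one.
                while (now <= lastMs) {
                    now = clock.getAsLong();
                }
            }
        } else {
            sequence = 0L;
        }
        lastMs = now;
        return ((now - epochMs) << (NODE_BITS + SEQUENCE_BITS))
                | (nodeId << SEQUENCE_BITS)
                | sequence;
    }

    public static void main(String[] args) {
        long[] fakeNow = {1_000L};
        TestableSnowflake generator = new TestableSnowflake(() -> fakeNow[0], 0L, 1L);
        generator.nextId();
        fakeNow[0] = 999L; // simulate a 1 ms clock rollback
        try {
            generator.nextId();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In production the clock would typically be `System::currentTimeMillis`. In tests, a fake supplier can hold time still to force sequence overflow, or step it backwards to exercise the rollback path.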
4. Recommended Practices
Adopt proven open‑source implementations such as Hutool or Baomidou:
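import cn.hutool.core.lang.Snowflake;
import cn.hutool.core.util.IdUtil;
import com.baomidou.mybatisplus.core.incrementer.DefaultIdentifierGenerator;

// Hutool example: IdUtil.getSnowflake(workerId, datacenterId)
Snowflake snowflake = IdUtil.getSnowflake(1, 1);
long hutoolId = snowflake.nextId();
// Baomidou (MyBatis-Plus) example (supports automatic IP/MAC derivation or manual configuration)
DefaultIdentifierGenerator generator = new DefaultIdentifierGenerator(1, 1); // workerId=1, dataCenterId=1
long mybatisPlusId = generator.nextId("user");
For medium‑to‑large systems, use the DataCenter ID to represent different data centers or availability zones.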
Worker ID assignment strategies can evolve with system scale:
Simple: Manually set via configuration files – suitable for development or single‑node deployments.
Standard: Hash the concatenation of IP and port (or process ID) and take the result modulo the number of available Worker IDs; this needs no external dependencies, but two instances can still hash to the same value.
Intermediate: Leverage a service registry (e.g., Eureka, Nacos) to allocate IDs during service registration.
Advanced: Use centralized coordinators like Redis or Zookeeper for dynamic allocation, release, and conflict avoidance.
Gradually introduce more sophisticated mechanisms as the system grows, avoiding premature over‑engineering.
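The hash‑based "Standard" strategy can be sketched as follows. CRC32 is used only because it ships with the JDK and is stable across JVMs; the choice of hash, the `ip:port` key format, and the 32‑slot Worker ID space (5 bits, as in the standard layout) are assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public final class WorkerIdHasher {
    // 5-bit Worker ID space, as in the standard layout.
    public static final int WORKER_ID_COUNT = 32;

    /** Derives a Worker ID in [0, WORKER_ID_COUNT) from the instance's IP and port. */
    public static long workerIdFor(String ip, int port) {
        CRC32 crc = new CRC32();
        crc.update((ip + ":" + port).getBytes(StandardCharsets.UTF_8));
        // getValue() is a non-negative 32-bit value held in a long, so % is safe here.
        return crc.getValue() % WORKER_ID_COUNT;
    }

    public static void main(String[] args) {
        System.out.println(workerIdFor("10.0.0.5", 8080));
        System.out.println(workerIdFor("10.0.0.6", 8080));
    }
}
```

With only 32 slots, two instances can easily map to the same Worker ID. The hash approach therefore lowers the chance of collision without eliminating it, which is what the registry‑based and coordinator‑based strategies address.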
5. Other Advice: Do Not Embed Business Information in IDs
Appending business prefixes or module codes to IDs leads to several problems:
IDs become non‑numeric, losing natural time‑order and harming database index performance.
Variable length or overly long IDs increase storage costs and degrade log readability.
Changing business semantics later can break compatibility.
Prefer storing business attributes separately; let the ID remain a pure, unique, sortable identifier.
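The loss of natural ordering is easy to reproduce. In this sketch the "ORD-" prefix and the sample values are made up for illustration; the point is only that variable‑length string IDs sort lexicographically, not in creation order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public final class PrefixedIdOrderingDemo {
    /** Sorts IDs the way a string-typed index would: lexicographically. */
    public static List<String> sortedAsStrings(List<String> ids) {
        List<String> copy = new ArrayList<>(ids);
        Collections.sort(copy);
        return copy;
    }

    /** Sorts IDs the way a numeric index would. */
    public static List<Long> sortedAsNumbers(List<Long> ids) {
        List<Long> copy = new ArrayList<>(ids);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Creation order: 9, then 10, then 100.
        System.out.println(sortedAsStrings(Arrays.asList("ORD-9", "ORD-10", "ORD-100")));
        // prints [ORD-10, ORD-100, ORD-9]
        System.out.println(sortedAsNumbers(Arrays.asList(9L, 10L, 100L)));
        // prints [9, 10, 100]
    }
}
```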
6. Conclusion
Do not reinvent generic components without thorough understanding; rely on battle‑tested libraries and follow disciplined ID design to prevent costly production incidents.
