Why Our Custom Snowflake ID Failed and How to Build Reliable IDs

A recent production incident revealed duplicate order IDs caused by a flawed custom Snowflake generator, prompting a deep dive into the standard algorithm, the mistakes in the bespoke version, and practical recommendations for using proven implementations and proper machine‑ID configuration.

Selected Java Interview Questions

Jul 1, 2025

Recently our online system suffered a serious incident: duplicate order/transaction IDs disrupted core business processes. The root cause was a self‑developed Snowflake ID generator.

1. Standard Snowflake Algorithm

The standard Snowflake ID is a 64‑bit integer composed of:

+----------------------------------------------------------------------------------------------------+
| 1 Bit | 41 Bits Timestamp | 5 Bits DataCenter ID | 5 Bits Machine ID | 12 Bits Sequence | 
+----------------------------------------------------------------------------------------------------+

1 Bit Sign: always 0 to ensure positive numbers.

41 Bits Timestamp: milliseconds since a custom epoch, supporting about 69 years.

10 Bits Node ID: 5 bits of DataCenter ID plus 5 bits of Machine ID, identifying up to 1,024 distinct nodes.

12 Bits Sequence: allows up to 4096 IDs per millisecond on each node.

Advantages: high‑performance, time‑ordered unique IDs suitable for distributed environments.
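The layout above can be sketched in a few lines of Java. This is a minimal illustration of the bit arithmetic only, not the code of any particular library; the class and method names (SnowflakeLayout, compose, and so on) are made up for this example.

```java
/** Sketch of the standard 64-bit layout: 1 sign + 41 timestamp + 5 + 5 + 12 bits. */
final class SnowflakeLayout {
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_BITS = 5;
    static final int DATACENTER_BITS = 5;
    static final int TIMESTAMP_BITS = 41;

    static final int WORKER_SHIFT = SEQUENCE_BITS;                         // 12
    static final int DATACENTER_SHIFT = WORKER_SHIFT + WORKER_BITS;        // 17
    static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS; // 22

    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;     // 4095
    static final long MAX_WORKER = (1L << WORKER_BITS) - 1;         // 31
    static final long MAX_DATACENTER = (1L << DATACENTER_BITS) - 1; // 31
    static final long MAX_TIMESTAMP = (1L << TIMESTAMP_BITS) - 1;

    /** Packs the four fields into one long; range checks keep the sign bit at 0. */
    static long compose(long millisSinceEpoch, long datacenterId, long workerId, long sequence) {
        if (millisSinceEpoch < 0 || millisSinceEpoch > MAX_TIMESTAMP
                || datacenterId < 0 || datacenterId > MAX_DATACENTER
                || workerId < 0 || workerId > MAX_WORKER
                || sequence < 0 || sequence > MAX_SEQUENCE) {
            throw new IllegalArgumentException("field out of range");
        }
        return (millisSinceEpoch << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }

    static long timestampOf(long id)  { return id >>> TIMESTAMP_SHIFT; }
    static long datacenterOf(long id) { return (id >>> DATACENTER_SHIFT) & MAX_DATACENTER; }
    static long workerOf(long id)     { return (id >>> WORKER_SHIFT) & MAX_WORKER; }
    static long sequenceOf(long id)   { return id & MAX_SEQUENCE; }

    /** Years that 41 bits of milliseconds can cover (a little under 70). */
    static double timestampRangeYears() {
        return (double) (1L << TIMESTAMP_BITS) / (1000.0 * 60 * 60 * 24 * 365);
    }

    public static void main(String[] args) {
        long id = compose(123_456_789L, 3, 7, 42);
        System.out.println(id + " -> ts=" + timestampOf(id) + " dc=" + datacenterOf(id)
                + " worker=" + workerOf(id) + " seq=" + sequenceOf(id));
        System.out.printf("41-bit range: %.1f years%n", timestampRangeYears());
    }
}
```

Because every field is range-checked before it is shifted into place, no field can spill into its neighbour, and the timestamp in the high bits keeps the IDs ordered by time.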

2. Our "Custom" Snowflake: What Went Wrong?

Our investigation found that the custom generator used the following layout:

+----------------------------------------------------------------------------------------------------+
| 31 Bits Timestamp Delta | 13 Bits DataCenter ID | 4 Bits Work ID | 8 Bits Business ID | 8 Bits Sequence |
+----------------------------------------------------------------------------------------------------+

Although it appears richer, it has critical flaws:

1. Timestamp only 31 bits – supports at most 24.85 days

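The timestamp is left-shifted by 33 bits to make room for the other fields (13 + 4 + 8 + 8), so only 31 of the 64 bits remain for it.

Once the millisecond count reaches 2^31 (2,147,483,648 ms, about 24.85 days), the high bits are shifted out and the timestamp field wraps around to values it has already produced.

Our custom epoch started in 2018, so by 2025 the timestamp had already wrapped roughly a hundred times.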
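The wrap-around can be checked with plain arithmetic. The sketch below assumes the timestamp was placed with a simple left shift by 33 bits, as described above. The class name is hypothetical, and the 2,738-day figure assumes an epoch of 2018-01-01 and a date of 2025-07-01, since the article does not give the exact epoch.

```java
/** Shows why a 31-bit millisecond field repeats about every 24.85 days. */
final class TimestampWrapDemo {
    static final int TIMESTAMP_BITS = 31;
    static final int TIMESTAMP_SHIFT = 33;
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    /** What ends up in the ID: bits above the low 31 fall off the top of the long. */
    static long shiftedTimestamp(long millisSinceEpoch) {
        return millisSinceEpoch << TIMESTAMP_SHIFT;
    }

    /** Length of one full cycle of the 31-bit field, in days. */
    static double wrapPeriodDays() {
        return (double) (1L << TIMESTAMP_BITS) / MILLIS_PER_DAY;
    }

    /** How many times the field has already wrapped after the given elapsed time. */
    static long completedWraps(long millisSinceEpoch) {
        return millisSinceEpoch >>> TIMESTAMP_BITS;
    }

    public static void main(String[] args) {
        long t = 1_000L;
        long later = t + (1L << TIMESTAMP_BITS); // exactly one cycle later
        System.out.println("same field: " + (shiftedTimestamp(t) == shiftedTimestamp(later)));
        System.out.printf("wrap period: %.2f days%n", wrapPeriodDays());
        // Assumed span: 2018-01-01 to 2025-07-01 = 2,738 days.
        System.out.println("wraps so far: " + completedWraps(2_738L * MILLIS_PER_DAY));
    }
}
```

Two moments exactly 2^31 ms apart produce identical timestamp bits, so uniqueness then depends entirely on the remaining fields.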

2. Business ID uses the last octet of the IP

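Using the final segment of an IP (e.g., the "1" in 192.168.0.1) collides easily: any two hosts whose addresses end in the same number, such as 192.168.0.1 and 192.168.1.1, get the same Business ID.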

3. Work ID and DataCenter ID were left at 0

All instances shared the same node identifiers, nullifying uniqueness.

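Result: with the node identifiers identical on every instance, the timestamp wrapping around, and last octets colliding, any two IDs that also shared the same 8-bit sequence value were complete duplicates.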
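Putting the three flaws together, a full collision is easy to reproduce. The following is a reconstruction from the field widths listed above, not the original production code; the class name, the helper lastOctet, and the sample IP addresses are illustrative assumptions.

```java
/** Reconstruction of the flawed layout: 31 + 13 + 4 + 8 + 8 bits. */
final class CustomLayoutCollisionDemo {
    static final int SEQUENCE_BITS = 8;
    static final int BUSINESS_BITS = 8;
    static final int WORKER_BITS = 4;
    static final int DATACENTER_BITS = 13;

    static final int BUSINESS_SHIFT = SEQUENCE_BITS;                       // 8
    static final int WORKER_SHIFT = BUSINESS_SHIFT + BUSINESS_BITS;        // 16
    static final int DATACENTER_SHIFT = WORKER_SHIFT + WORKER_BITS;        // 20
    static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS; // 33

    /** Last octet of a dotted IPv4 string, e.g. "192.168.0.1" gives 1. */
    static long lastOctet(String ipv4) {
        return Long.parseLong(ipv4.substring(ipv4.lastIndexOf('.') + 1));
    }

    /** Packs the fields without any range checks, so the timestamp silently loses its high bits. */
    static long compose(long millisSinceEpoch, long datacenterId, long workerId,
                        String ipv4, long sequence) {
        return (millisSinceEpoch << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | (lastOctet(ipv4) << BUSINESS_SHIFT)
                | sequence;
    }

    public static void main(String[] args) {
        long oneCycle = 1L << 31; // about 24.85 days in milliseconds
        // Two different hosts, different points in time, DataCenter ID and Work ID both 0.
        long a = compose(1_000L, 0, 0, "192.168.0.1", 5);
        long b = compose(1_000L + oneCycle, 0, 0, "10.0.3.1", 5);
        System.out.println("duplicate: " + (a == b));
    }
}
```

Changing any one of the shared inputs, for example giving the second host a distinct Work ID, is enough to make the two IDs differ, which is why each flaw on its own might have gone unnoticed.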

3. Lessons Learned

Do not reinvent generic components

Snowflake involves clock rollback handling, bit manipulation, and distributed coordination; mature libraries are more reliable.

Never trust third‑party packages blindly

Always review the implementation and understand how uniqueness is guaranteed.

Configure machine IDs properly

Using only the IP suffix is fragile; plan and assign Worker ID and DataCenter ID centrally.

Cover edge cases early

Simulate long-running operation (including timestamp wrap-around), sequence overflow, and clock-backward scenarios before going live to ensure robustness.
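Edge cases like these become testable once the clock is injectable. The sketch below is a deliberately small generator, not a replacement for a mature library: it rejects clock rollback outright and waits for the next millisecond on sequence overflow. GuardedSnowflake, its constructor parameters, and the epoch value in main are assumptions made for this example.

```java
import java.util.function.LongSupplier;

/** Minimal generator with an injectable clock so edge cases can be simulated. */
final class GuardedSnowflake {
    private static final long SEQUENCE_MASK = (1L << 12) - 1; // 4095
    private static final int NODE_SHIFT = 12;
    private static final int TIMESTAMP_SHIFT = 22;

    private final long epochMillis;
    private final long nodeId;        // combined 10-bit DataCenter ID + Worker ID
    private final LongSupplier clock; // returns current time in milliseconds

    private long lastTimestamp = -1L;
    private long sequence = 0L;

    GuardedSnowflake(long epochMillis, long nodeId, LongSupplier clock) {
        if (nodeId < 0 || nodeId > 1023) {
            throw new IllegalArgumentException("nodeId must be in [0, 1023]");
        }
        this.epochMillis = epochMillis;
        this.nodeId = nodeId;
        this.clock = clock;
    }

    synchronized long nextId() {
        long now = clock.getAsLong();
        if (now < lastTimestamp) {
            // Clock moved backwards: refuse rather than risk reissuing an old ID.
            throw new IllegalStateException(
                    "clock moved backwards by " + (lastTimestamp - now) + " ms");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // All 4096 values used in this millisecond: wait for the clock to advance.
                while ((now = clock.getAsLong()) <= lastTimestamp) {
                    // busy-wait; at most about one millisecond with a real clock
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return ((now - epochMillis) << TIMESTAMP_SHIFT) | (nodeId << NODE_SHIFT) | sequence;
    }

    public static void main(String[] args) {
        // Hypothetical custom epoch, chosen only for this demo.
        GuardedSnowflake generator =
                new GuardedSnowflake(1_700_000_000_000L, 1, System::currentTimeMillis);
        System.out.println(generator.nextId());
    }
}
```

In a test, passing a lambda that returns scripted timestamps makes it possible to replay a rollback or exhaust the sequence within a single simulated millisecond. Production libraries often tolerate small rollbacks by waiting instead of throwing; that choice is left out here for brevity.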

4. Recommended Practices

Adopt proven open-source implementations such as Hutool or Baomidou's MyBatis-Plus:

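import cn.hutool.core.lang.Snowflake;
import cn.hutool.core.util.IdUtil;
import com.baomidou.mybatisplus.core.incrementer.DefaultIdentifierGenerator;

// Hutool example
Snowflake snowflake = IdUtil.getSnowflake(1, 1); // workerId=1, datacenterId=1
long hutoolId = snowflake.nextId();

// Baomidou example (supports automatic IP/MAC derivation or manual setting)
DefaultIdentifierGenerator generator = new DefaultIdentifierGenerator(1, 1); // workerId=1, dataCenterId=1
long baomidouId = generator.nextId("user"); // the argument is the entity object the ID is generated for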

For medium‑to‑large systems, DataCenter ID typically identifies different data centers or availability zones.

Worker ID assignment strategies can evolve with system scale:

Simple: manually set in configuration files – easy for development or single‑node deployments.

Standard: hash IP + port (or process ID) and take the result modulo the Worker ID range – works for small to medium deployments without external services, although two nodes can still hash to the same ID.

Intermediate: use a service registry (e.g., Eureka, Nacos) to allocate IDs during registration.

Advanced: employ centralized coordinators like Redis or Zookeeper for dynamic allocation and release, supporting horizontal scaling.

Start with the simplest strategy that fits, and adopt the more sophisticated mechanisms only as the system grows; this avoids over-engineering early on.
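Two of these strategies can be sketched without external dependencies. The first class derives a Worker ID by hashing host and port. The second is an in-memory stand-in for a coordinator such as Redis or Zookeeper, showing only the acquire/release contract; a real deployment would also need leases or ephemeral nodes so that crashed instances give their IDs back. All names here are hypothetical.

```java
import java.util.BitSet;

/** "Standard" tier: derive a Worker ID from host and port, no external service needed. */
final class HashedWorkerId {
    static final int WORKER_ID_RANGE = 32; // 5-bit Worker ID: 0..31

    static int of(String host, int port) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod((host + ":" + port).hashCode(), WORKER_ID_RANGE);
    }
}

/** "Advanced" tier, simulated in memory: hand out the lowest free ID, reclaim on release. */
final class InMemoryWorkerIdAllocator {
    private final int range;
    private final BitSet inUse;

    InMemoryWorkerIdAllocator(int range) {
        this.range = range;
        this.inUse = new BitSet(range);
    }

    /** Claims the lowest unused Worker ID, or fails when the range is exhausted. */
    synchronized int acquire() {
        int id = inUse.nextClearBit(0);
        if (id >= range) {
            throw new IllegalStateException("all " + range + " worker IDs are in use");
        }
        inUse.set(id);
        return id;
    }

    /** Returns an ID to the pool, e.g. when an instance shuts down. */
    synchronized void release(int id) {
        inUse.clear(id);
    }
}
```

The hashed variant is stateless but only probabilistically unique: with 32 possible values, any fleet of 33 or more instances is guaranteed at least one collision, and smaller fleets can collide by chance. The allocator never hands out the same ID twice at once, which is the property the registry-based and coordinator-based strategies provide.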

5. Additional Advice: Keep Business Flags Out of IDs

Embedding business information (type prefixes, module codes) into IDs makes them non‑numeric, breaks time‑ordered sorting, inflates storage, and creates compatibility issues when business rules change. Store such metadata separately; let the ID remain a pure unique identifier.

6. Conclusion

Do not reinvent the wheel for common components; rely on battle‑tested libraries and understand their internals to prevent costly failures.
