
Why Did My Snowflake IDs Collide? Lessons and Fixes for Distributed Systems

An unexpected primary-key duplicate error in a low-traffic Spring Cloud app revealed that multiple servers shared the same Snowflake workId, causing ID collisions; the article explains Snowflake's structure, its pros and cons, and offers three practical methods—including IP-based calculation, environment variables, and middleware—to ensure globally unique workIds.

Selected Java Interview Questions

Background: A bizarre primary-key duplicate incident

One day the production logs began intermittently reporting "primary key duplicate" errors. The scenario seemed simple: an app continuously uploads data, with fewer than 10,000 users and very low concurrency, involving only single-table inserts.

However, this simple insert led the team into a deep investigation.

Problem identification

The project uses Spring Cloud + MybatisPlus, with Snowflake IDs as the default primary key. In production, a distributed cluster (machines A/B/C) was deployed without configuring a unique workId.

Conclusion: Multiple machines shared the same workId, causing Snowflake ID collisions.

Knowledge Card: What is a Snowflake ID?

Core principle

Snowflake is Twitter’s open-source distributed ID generation algorithm that creates a 64-bit long ID with the following structure:

0 (sign bit) | timestamp (41 bits) | datacenter ID (5 bits) | machine ID (5 bits) | sequence (12 bits)
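The layout above can be sketched as a bit-packing routine. This is a minimal illustration, not the MybatisPlus internals; the class name and constants are hypothetical.

```java
// Hypothetical sketch of packing the four Snowflake fields into one 64-bit long.
public class SnowflakeLayout {
    // Bit widths from the structure above: 41 + 5 + 5 + 12 = 63 bits (+ sign bit)
    static final long SEQUENCE_BITS = 12;
    static final long MACHINE_BITS = 5;
    static final long DATACENTER_BITS = 5;

    // Shift each field into its slot and OR them together
    public static long compose(long timestamp, long datacenterId, long machineId, long sequence) {
        return (timestamp << (DATACENTER_BITS + MACHINE_BITS + SEQUENCE_BITS))
                | (datacenterId << (MACHINE_BITS + SEQUENCE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }
}
```

Because the timestamp occupies the highest bits, IDs generated later always compare greater, which is what gives Snowflake its rough time-ordering.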

Advantages

High performance: each node can generate over 260,000 IDs per second in practice (the 12-bit sequence allows a theoretical ceiling of 4,096 IDs per millisecond).

Trend-increasing: IDs are roughly time-ordered, which is friendly to database indexes.

Decentralized: no external service such as Redis or Zookeeper is required.

Critical drawbacks

Clock rollback: server time reversal can cause duplicate IDs.

Machine-ID conflict: identical workId in a distributed environment inevitably leads to ID duplication.

Non-continuous IDs: high concurrency may produce “gaps”.
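The clock-rollback drawback above is usually handled by remembering the last timestamp and refusing to issue IDs when time moves backwards. A minimal sketch, with illustrative names (a real generator would also manage the sequence counter):

```java
// Minimal clock-rollback guard: track the last timestamp seen and
// fail fast if the system clock goes backwards.
public class RollbackGuard {
    private long lastTimestamp = -1L;

    public synchronized long nextTimestamp(long now) {
        if (now < lastTimestamp) {
            // Issuing an ID now could collide with one already generated
            throw new IllegalStateException(
                "Clock moved backwards by " + (lastTimestamp - now) + " ms");
        }
        lastTimestamp = now;
        return now;
    }
}
```

Production implementations often wait out small rollbacks (a few milliseconds) instead of throwing; failing fast is the simplest safe default.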

Avoiding pitfalls: How to guarantee a globally unique workId?

Solution 1: IP-based dynamic calculation (recommended)

Derive workId from the last segment of the server’s IP address:

// Example: derive workId from the last segment of the local IP
// (InetAddress.getLocalHost() may throw UnknownHostException)
String hostAddress = InetAddress.getLocalHost().getHostAddress();
int ipLastSegment = Integer.parseInt(hostAddress.split("\\.")[3]);
return ipLastSegment % 32; // 5-bit workId must stay within 0-31

Pros: No manual intervention; IP is naturally unique.

Note: Ensure the IP's last segment is unique across the cluster (it can repeat across subnets, and the modulo can map distinct segments to the same workId); best suited to static-IP environments.

Solution 2: Environment-variable injection (Docker-friendly)

Pass workId through startup commands:

# Docker run example
docker run -e WORKER_ID=2 -e DATACENTER_ID=1 your-service-image

Applicable scenario: Containerized deployments with flexible control.

Solution 3: Middleware hosting (high-availability)

Maintain a workId mapping table in Redis or a configuration center such as Nacos:

Key: service-name@ip → Value: workId

Pros: Suitable for dynamic scaling and avoids issues caused by IP changes.
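The mapping table above can be sketched with an in-memory allocator; this is a stand-in for the real middleware, where the map would be replaced by Redis or Nacos (for example, an atomic counter plus a per-node key). The class and method names are illustrative.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// In-memory stand-in for the Redis/Nacos workId mapping table.
public class WorkIdRegistry {
    // Key: service-name@ip -> Value: workId
    private final ConcurrentHashMap<String, Long> assigned = new ConcurrentHashMap<>();
    private final AtomicLong counter = new AtomicLong();

    // The same node always gets the same workId back; new nodes get the next free one
    public long register(String serviceName, String ip) {
        return assigned.computeIfAbsent(serviceName + "@" + ip, k -> {
            long id = counter.getAndIncrement();
            if (id > 31) throw new IllegalStateException("workId pool (0-31) exhausted");
            return id;
        });
    }
}
```

Keying by service-name@ip makes registration idempotent: a restarted node re-reads its old workId instead of consuming a new one.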

Practical code: MybatisPlus dynamic workId configuration

@Slf4j // Lombok annotation supplying the log field used below
@Configuration
public class MybatisPlusConfig {
    @Bean
    public IdentifierGenerator identifierGenerator() {
        return new DefaultIdentifierGenerator(getWorkerId(), getDatacenterId());
    }

    // Core logic: prefer the environment variable, then fall back to IP calculation
    private long getWorkerId() {
        try {
            String workerIdStr = System.getenv("WORKER_ID");
            if (workerIdStr != null) return Long.parseLong(workerIdStr);
            String hostAddress = InetAddress.getLocalHost().getHostAddress();
            int ipLastSegment = Integer.parseInt(hostAddress.split("\\.")[3]);
            return ipLastSegment % 32; // 5-bit workerId must stay within 0-31
        } catch (Exception e) {
            log.error("Failed to resolve workerId, falling back to default 1", e);
            return 1L; // fallback
        }
    }
    // DataCenter ID logic omitted
}

Key points

Priority: environment variable > IP calculation > default fallback.

Emit log alerts when IP retrieval fails, since the fallback workId is shared and manual intervention is required.

Summary: Core principles of distributed ID design

Global uniqueness: workId must never repeat within the cluster.

Fault tolerance: strategies for clock rollback, IP changes, etc.

Observability: add logging at critical points such as workId generation.

Further thinking

If the service scale exceeds 32 nodes (the 5-bit workId limit), how can it be extended?

How to combine Leaf, UUID, and other schemes for multi-level disaster recovery?
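One possible answer to the first question above is to re-partition the bits: fold the 5-bit datacenter and 5-bit machine fields into a single 10-bit worker ID, raising the node limit from 32 to 1,024 without changing the overall 64-bit width. The layout below is a hypothetical sketch, not a standard:

```java
// Widened layout: 41-bit timestamp | 10-bit worker | 12-bit sequence
public class WideWorkerLayout {
    static final long SEQUENCE_BITS = 12;
    static final long WORKER_BITS = 10; // 2^10 = 1024 nodes

    public static long maxWorkers() {
        return 1L << WORKER_BITS;
    }

    public static long compose(long timestamp, long workerId, long sequence) {
        return (timestamp << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }
}
```

The trade-off is losing the explicit datacenter field; if datacenter identity matters, it can be encoded as the high bits of the worker ID by convention.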

Tags: distributed systems, Snowflake ID, Spring Cloud, MybatisPlus, ID collision, workId
Written by

Selected Java Interview Questions

A professional Java tech channel sharing common knowledge to help developers fill gaps. Follow us!
