Why Your Snowflake‑Like ID Generator May Duplicate IDs Under High Concurrency

The article analyzes a custom TraceIdGenerator that uses a simple AtomicInteger counter and IP‑based prefix, identifying how its reset logic and CAS competition can cause duplicate IDs in high‑concurrency scenarios, and proposes timestamp checks, larger ranges, retry mechanisms, and true Snowflake implementation as solutions.

Selected Java Interview Questions

In one project, log trace IDs are generated by a custom TraceIdGenerator that mimics Snowflake by concatenating the machine IP, the current timestamp, a sequence number produced by an AtomicInteger, and the process ID. During a code review, concerns were raised that the sequence may repeat under high concurrency.

Code Overview

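import java.util.concurrent.atomic.AtomicInteger;
// UUID.fastUUID() is not part of java.util.UUID; it presumably comes from Hutool.
import cn.hutool.core.lang.UUID;

public class TraceIdGenerator {
    // ... methods to obtain IP and PID omitted for brevity ...
    // Hex form of the host IP; stays "ffffffff" if the lookup below fails.
    private static String IP_16 = "ffffffff";
    // Shared sequence; getNextId() returns values in 1000..9001.
    private static AtomicInteger count = new AtomicInteger(1000);
    static {
        try {
            String ipAddress = getInetAddress();
            if (ipAddress != null) {
                IP_16 = getIP_16(ipAddress);
            }
        } catch (Throwable e) {
            // A lookup failure is swallowed; IP_16 keeps its default.
        }
    }
    // Concatenates IP, timestamp, sequence, and PID without separators.
    private static String getTraceId(String ip, long timestamp, int nextId) {
        return new StringBuilder(30)
                .append(ip).append(timestamp).append(nextId).append(getPID())
                .toString();
    }
    public static String generate() {
        try {
            // The timestamp is read first, then the sequence; nothing ties the two together.
            return getTraceId(IP_16, System.currentTimeMillis(), getNextId());
        } catch (Throwable e) {
            return UUID.fastUUID().toString();
        }
    }
    private static int getNextId() {
        for (;;) {
            int current = count.get();
            // Wraps to 1000 once the counter has passed 9000, whatever the clock says.
            int next = (current > 9000) ? 1000 : current + 1;
            if (count.compareAndSet(current, next)) {
                return next;
            }
        }
    }
}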

Root Causes of Duplicate IDs

1. Reset Mechanism

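The counter cycles through 1000 to 9001 (the reset fires only once current exceeds 9000), so one cycle holds 8,002 values. At the wrap point several threads may read 9001 and all compute next = 1000, but compareAndSet lets only one of them succeed; the others re-read the counter and move on to 1001, 1002, and so on. The reset is therefore not racy in itself. The problem is that it is unconditional: it fires whenever the range is used up, whether or not the millisecond has changed, so the sequence can restart inside the same timestamp and reissue values already used there.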

2. CAS Competition

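AtomicInteger makes each update atomic, and the CAS loop hands every value in a cycle to exactly one thread: a thread that loses the CAS re-reads the counter and retries with the new value. What the loop does not protect is the pairing of timestamp and sequence. generate() reads System.currentTimeMillis() first and calls getNextId() afterwards, with no lock around the two steps. A thread that is descheduled between them resumes with a stale timestamp and can pick up a sequence number that was already issued with that same timestamp one full cycle earlier. Heavy contention makes such stalls more likely.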
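To check what the CAS loop does guarantee, the following sketch copies getNextId() into a hypothetical CasLoopDemo class and calls it 8,000 times from 8 threads. The class name, thread count, and call count are assumptions chosen for illustration. Because 8,000 calls fit inside one counter cycle of 8,002 values, every returned value should be distinct regardless of how the threads interleave.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; getNextId() is copied from the generator under review.
final class CasLoopDemo {
    private static final AtomicInteger count = new AtomicInteger(1000);

    static int getNextId() {
        for (;;) {
            int current = count.get();
            int next = (current > 9000) ? 1000 : current + 1;
            if (count.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    // Starts `threads` workers that each take `callsPerThread` values and
    // returns how many distinct values were handed out in total.
    static int distinctValues(int threads, int callsPerThread) {
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    seen.add(getNextId());
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // 8 x 1000 = 8000 calls, fewer than the 8002 values in one cycle.
        System.out.println(distinctValues(8, 1000));
    }
}
```

The distinct count equals the number of calls, which suggests that lost CAS attempts cost retries but do not by themselves produce repeated sequence numbers within a cycle.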

3. Time‑Window Collisions

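Within a single millisecond many threads can call getNextId(). Once one process requests more than 8,002 IDs in the same millisecond, the counter wraps and the same timestamp-plus-sequence pair is emitted again, because nothing in the reset logic checks whether the clock has advanced. The static initializer adds a further risk: if the IP lookup fails, IP_16 silently stays "ffffffff", so uniqueness across affected hosts then rests on timestamp, sequence, and PID alone.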
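The effect of exhausting the range inside one millisecond can be reproduced without any threads by freezing the timestamp. The sketch below is a hypothetical SameMillisDemo class: the fixed timestamp, the IP string "c0a80001", and the PID "4242" are placeholder values, and only getNextId() and the concatenation order are taken from the original code.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: every ID is built with the same timestamp, as if all
// calls happened within one millisecond on one host and one process.
final class SameMillisDemo {
    private static final AtomicInteger count = new AtomicInteger(1000);

    static int getNextId() {
        for (;;) {
            int current = count.get();
            int next = (current > 9000) ? 1000 : current + 1;
            if (count.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    // Same concatenation order as the original getTraceId(); PID is passed in.
    static String traceId(String ip, long timestamp, int nextId, String pid) {
        return new StringBuilder(30)
                .append(ip).append(timestamp).append(nextId).append(pid)
                .toString();
    }

    // Generates `total` IDs with one frozen timestamp and counts the repeats.
    static int duplicatesWithinOneMillisecond(int total) {
        long frozen = 1_700_000_000_000L; // placeholder timestamp
        Set<String> seen = new HashSet<>();
        int duplicates = 0;
        for (int i = 0; i < total; i++) {
            String id = traceId("c0a80001", frozen, getNextId(), "4242");
            if (!seen.add(id)) {
                duplicates++;
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        System.out.println(duplicatesWithinOneMillisecond(10_000));
    }
}
```

With 10,000 requests against a cycle of 8,002 values, the last 1,998 IDs repeat earlier ones; with 8,002 requests or fewer there are no repeats.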

Improvement Strategies

1. Timestamp‑Based Reset

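Introduce a lastTimestamp field and restart the sequence only when the clock moves to a new millisecond. The timestamp that goes into the ID and the sequence number must be obtained under the same lock; if getNextId() reads the clock on its own while generate() reads it separately, the two values can come from different milliseconds and a stale thread can even move lastTimestamp backwards. The version below therefore folds both steps into a synchronized generate(), replaces the AtomicInteger with a plain int guarded by that lock, and waits for the next millisecond when the range is used up instead of wrapping.

private static long lastTimestamp = -1L;
private static int sequence = 1000;    // guarded by the class lock

// Replaces generate() and getNextId(); the UUID fallback is omitted for brevity.
public static synchronized String generate() {
    long now = System.currentTimeMillis();
    if (now < lastTimestamp) {
        now = lastTimestamp;           // clock stepped back: keep using the last millisecond
    }
    if (now > lastTimestamp) {
        sequence = 1000;               // new millisecond: restart the sequence
    } else if (++sequence > 9000) {
        // Range used up in this millisecond: wait for the clock instead of wrapping.
        while ((now = System.currentTimeMillis()) <= lastTimestamp) {
            Thread.onSpinWait();       // Java 9+; an empty loop body also works
        }
        sequence = 1000;
    }
    lastTimestamp = now;
    return getTraceId(IP_16, now, sequence);
}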

2. Expand the Counter Range

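Use a larger range (e.g., 0-100,000) to reduce the probability of exhausting the sequence within a millisecond. Because the ID is built by string concatenation, pad the sequence to a fixed width once its digit count can vary; otherwise sequence 12 followed by PID 345 yields the same characters as sequence 123 followed by PID 45.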
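A wider range interacts with the separator-free concatenation in getTraceId(). The sketch below uses a hypothetical FixedWidthDemo class to contrast plain concatenation with a zero-padded sequence; the six-digit width is an assumption sized for the 0-100,000 example, and the PID is passed as a parameter for illustration.

```java
import java.util.Locale;

// Hypothetical helper showing why a variable-width sequence needs padding.
final class FixedWidthDemo {
    // Variable-width concatenation, in the style of the original getTraceId().
    static String plain(long timestamp, int seq, int pid) {
        return "" + timestamp + seq + pid;
    }

    // Sequence zero-padded to 6 digits, enough for 0..100000.
    static String padded(long timestamp, int seq, int pid) {
        return String.format(Locale.ROOT, "%d%06d%d", timestamp, seq, pid);
    }

    public static void main(String[] args) {
        long t = 1_700_000_000_000L; // placeholder timestamp
        // Two different (sequence, PID) pairs from two processes on one host.
        System.out.println(plain(t, 12, 345).equals(plain(t, 123, 45)));
        System.out.println(padded(t, 12, 345).equals(padded(t, 123, 45)));
    }
}
```

The original 1000 to 9001 range avoids this ambiguity only because every value happens to have four digits.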

3. Time‑Window Retry

If the counter is exhausted in the current millisecond, hold the calling thread (spin or sleep briefly) until the clock reaches the next millisecond, then restart the sequence and build the ID with the new timestamp.
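A minimal sketch of the waiting step, assuming a hypothetical helper named waitNextMillis and a caller that already holds whatever lock protects the sequence. It busy-waits, which is usually acceptable for a wait of under one millisecond but keeps the lock held for that time.

```java
// Hypothetical helper; the name and the busy-wait strategy are assumptions.
final class NextMillis {
    // Returns the first clock reading that is later than lastTimestamp.
    static long waitNextMillis(long lastTimestamp) {
        long now = System.currentTimeMillis();
        while (now <= lastTimestamp) {
            Thread.onSpinWait(); // Java 9+ hint that this is a spin loop
            now = System.currentTimeMillis();
        }
        return now;
    }

    public static void main(String[] args) {
        long last = System.currentTimeMillis();
        long next = waitNextMillis(last);
        System.out.println(next > last);
    }
}
```

If the system clock is set backwards by a large amount, this loop waits for the whole gap, so a production version would likely cap the wait or fail fast.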

4. Adopt a Real Snowflake Algorithm

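For distributed systems, switch to a full Snowflake implementation. The commonly described layout packs a 41-bit millisecond timestamp, a 10-bit node identifier (often split into a 5-bit data-center ID and a 5-bit worker ID), and a 12-bit sequence into one 64-bit value. IDs are unique as long as every node has a distinct identifier and the clock does not move backwards; no manual counter reset is needed.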
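The following is a compact sketch of that layout, not a drop-in replacement for a maintained library. The class name SnowflakeSketch, the custom epoch (2020-01-01T00:00:00Z), and the choice to throw when the clock moves backwards are all assumptions made for illustration.

```java
// Sketch of the commonly described Snowflake layout:
// 1 unused sign bit | 41-bit timestamp | 5-bit data-center ID | 5-bit worker ID | 12-bit sequence.
final class SnowflakeSketch {
    private static final long EPOCH = 1_577_836_800_000L; // assumed custom epoch
    private static final int SEQUENCE_BITS = 12;
    private static final int WORKER_BITS = 5;
    private static final int DATACENTER_BITS = 5;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;            // 4095
    private static final int WORKER_SHIFT = SEQUENCE_BITS;                          // 12
    private static final int DATACENTER_SHIFT = SEQUENCE_BITS + WORKER_BITS;        // 17
    private static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS;  // 22

    private final long datacenterId;
    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long datacenterId, long workerId) {
        if (datacenterId < 0 || datacenterId >= (1L << DATACENTER_BITS)
                || workerId < 0 || workerId >= (1L << WORKER_BITS)) {
            throw new IllegalArgumentException("datacenterId and workerId must be in 0..31");
        }
        this.datacenterId = datacenterId;
        this.workerId = workerId;
    }

    // Timestamp and sequence are read and updated under one lock.
    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            throw new IllegalStateException(
                    "clock moved backwards by " + (lastTimestamp - now) + " ms");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // 4096 IDs already issued in this millisecond: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastTimestamp) {
                    Thread.onSpinWait();
                }
            }
        } else {
            sequence = 0; // new millisecond: restart the sequence
        }
        lastTimestamp = now;
        return ((now - EPOCH) << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }
}
```

A single instance per process, for example new SnowflakeSketch(1, 1), yields strictly increasing IDs; how data-center and worker IDs are assigned uniquely across machines is left open here and is the part that needs operational care.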

Comparison with Snowflake

Similarity: Both use a timestamp and an incrementing sequence.

Differences:

Snowflake adds machine and data‑center identifiers for distributed uniqueness.

Snowflake's 41-bit timestamp, counted in milliseconds from a custom epoch, spans about 69 years, while the custom code appends the raw System.currentTimeMillis() value as decimal digits with no fixed bit budget.

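Snowflake resets the sequence only after the millisecond advances, and waits for the next millisecond when its 4,096 sequence values are used up, so the sequence never wraps inside one timestamp as the reset to 1000 does.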

Snowflake produces a 64‑bit structured ID; the custom generator simply concatenates strings, limiting flexibility.
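The practical benefit of a structured 64-bit ID is that its fields can be recovered with shifts and masks. The sketch below assumes the commonly described layout (41-bit timestamp, 5-bit data-center ID, 5-bit worker ID, 12-bit sequence) and a custom epoch of 2020-01-01T00:00:00Z; the class name SnowflakeFields and the epoch are illustrative assumptions.

```java
// Hypothetical helper that packs and unpacks a Snowflake-style 64-bit ID.
final class SnowflakeFields {
    static final long EPOCH = 1_577_836_800_000L; // assumed custom epoch

    static long compose(long timestampMillis, long datacenterId, long workerId, long sequence) {
        return ((timestampMillis - EPOCH) << 22)
                | (datacenterId << 17)
                | (workerId << 12)
                | sequence;
    }

    static long timestampMillis(long id) { return (id >>> 22) + EPOCH; }
    static long datacenterId(long id)    { return (id >>> 17) & 0x1F; }
    static long workerId(long id)        { return (id >>> 12) & 0x1F; }
    static long sequence(long id)        { return id & 0xFFF; }

    public static void main(String[] args) {
        long id = compose(1_700_000_000_000L, 3, 7, 42);
        System.out.println(timestampMillis(id) + " " + datacenterId(id)
                + " " + workerId(id) + " " + sequence(id));
    }
}
```

The concatenated string ID carries the same kinds of information, but without fixed field widths it cannot be split back into its parts reliably.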

Conclusion

The presented generator borrows ideas from Snowflake but lacks a sequence that is tied to the timestamp, handling for sequence exhaustion, and a machine identifier that is guaranteed to be set, which leaves it open to duplicate IDs under very high per-millisecond load or in distributed environments. Applying the suggested improvements, or switching to a proven Snowflake library, removes these duplication paths, provided node identifiers are unique and clock rollback is handled.
