Why Your Snowflake‑Like ID Generator May Duplicate IDs Under High Concurrency
This article analyzes a custom TraceIdGenerator that combines an IP-based prefix with a simple AtomicInteger counter, shows how the counter's reset logic can produce duplicate IDs under heavy concurrent load, and proposes timestamp-based resets, a larger counter range, waiting for the next millisecond, and a true Snowflake implementation as remedies.
In one project, log trace IDs are generated by a custom TraceIdGenerator that mimics Snowflake by concatenating the machine IP, the current timestamp, a sequence number produced by an AtomicInteger, and the process ID. During a code review, concerns were raised that the sequence may repeat under high concurrency.
Code Overview
    import java.util.concurrent.atomic.AtomicInteger;

    public class TraceIdGenerator {
        // ... methods to obtain IP and PID omitted for brevity ...
        private static String IP_16 = "ffffffff";
        private static AtomicInteger count = new AtomicInteger(1000);

        static {
            try {
                String ipAddress = getInetAddress();
                if (ipAddress != null) {
                    IP_16 = getIP_16(ipAddress);
                }
            } catch (Throwable e) {
                // Ignored: IP_16 keeps its default value.
            }
        }

        // Concatenates the fields without separators: IP, timestamp, sequence, PID.
        private static String getTraceId(String ip, long timestamp, int nextId) {
            return new StringBuilder(30).append(ip).append(timestamp).append(nextId).append(getPID()).toString();
        }

        public static String generate() {
            try {
                // The timestamp is read first, then the sequence; the two reads are not linked.
                return getTraceId(IP_16, System.currentTimeMillis(), getNextId());
            } catch (Throwable e) {
                // java.util.UUID has no fastUUID(); this is presumably a utility class such as Hutool's UUID.
                return UUID.fastUUID().toString();
            }
        }

        // CAS loop: increments the counter and wraps back to 1000 once it exceeds 9000.
        private static int getNextId() {
            for (;;) {
                int current = count.get();
                int next = (current > 9000) ? 1000 : current + 1;
                if (count.compareAndSet(current, next)) {
                    return next;
                }
            }
        }
    }
Root Causes of Duplicate IDs
1. Reset Mechanism
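The counter cycles through the 8,002 values from 1000 to 9001 (the check is current > 9000, so 9001 is still handed out before the wrap). The CAS loop itself is sound: if several threads read the same current, only one compareAndSet succeeds and the others retry against the updated value, so no value is issued twice within a single cycle. The weakness is that the wrap back to 1000 is not tied to the timestamp. Once more than 8,002 IDs carry the same millisecond, the counter starts re-issuing values it has already paired with that timestamp, and identical IDs result.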
2. CAS Competition
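AtomicInteger guarantees that each update is atomic; it does not guarantee that a (timestamp, sequence) pair is unique. In generate(), System.currentTimeMillis() is evaluated before getNextId(), and nothing links the two reads. A thread that is delayed between them, whether by a GC pause, by preemption, or by repeated CAS retries under contention, writes an old timestamp next to a sequence value taken later. If the counter has wrapped in the meantime, that pair may already have been issued.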
3. Time‑Window Collisions
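Within a single millisecond, many threads can call getNextId(). The counter has no notion of which millisecond it is serving: it neither restarts when the timestamp advances nor waits when its range is used up, so a sufficiently large burst receives the same sequence number twice under one timestamp. Because System.currentTimeMillis() reads the wall clock, a backward clock adjustment can also bring an already used timestamp back.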
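The wrap-around can be reproduced without any threads at all. The sketch below copies the getNextId() loop into a hypothetical WrapAroundDemo class, pins the timestamp to a constant (standing in for a burst of calls that all land in the same millisecond), and counts how many generated strings collide. The IP, PID, and timestamp values are placeholders, not taken from the original code.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; only the getNextId() loop mirrors the generator above.
class WrapAroundDemo {
    private static final AtomicInteger count = new AtomicInteger(1000);

    // Same logic as the generator's getNextId().
    static int getNextId() {
        for (;;) {
            int current = count.get();
            int next = (current > 9000) ? 1000 : current + 1;
            if (count.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    // Generates `calls` IDs that all share one timestamp and returns how many were duplicates.
    static int countDuplicates(int calls) {
        count.set(1000);                      // start from the generator's initial state
        long fixedTimestamp = 1700000000000L; // placeholder millisecond
        Set<String> seen = new HashSet<>();
        int duplicates = 0;
        for (int i = 0; i < calls; i++) {
            // "c0a80001" and "4242" are placeholder IP and PID values.
            String id = "c0a80001" + fixedTimestamp + getNextId() + "4242";
            if (!seen.add(id)) {
                duplicates++;
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        System.out.println(countDuplicates(8002)); // prints 0: one full cycle is collision-free
        System.out.println(countDuplicates(8003)); // prints 1: call 8003 repeats the value from call 1
    }
}
```

A single thread cannot really produce 8,003 IDs inside one millisecond this way, which is why the timestamp is fixed by hand; on a busy multi-core host the same situation can arise from many threads at once.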
Improvement Strategies
1. Timestamp‑Based Reset
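Introduce a lastTimestamp field and restart the sequence only when the millisecond advances. The timestamp has to be read under the same lock that updates the sequence, and generate() has to embed that same timestamp in the ID. If the two are read separately, a thread holding an old timestamp can still receive a value from the restarted sequence and reproduce an ID that was already issued. Because every call now takes the lock, a plain int replaces the AtomicInteger.
    private static long lastTimestamp = -1L;
    private static int sequence = 1000;

    // Returns {timestamp, sequence}; generate() must build the ID from both values.
    private static synchronized long[] nextTimestampAndId() {
        long now = System.currentTimeMillis();
        if (now > lastTimestamp) {
            // The millisecond advanced: restart the sequence.
            lastTimestamp = now;
            sequence = 1000;
        } else if (sequence < 9000) {
            // Same millisecond, or the clock stepped back: keep counting under lastTimestamp.
            sequence++;
        } else {
            // Range used up within one millisecond (see strategies 2 and 3).
            // generate() catches this and falls back to a UUID.
            throw new IllegalStateException("sequence exhausted for timestamp " + lastTimestamp);
        }
        return new long[] { lastTimestamp, sequence };
    }
2. Expand the Counter Range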
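Use a larger range (e.g., 0‑100,000) so that the sequence is less likely to be exhausted within a single millisecond. This lowers the probability of a wrap but does not remove it, so it works best in combination with one of the other strategies.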
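One detail to watch when widening the range: the generator concatenates its fields without separators, so the sequence needs a fixed width or the boundary between sequence and PID becomes ambiguous. A minimal sketch, assuming a six-digit zero-padded sequence; the class name, the width, and the upper bound are illustrative choices and do not come from the original code.

```java
import java.util.Locale;

// Hypothetical helper; the names and the 0..999,999 range are assumptions.
class PaddedSequence {
    static final int MAX_SEQUENCE = 999_999;

    // Formats the sequence as exactly six digits, e.g. 42 -> "000042".
    static String format(int sequence) {
        if (sequence < 0 || sequence > MAX_SEQUENCE) {
            throw new IllegalArgumentException("sequence out of range: " + sequence);
        }
        // Locale.ROOT keeps the digits ASCII regardless of the JVM's default locale.
        return String.format(Locale.ROOT, "%06d", sequence);
    }

    // Same field order as the generator: IP, timestamp, sequence, PID.
    static String traceId(String ip, long timestamp, int sequence, String pid) {
        return ip + timestamp + format(sequence) + pid;
    }
}
```

Without padding, sequence 1000 followed by PID 12 and sequence 10001 followed by PID 2 both end in "100012". With the fixed width they end in "00100012" and "0100012" respectively, so two processes on the same host can no longer produce the same string this way.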
3. Time‑Window Retry
If the counter is exhausted in the current millisecond, pause the thread until the next millisecond before generating a new ID.
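The waiting step could look like the sketch below. It is an illustration under stated assumptions, not the project's code: the WaitingSequence class and its method names are hypothetical, the 1000 to 9000 range is kept from the original generator, and Thread.onSpinWait() requires Java 9 or later.

```java
// Hypothetical sketch; class and method names are assumptions.
class WaitingSequence {
    private static final int MIN_SEQUENCE = 1000;
    private static final int MAX_SEQUENCE = 9000;

    private long lastTimestamp = -1L;
    private int sequence = MIN_SEQUENCE;

    // Returns {timestamp, sequence}; each pair is issued at most once per instance.
    synchronized long[] next() {
        long now = System.currentTimeMillis();
        if (now > lastTimestamp) {
            // New millisecond: restart the sequence.
            lastTimestamp = now;
            sequence = MIN_SEQUENCE;
        } else if (sequence < MAX_SEQUENCE) {
            // Same millisecond (or the clock stepped back): keep counting.
            sequence++;
        } else {
            // Range exhausted: block until the clock moves past lastTimestamp.
            lastTimestamp = waitUntilAfter(lastTimestamp);
            sequence = MIN_SEQUENCE;
        }
        return new long[] { lastTimestamp, sequence };
    }

    // Busy-waits until the wall clock reports a value greater than `timestamp`.
    private static long waitUntilAfter(long timestamp) {
        long now = System.currentTimeMillis();
        while (now <= timestamp) {
            Thread.onSpinWait();
            now = System.currentTimeMillis();
        }
        return now;
    }
}
```

The wait happens while the lock is held, which is deliberate: no other thread can obtain a sequence value for the exhausted millisecond. The cost is that a large backward clock adjustment would stall callers until the clock catches up, so a production version would likely cap the wait and fail instead.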
4. Adopt a Real Snowflake Algorithm
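For distributed systems, switch to a full Snowflake implementation. The commonly described layout packs a 41‑bit timestamp (milliseconds since a custom epoch), a 5‑bit data‑center ID, a 5‑bit worker ID, and a 12‑bit sequence into a single 64‑bit long. IDs are unique as long as every process is assigned a distinct data‑center and worker ID pair and backward clock movement is handled; no manual reset of the sequence is needed.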
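For orientation, a Snowflake-style generator can be sketched in a few dozen lines. This is a minimal illustration of the commonly described layout, not a drop-in replacement for a maintained library: the class name is hypothetical, the epoch is an arbitrary choice, and how data-center and worker IDs get assigned to processes is left open.

```java
// Hypothetical sketch of the layout:
// 41-bit timestamp | 5-bit data-center ID | 5-bit worker ID | 12-bit sequence.
class SnowflakeSketch {
    private static final long EPOCH = 1577836800000L; // 2020-01-01T00:00:00Z, an arbitrary choice

    private static final int DATACENTER_BITS = 5;
    private static final int WORKER_BITS = 5;
    private static final int SEQUENCE_BITS = 12;

    private static final long MAX_DATACENTER = (1L << DATACENTER_BITS) - 1; // 31
    private static final long MAX_WORKER = (1L << WORKER_BITS) - 1;         // 31
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;    // 4095

    private static final int WORKER_SHIFT = SEQUENCE_BITS;                         // 12
    private static final int DATACENTER_SHIFT = SEQUENCE_BITS + WORKER_BITS;       // 17
    private static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS; // 22

    private final long datacenterId;
    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long datacenterId, long workerId) {
        if (datacenterId < 0 || datacenterId > MAX_DATACENTER) {
            throw new IllegalArgumentException("datacenterId must be in 0.." + MAX_DATACENTER);
        }
        if (workerId < 0 || workerId > MAX_WORKER) {
            throw new IllegalArgumentException("workerId must be in 0.." + MAX_WORKER);
        }
        this.datacenterId = datacenterId;
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            // Clock moved backwards: refuse rather than risk re-issuing an ID.
            throw new IllegalStateException("clock moved backwards by " + (lastTimestamp - now) + " ms");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // All 4096 values used in this millisecond: wait for the next one.
                while (now <= lastTimestamp) {
                    now = System.currentTimeMillis();
                }
            }
        } else {
            // New millisecond: the sequence restarts at 0.
            sequence = 0L;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (workerId << WORKER_SHIFT)
                | sequence;
    }
}
```

Calling new SnowflakeSketch(3, 7).nextId() repeatedly yields strictly increasing longs. Uniqueness across processes depends entirely on no two of them sharing the same (datacenterId, workerId) pair, which this sketch does not enforce.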
Comparison with Snowflake
Similarity: Both use a timestamp and an incrementing sequence.
Differences:
Snowflake adds machine and data‑center identifiers for distributed uniqueness.
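Snowflake stores the timestamp as a 41‑bit offset from a custom epoch, which covers about 69 years in a fixed width; the custom code appends the raw System.currentTimeMillis() value as decimal text.
Snowflake restarts the sequence at 0 only when the millisecond advances and waits for the next millisecond once all 4,096 values are used, which avoids the wrap‑to‑1000 problem.
Snowflake produces a 64‑bit structured ID that fits in a long and sorts roughly by time; the custom generator concatenates variable‑length strings, so its IDs are larger and harder to parse back into fields.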
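The capacity figures behind this comparison follow directly from the bit widths. The short sketch below works them out; the class name is hypothetical, and the 10 machine bits are the 5 data-center bits and 5 worker bits of the commonly described layout taken together.

```java
// Hypothetical helper that derives capacity figures from the Snowflake bit widths.
class SnowflakeCapacity {
    static final int TIMESTAMP_BITS = 41;
    static final int MACHINE_BITS = 10;  // 5 data-center bits + 5 worker bits
    static final int SEQUENCE_BITS = 12;

    // Number of years that 2^41 milliseconds cover (using 365.25-day years).
    static double timestampYears() {
        double millis = (double) (1L << TIMESTAMP_BITS);
        double millisPerYear = 1000.0 * 60 * 60 * 24 * 365.25;
        return millis / millisPerYear;
    }

    // Number of distinct generator processes the layout can address.
    static long machines() {
        return 1L << MACHINE_BITS;
    }

    // Number of IDs one generator can issue per millisecond before it has to wait.
    static long idsPerMillisecondPerMachine() {
        return 1L << SEQUENCE_BITS;
    }

    public static void main(String[] args) {
        System.out.printf("%.2f years%n", timestampYears());
        System.out.println(machines() + " machines");
        System.out.println(idsPerMillisecondPerMachine() + " IDs per ms per machine");
    }
}
```

For comparison, the custom generator's counter offers 8,002 values per millisecond per process, which is more than Snowflake's 4,096; the difference that matters is that Snowflake waits when the range is used up, while the custom counter silently wraps.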
Conclusion
The presented generator borrows ideas from Snowflake but lacks critical mechanisms such as explicitly assigned machine IDs, a sequence reset tied to the timestamp, and handling for sequence exhaustion, which leaves it open to duplicate IDs in multi‑threaded or distributed environments. Applying the suggested improvements, or switching to a proven Snowflake library, addresses the duplication risks described above.
