Choosing the Right Unique ID Strategy: From Auto‑Increment to Snowflake
This article reviews common system‑wide unique ID generation techniques—including database auto‑increment, UUIDs, Redis counters, Twitter's Snowflake, Zookeeper sequences, and MongoDB ObjectId—detailing their advantages, drawbacks, optimization tips, and providing C# code examples for implementation.
1. Database Auto‑Increment Sequence or Field
The most common method uses the database's auto-increment feature, which guarantees uniqueness within a table, or across the whole database when a single shared sequence is used.
Advantages:
Simple, easy to code, acceptable performance.
Numeric IDs are naturally ordered, which helps pagination and sorted results.
Disadvantages:
Different databases have different syntax; migrations or multi‑database support require extra handling.
In a single‑database or master‑slave setup only the master can generate IDs, creating a single‑point‑of‑failure risk.
Scaling performance can be difficult when requirements exceed the database’s capabilities.
Merging systems or data migration becomes painful.
Sharding or partitioning adds complexity.
Optimization:
When multiple master databases exist, assign each master a distinct start value and the same step size equal to the number of masters (e.g., Master 1 generates 1, 4, 7…, Master 2 generates 2, 5, 8…, etc.). This yields unique IDs across the cluster and reduces load on each database.
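The start-and-step scheme can be checked with a few lines of arithmetic. The sketch below is only a simulation: SteppedSequenceDemo and Generate are hypothetical names, and no database is involved. In MySQL, this kind of setup is typically configured through the auto_increment_offset and auto_increment_increment system variables.

```csharp
using System;
using System.Collections.Generic;

public static class SteppedSequenceDemo
{
    // Returns the first `count` IDs for a master whose sequence starts at
    // `start` and advances by `step` (the total number of masters).
    public static List<long> Generate(long start, long step, int count)
    {
        var ids = new List<long>(count);
        for (int i = 0; i < count; i++)
            ids.Add(start + i * step);
        return ids;
    }

    public static void Main()
    {
        const int masters = 3;
        var seen = new HashSet<long>();
        for (int m = 1; m <= masters; m++)
        {
            List<long> ids = Generate(m, masters, 4);
            // Master 1: 1, 4, 7, 10 / Master 2: 2, 5, 8, 11 / Master 3: 3, 6, 9, 12
            Console.WriteLine("Master " + m + ": " + string.Join(", ", ids));
            foreach (long id in ids)
                if (!seen.Add(id))
                    throw new InvalidOperationException("collision on " + id);
        }
    }
}
```

Because every master uses the same step and a different remainder modulo that step, the sequences can never overlap. The cost is that adding a master later means changing the step on all of them.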
2. UUID
UUIDs are another widely used option. They can be generated by either the database or the application and are globally unique.
Advantages:
Simple and easy to code.
Very good generation performance, rarely a bottleneck.
Globally unique, which simplifies data migration, system merging, or database changes.
Disadvantages:
No natural ordering; cannot guarantee monotonic increase.
Usually stored as strings, leading to slower query performance.
Large storage footprint; may be problematic for massive datasets.
Increases data transmission size.
Not human‑readable.
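The storage cost is easy to see by comparing the text and binary forms of a GUID with a 64-bit integer. This is a minimal sketch; GuidSizeDemo and its helper methods are illustrative names, and the actual on-disk size depends on the column type and encoding the database uses.

```csharp
using System;

public static class GuidSizeDemo
{
    // Length of the canonical hyphenated text form.
    public static int TextLength(Guid id) { return id.ToString().Length; }

    // Length of the raw binary form.
    public static int BinaryLength(Guid id) { return id.ToByteArray().Length; }

    public static void Main()
    {
        Guid id = Guid.NewGuid();
        Console.WriteLine(id);
        Console.WriteLine("text chars:   " + TextLength(id));   // 36
        Console.WriteLine("binary bytes: " + BinaryLength(id)); // 16
        Console.WriteLine("bigint bytes: " + sizeof(long));     // 8
    }
}
```

Storing the 16-byte binary form rather than the 36-character string reduces the footprint, but it is still twice the size of a 64-bit numeric key.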
3. UUID Variants
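To improve readability, a UUID can be converted to a 64-bit integer. This keeps only the first 8 of the GUID's 16 bytes, so collisions become more likely than with the full value, and the result may be negative: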
/// <summary>
/// Convert a new GUID to a 64-bit integer by taking its first 8 bytes.
/// </summary>
public static long GuidToInt64()
{
    byte[] bytes = Guid.NewGuid().ToByteArray();
    return BitConverter.ToInt64(bytes, 0);
}
The COMB algorithm (combined GUID/timestamp) retains 10 bytes of the GUID and uses the remaining 6 bytes to store the generation timestamp, providing ordered GUIDs:
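/// <summary>
/// Generate a new Guid using the COMB algorithm.
/// </summary>
private Guid GenerateComb()
{
    byte[] guidArray = Guid.NewGuid().ToByteArray();
    DateTime baseDate = new DateTime(1900, 1, 1);
    // UtcNow avoids backward jumps at daylight-saving transitions.
    DateTime now = DateTime.UtcNow;
    // Days since 1900-01-01, and time of day in units of 1/300 second
    // (the resolution of SQL Server's datetime type).
    TimeSpan days = new TimeSpan(now.Ticks - baseDate.Ticks);
    TimeSpan msecs = now.TimeOfDay;
    byte[] daysArray = BitConverter.GetBytes(days.Days);
    byte[] msecsArray = BitConverter.GetBytes((long)(msecs.TotalMilliseconds / 3.333333));
    // BitConverter is little-endian on common platforms; reverse to big-endian
    // so the most significant bytes come first.
    Array.Reverse(daysArray);
    Array.Reverse(msecsArray);
    // Overwrite the last 6 bytes: 2 bytes of days followed by 4 bytes of time.
    Array.Copy(daysArray, daysArray.Length - 2, guidArray, guidArray.Length - 6, 2);
    Array.Copy(msecsArray, msecsArray.Length - 4, guidArray, guidArray.Length - 4, 4);
    return new Guid(guidArray);
}
Testing shows the COMB-generated IDs are ordered by time, while plain GUIDs are not. Note that the timestamp sits in the last 6 bytes, so the ordering shows up where those bytes are compared first, as in SQL Server's uniqueidentifier sort order.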
4. Redis ID Generation
If database performance is insufficient, Redis can generate IDs using its atomic INCR or INCRBY commands. Several Redis nodes can be used for higher throughput: each node is initialized with a different start value and the same step size. For example, with 5 nodes, start values 1 to 5, and step 5, node 1 yields 1, 6, 11…, node 2 yields 2, 7, 12…, and so on, so the sequences never overlap.
Advantages:
Independent of the database, flexible, and generally faster.
Numeric IDs are naturally ordered.
Disadvantages:
Introducing Redis adds a new component and increases system complexity.
Configuration and coding effort are relatively high.
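The multi-node scheme can be sketched without a running Redis server. In the simulation below, FakeRedisNode is a hypothetical in-memory stand-in whose IncrBy method mimics the atomic INCRBY command using Interlocked.Add. With a real client the call would go to Redis instead; StackExchange.Redis, for example, exposes this as IDatabase.StringIncrement.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// In-memory stand-in for one Redis node (an assumption for illustration only).
public sealed class FakeRedisNode
{
    private long value;

    // Seed the counter so that the first IncrBy(step) returns `start`.
    public FakeRedisNode(long start, long step) { value = start - step; }

    // Atomically adds `step` and returns the new value, like INCRBY.
    public long IncrBy(long step) { return Interlocked.Add(ref value, step); }
}

public static class RedisStepDemo
{
    // Draws idsPerNode IDs from each of nodeCount nodes in parallel and
    // returns how many distinct IDs were produced.
    public static int CountDistinct(int nodeCount, int idsPerNode)
    {
        var nodes = new FakeRedisNode[nodeCount];
        for (int i = 0; i < nodeCount; i++)
            nodes[i] = new FakeRedisNode(i + 1, nodeCount);

        var seen = new ConcurrentDictionary<long, bool>();
        Parallel.For(0, nodeCount * idsPerNode, n =>
        {
            long id = nodes[n % nodeCount].IncrBy(nodeCount);
            seen.TryAdd(id, true);
        });
        return seen.Count;
    }

    public static void Main()
    {
        // 5 nodes, start values 1..5, step 5: all 5000 IDs should be distinct.
        Console.WriteLine(CountDistinct(5, 1000));
    }
}
```

The atomic increment is what makes each node safe under concurrent callers. The distinct start values are what keep the nodes from colliding with each other.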
5. Twitter Snowflake Algorithm
Snowflake is an open-source distributed ID generator from Twitter that produces a 64-bit long. Its layout, from the most significant bit to the least, is:
1 sign bit (always 0).
41 bits for timestamp (milliseconds since a custom epoch, enough for roughly 69 years).
10 bits for machine identifier (5 bits data center, 5 bits worker).
12 bits for a per-millisecond sequence (up to 4096 IDs per node per ms).
C# implementation:
using System;

public class IdWorker
{
    private long workerId;
    private long datacenterId;
    private long sequence = 0L;
    // Custom epoch in Unix milliseconds.
    private static long twepoch = 1288834974657L;
    private static long workerIdBits = 5L;
    private static long datacenterIdBits = 5L;
    // -1L ^ (-1L << n) yields n one-bits, i.e. the largest n-bit value.
    private static long maxWorkerId = -1L ^ (-1L << (int)workerIdBits);
    private static long maxDatacenterId = -1L ^ (-1L << (int)datacenterIdBits);
    private static long sequenceBits = 12L;
    private static long workerIdShift = sequenceBits;
    private static long datacenterIdShift = sequenceBits + workerIdBits;
    private static long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;
    private static long sequenceMask = -1L ^ (-1L << (int)sequenceBits);
    private long lastTimestamp = -1L;
    // One lock per instance: each worker guards only its own state.
    private readonly object syncRoot = new object();

    public IdWorker(long workerId, long datacenterId)
    {
        // string.Format uses {0} placeholders, not the Java-style %d.
        if (workerId > maxWorkerId || workerId < 0)
            throw new ArgumentException(string.Format("worker Id can't be greater than {0} or less than 0", maxWorkerId));
        if (datacenterId > maxDatacenterId || datacenterId < 0)
            throw new ArgumentException(string.Format("datacenter Id can't be greater than {0} or less than 0", maxDatacenterId));
        this.workerId = workerId;
        this.datacenterId = datacenterId;
    }

    public long nextId()
    {
        lock (syncRoot)
        {
            long timestamp = timeGen();
            if (timestamp < lastTimestamp)
                throw new ApplicationException(string.Format("Clock moved backwards. Refusing to generate id for {0} milliseconds", lastTimestamp - timestamp));
            if (lastTimestamp == timestamp)
            {
                // Same millisecond: bump the sequence; on overflow wait for the next one.
                sequence = (sequence + 1) & sequenceMask;
                if (sequence == 0)
                    timestamp = tilNextMillis(lastTimestamp);
            }
            else
                sequence = 0L;
            lastTimestamp = timestamp;
            return ((timestamp - twepoch) << (int)timestampLeftShift) |
                   (datacenterId << (int)datacenterIdShift) |
                   (workerId << (int)workerIdShift) |
                   sequence;
        }
    }

    // Spin until the clock advances past lastTimestamp.
    protected long tilNextMillis(long lastTimestamp)
    {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp)
            timestamp = timeGen();
        return timestamp;
    }

    // Current time as milliseconds since the Unix epoch.
    protected long timeGen()
    {
        return (long)(DateTime.UtcNow - new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc)).TotalMilliseconds;
    }
}
Test code runs two workers in parallel and checks for duplicate IDs:
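// Requires: using System; using System.Collections.Generic; using System.Threading;
private static void TestIdWorker()
{
    HashSet<long> set = new HashSet<long>();
    IdWorker idWorker1 = new IdWorker(0, 0);
    IdWorker idWorker2 = new IdWorker(1, 0);
    // Thread.Abort is obsolete and unsupported on .NET Core and later,
    // so the workers are stopped cooperatively with a cancellation token.
    CancellationTokenSource cts = new CancellationTokenSource();
    Thread t1 = new Thread(() => DoTestIdWorker(idWorker1, set, cts.Token));
    Thread t2 = new Thread(() => DoTestIdWorker(idWorker2, set, cts.Token));
    t1.IsBackground = true;
    t2.IsBackground = true;
    t1.Start();
    t2.Start();
    Thread.Sleep(30000);
    cts.Cancel();
    t1.Join();
    t2.Join();
    Console.WriteLine("done");
}
private static void DoTestIdWorker(IdWorker idWorker, HashSet<long> set, CancellationToken token)
{
    while (!token.IsCancellationRequested)
    {
        long id = idWorker.nextId();
        bool added;
        // HashSet<T> is not thread-safe, so guard the shared set.
        lock (set)
        {
            added = set.Add(id);
        }
        if (!added)
            Console.WriteLine("duplicate:" + id);
        Thread.Sleep(1);
    }
}
Advantages: no database dependency, high performance, IDs increase over time on a single node.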
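Disadvantages: IDs depend on the machine clock. If a node's clock moves backwards, the generator refuses to issue IDs until it catches up, and clock skew between nodes means IDs are only roughly, not strictly, increasing across the cluster.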
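To make the bit layout concrete, the sketch below packs and unpacks the four fields using the same shifts as the IdWorker class above (12, 17, and 22 bits) and the same twepoch value. The SnowflakeDecoder class and its method names are illustrative assumptions, not part of Twitter's code.

```csharp
using System;

public static class SnowflakeDecoder
{
    // Same custom epoch as IdWorker.twepoch (Unix milliseconds).
    private const long Twepoch = 1288834974657L;

    // Packs the fields: timestamp << 22 | datacenter << 17 | worker << 12 | sequence.
    public static long Compose(long timestampMs, long datacenterId, long workerId, long sequence)
    {
        return ((timestampMs - Twepoch) << 22) | (datacenterId << 17) | (workerId << 12) | sequence;
    }

    public static long TimestampMs(long id) { return (id >> 22) + Twepoch; }
    public static long DatacenterId(long id) { return (id >> 17) & 0x1F; }  // 5 bits
    public static long WorkerId(long id) { return (id >> 12) & 0x1F; }      // 5 bits
    public static long Sequence(long id) { return id & 0xFFF; }             // 12 bits

    public static void Main()
    {
        // One second after the custom epoch, data center 3, worker 7, sequence 42.
        long id = Compose(Twepoch + 1000, 3, 7, 42);
        Console.WriteLine("id:         " + id);
        Console.WriteLine("timestamp:  " + TimestampMs(id));
        Console.WriteLine("datacenter: " + DatacenterId(id)); // 3
        Console.WriteLine("worker:     " + WorkerId(id));     // 7
        Console.WriteLine("sequence:   " + Sequence(id));     // 42
    }
}
```

Because the timestamp occupies the highest bits, comparing two IDs numerically compares their creation times first, which is why IDs from a single node increase over time.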
6. ZooKeeper-Based Unique ID
ZooKeeper can generate sequential numbers from the version of a znode, yielding 32-bit or 64-bit sequence numbers. It is rarely used for this purpose because it requires a ZooKeeper cluster and several API calls per ID, which add latency and complexity in high-concurrency scenarios.
7. MongoDB ObjectId
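MongoDB's ObjectId is a 12-byte identifier built on ideas similar to Snowflake. Its classic layout:
First 4 bytes: seconds since the Unix epoch (timestamp).
Next 3 bytes: machine identifier (usually a hash of the host name).
Next 2 bytes: process identifier (PID).
Last 3 bytes: an incrementing counter.
Newer MongoDB versions replace the machine and process fields with a single 5-byte random value generated once per process; the timestamp and counter are unchanged.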
The timestamp at the front makes ObjectIds roughly sortable by creation time, which is useful for indexing.
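Since the timestamp is stored in the leading 4 bytes, the creation time can be recovered from the ID alone. The sketch below parses the first 8 hex characters of an ObjectId string; ObjectIdTimestamp and CreationTimeUtc are hypothetical helper names, and the sample ID is an arbitrary value chosen for illustration. Official MongoDB drivers offer their own accessors for this.

```csharp
using System;
using System.Globalization;

public static class ObjectIdTimestamp
{
    // Reads the leading 4 bytes (8 hex characters) as big-endian seconds
    // since the Unix epoch and converts them to a UTC DateTime.
    public static DateTime CreationTimeUtc(string objectIdHex)
    {
        if (objectIdHex == null || objectIdHex.Length != 24)
            throw new ArgumentException("expected 24 hex characters", "objectIdHex");
        uint seconds = uint.Parse(objectIdHex.Substring(0, 8),
                                  NumberStyles.HexNumber,
                                  CultureInfo.InvariantCulture);
        return new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc).AddSeconds(seconds);
    }

    public static void Main()
    {
        // Sample ObjectId used only for illustration.
        DateTime created = CreationTimeUtc("507f1f77bcf86cd799439011");
        Console.WriteLine(created.ToString("u"));
    }
}
```

The resolution is one second, so ObjectIds created within the same second are ordered only by the remaining bytes, not by exact creation time.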
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
