Distributed ID Generation Schemes and the rpcxio/did Service
This article reviews common ID generation strategies—including UUID/GUID, auto‑increment integers, random numbers, short strings, Twitter's Snowflake, and MongoDB ObjectID—compares their advantages and drawbacks, and introduces the rpcxio/did distributed ID service with performance benchmarks.
In modern computing, unique identifiers (IDs) are essential for locating objects, establishing relationships, and tracking entities across services; examples include national ID cards, order numbers, product SKUs, and social media IDs.
The article first explains UUID/GUID, a 128‑bit identifier standardized by the OSF, detailing its format (8‑4‑4‑4‑12 hexadecimal groups), versioning (1‑5), and the fact that version 1 uses time and MAC address while version 4 is random. Advantages are ease of implementation, near‑zero collision probability, and decentralised generation; disadvantages include poor readability, 16‑byte storage overhead, and potential performance impact on databases.
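As a rough illustration of the random (version 4) variant described above, the following Go sketch fills 16 bytes from the operating system's random source and then sets the version and variant bits. The function name NewUUIDv4 is chosen here for illustration and is not taken from the article.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// NewUUIDv4 returns a random (version 4) UUID in the 8-4-4-4-12 hex layout.
// The name is illustrative; it is not part of any library discussed here.
func NewUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // high nibble of byte 6 carries the version (4)
	b[8] = (b[8] & 0x3f) | 0x80 // top two bits of byte 8 carry the variant (10)
	return fmt.Sprintf("%x-%x-%x-%x-%x",
		b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	id, err := NewUUIDv4()
	if err != nil {
		panic(err)
	}
	fmt.Println(id) // 36 characters: 32 hex digits plus 4 hyphens
}
```

No coordination between nodes is needed, which is the decentralised property noted above; the cost is the 16 bytes per ID.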
It then discusses simple auto‑increment integer IDs, typically provided by relational databases (e.g., MySQL) or NoSQL stores (e.g., Redis). These IDs are easy to generate, human‑readable, and compact (4 bytes), but require a centralised service, expose ordering information, and can become a bottleneck.
Random‑based approaches are introduced next, such as using pseudo‑random numbers, the skip32 algorithm, and hashids (e.g., converting 347 to "yr8"). These methods improve readability and conceal information, yet still need a central service and a two‑step process (increment then encrypt).
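The "increment then encrypt" idea can be sketched with a small reversible permutation over 32-bit integers. The code below is a toy Feistel network written for this illustration: it is not the skip32 algorithm and not hashids, the round function and keys are arbitrary assumptions, and it should not be treated as cryptographically secure. It only shows that a sequential counter can be mapped to a scattered value and mapped back without collisions.

```go
package main

import "fmt"

// round mixes one 16-bit half with a round key. Any deterministic function
// works here; this particular mixing step is an arbitrary choice.
func round(half uint16, key uint32) uint16 {
	x := uint32(half)*0x9E3779B1 ^ key
	x ^= x >> 15
	return uint16(x)
}

// Obfuscate maps a sequential ID to a scattered 32-bit value.
func Obfuscate(id uint32, keys [4]uint32) uint32 {
	l, r := uint16(id>>16), uint16(id)
	for _, k := range keys {
		l, r = r, l^round(r, k)
	}
	return uint32(l)<<16 | uint32(r)
}

// Reveal inverts Obfuscate by running the rounds in reverse order.
func Reveal(code uint32, keys [4]uint32) uint32 {
	l, r := uint16(code>>16), uint16(code)
	for i := len(keys) - 1; i >= 0; i-- {
		l, r = r^round(l, keys[i]), l
	}
	return uint32(l)<<16 | uint32(r)
}

func main() {
	// Hypothetical round keys; in practice these would be kept secret.
	keys := [4]uint32{0xA5A5A5A5, 0x3C3C3C3C, 0x0F0F0F0F, 0x12345678}
	for id := uint32(345); id <= 347; id++ {
		code := Obfuscate(id, keys)
		fmt.Println(id, "->", code, "->", Reveal(code, keys))
	}
}
```

Because the mapping is a bijection, two different counters never produce the same public ID, so no collision check is needed; the counter itself still has to come from a central source.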
Short random strings (e.g., base‑62 or Base58) are also covered, highlighting their compactness (five base‑62 characters cover 62^5, roughly 916 million, distinct IDs) and readability, while noting that randomly generated strings need collision detection.
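To make the capacity figure concrete, here is a minimal base-62 sketch in Go. It encodes an integer rather than drawing random characters, so it is the deterministic cousin of the scheme above and needs no collision check; the alphabet order and the function names are assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// alphabet order is an assumption; libraries differ on digit/letter ordering.
const alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

// EncodeBase62 converts n to its base-62 text form.
func EncodeBase62(n uint64) string {
	if n == 0 {
		return "0"
	}
	var buf [11]byte // 11 base-62 digits are enough for any uint64
	i := len(buf)
	for n > 0 {
		i--
		buf[i] = alphabet[n%62]
		n /= 62
	}
	return string(buf[i:])
}

// DecodeBase62 reverses EncodeBase62. It does not guard against overflow
// on inputs longer than 11 characters.
func DecodeBase62(s string) (uint64, error) {
	var n uint64
	for _, c := range s {
		idx := strings.IndexRune(alphabet, c)
		if idx < 0 {
			return 0, fmt.Errorf("invalid base-62 character %q", c)
		}
		n = n*62 + uint64(idx)
	}
	return n, nil
}

func main() {
	const capacity = 62 * 62 * 62 * 62 * 62 // 916132832 five-character IDs
	fmt.Println(capacity)
	fmt.Println(EncodeBase62(capacity - 1)) // largest five-character value
	fmt.Println(EncodeBase62(capacity))     // first six-character value
}
```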
The widely‑adopted Twitter Snowflake algorithm is described: a 64‑bit ID composed of a reserved sign bit, 41‑bit timestamp (≈69 years), 10‑bit machine identifier (allowing up to 1024 nodes), and 12‑bit sequence (4096 IDs per millisecond). Benefits include small storage (8 bytes), time‑ordered IDs, and high throughput; drawbacks are vulnerability to clock rollback and information leakage.
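The bit layout above can be sketched in Go as follows. This is a minimal illustration of the 41/10/12 split, not Twitter's code: the custom epoch (2020-01-01 UTC), the type and function names, and the choice to return an error on clock rollback are all assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

const (
	workerBits   = 10
	sequenceBits = 12
	maxWorker    = 1<<workerBits - 1   // 1023
	maxSequence  = 1<<sequenceBits - 1 // 4095
	epochMs      = 1577836800000       // assumed custom epoch: 2020-01-01 UTC
)

var ErrClockRollback = errors.New("clock moved backwards")

type Generator struct {
	mu       sync.Mutex
	workerID int64
	lastMs   int64
	sequence int64
	now      func() int64 // current time in ms; replaceable in tests
}

func NewGenerator(workerID int64) (*Generator, error) {
	if workerID < 0 || workerID > maxWorker {
		return nil, errors.New("worker ID out of range")
	}
	clock := func() int64 { return time.Now().UnixNano() / int64(time.Millisecond) }
	return &Generator{workerID: workerID, now: clock}, nil
}

// Next returns the next ID, or ErrClockRollback if time went backwards.
func (g *Generator) Next() (int64, error) {
	g.mu.Lock()
	defer g.mu.Unlock()
	ms := g.now()
	if ms < g.lastMs {
		return 0, ErrClockRollback
	}
	if ms == g.lastMs {
		g.sequence = (g.sequence + 1) & maxSequence
		if g.sequence == 0 {
			// 4096 IDs already issued this millisecond: wait for the next one.
			for ms <= g.lastMs {
				ms = g.now()
			}
		}
	} else {
		g.sequence = 0
	}
	g.lastMs = ms
	return (ms-epochMs)<<(workerBits+sequenceBits) | g.workerID<<sequenceBits | g.sequence, nil
}

func main() {
	g, err := NewGenerator(5)
	if err != nil {
		panic(err)
	}
	id, err := g.Next()
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```

Refusing to issue IDs after a rollback is one possible policy; waiting until the clock catches up is another. Either way, the timestamp sits in the high bits, which is what makes the IDs sortable by time and also what leaks the creation time.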
MongoDB ObjectID is presented next: a 12‑byte scheme encoding a 4‑byte timestamp, 3‑byte machine ID, 2‑byte process ID, and 3‑byte increment. It offers good readability and performance, but at 12 bytes it consumes more space than Snowflake's 8 and carries the same clock‑rollback risk.
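A short Go sketch can show how those fields are laid out inside the 24-character hex form. It follows the 4/3/2/3 layout described above (newer MongoDB drivers treat the middle five bytes as a single random value), assumes big-endian byte order for the process ID, and uses a struct and function name invented for this example rather than any driver API.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"errors"
	"fmt"
	"time"
)

// ObjectIDParts holds the fields of the legacy 4/3/2/3 layout.
type ObjectIDParts struct {
	Timestamp time.Time
	Machine   [3]byte
	PID       uint16
	Counter   uint32
}

// ParseObjectID splits a 24-character hex ObjectID into its fields.
func ParseObjectID(s string) (ObjectIDParts, error) {
	var p ObjectIDParts
	raw, err := hex.DecodeString(s)
	if err != nil {
		return p, err
	}
	if len(raw) != 12 {
		return p, errors.New("ObjectID must be 12 bytes (24 hex characters)")
	}
	// Bytes 0-3: seconds since the Unix epoch, big-endian.
	p.Timestamp = time.Unix(int64(binary.BigEndian.Uint32(raw[0:4])), 0).UTC()
	// Bytes 4-6: machine identifier.
	copy(p.Machine[:], raw[4:7])
	// Bytes 7-8: process ID (byte order assumed big-endian here).
	p.PID = binary.BigEndian.Uint16(raw[7:9])
	// Bytes 9-11: 3-byte counter, big-endian.
	p.Counter = uint32(raw[9])<<16 | uint32(raw[10])<<8 | uint32(raw[11])
	return p, nil
}

func main() {
	p, err := ParseObjectID("507f1f77bcf86cd799439011")
	if err != nil {
		panic(err)
	}
	fmt.Println(p.Timestamp, hex.EncodeToString(p.Machine[:]), p.PID, p.Counter)
}
```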
Finally, the article introduces the open‑source distributed ID generator rpcxio/did, which builds on Snowflake while allowing custom bit allocations for worker IDs and sequences. It operates as a centralised service that supports batch ID retrieval to reduce network overhead and can be clustered for fault tolerance. Its nodes must be kept synchronised with a time server, because a clock that drifts or jumps backwards can lead to duplicate IDs.
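The trade-off behind a custom bit allocation can be worked out with a few lines of Go. This sketch is not rpcxio/did's configuration API: the Layout type, its field names, and the assumption that 63 usable bits are shared between a millisecond timestamp, the worker ID, and the sequence are all made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Layout is a hypothetical description of how 63 usable bits are divided.
type Layout struct {
	WorkerBits   uint
	SequenceBits uint
}

// Validate rejects layouts that leave no room for the timestamp.
func (l Layout) Validate() error {
	if l.WorkerBits+l.SequenceBits >= 63 {
		return errors.New("worker and sequence bits leave no room for a timestamp")
	}
	return nil
}

// TimestampBits is whatever remains after the worker and sequence fields.
func (l Layout) TimestampBits() uint { return 63 - l.WorkerBits - l.SequenceBits }

// MaxWorkers is the number of distinct worker IDs.
func (l Layout) MaxWorkers() uint64 { return 1 << l.WorkerBits }

// IDsPerMillisecond is the per-worker ceiling within one millisecond.
func (l Layout) IDsPerMillisecond() uint64 { return 1 << l.SequenceBits }

// LifetimeYears is how long a millisecond timestamp lasts before overflowing.
func (l Layout) LifetimeYears() float64 {
	ms := float64(uint64(1) << l.TimestampBits())
	return ms / (1000 * 60 * 60 * 24 * 365)
}

func main() {
	layouts := []Layout{
		{WorkerBits: 10, SequenceBits: 12}, // the classic Snowflake split
		{WorkerBits: 8, SequenceBits: 16},  // fewer workers, more IDs per ms
	}
	for _, l := range layouts {
		if err := l.Validate(); err != nil {
			panic(err)
		}
		fmt.Printf("%+v: %d timestamp bits, %d workers, %d IDs/ms, %.1f years\n",
			l, l.TimestampBits(), l.MaxWorkers(), l.IDsPerMillisecond(), l.LifetimeYears())
	}
}
```

Every bit moved into the worker or sequence field halves the lifetime of the timestamp, so a layout is a choice between cluster size, burst rate, and how long the scheme lasts.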
Performance tests show that a single node can generate 120 k IDs/second when requesting one ID at a time, and up to 2.97 M IDs/second when fetching batches of 100 IDs. Example benchmark commands and results are shown below:
1. With 256 concurrent clients each requesting a single ID per call, the service generates about 120,000 IDs per second.
./bclient -addr 192.168.15.225:8972 -n 100000
total IDs: 25600000, duration: 3m31.581592489s, id/s: 120993
2. With batch retrieval to minimise network overhead, 256 concurrent clients each requesting 100 IDs per call generate about 2.97 million IDs per second.
./bclient -addr 192.168.15.225:8972 -n 1000000 -b 100
total IDs: 256000000, duration: 1m26.178942509s, id/s: 2970563
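The gap between the two runs comes from amortising one network round trip over many IDs. The Go sketch below shows the idea on the client side with a local buffer; it is not the rpcxio/did client. FetchFunc, BatchClient, and the in-memory fake service in main are hypothetical stand-ins for a real remote call.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// FetchFunc stands in for one network round trip that returns n fresh IDs.
// It is a hypothetical placeholder, not an rpcxio/did function.
type FetchFunc func(n int) ([]int64, error)

// BatchClient hands out IDs one at a time but fetches them in batches.
type BatchClient struct {
	mu        sync.Mutex
	fetch     FetchFunc
	batchSize int
	buf       []int64
}

func NewBatchClient(fetch FetchFunc, batchSize int) *BatchClient {
	if batchSize < 1 {
		batchSize = 1
	}
	return &BatchClient{fetch: fetch, batchSize: batchSize}
}

// Next returns a buffered ID, refilling the buffer only when it is empty.
func (c *BatchClient) Next() (int64, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if len(c.buf) == 0 {
		ids, err := c.fetch(c.batchSize)
		if err != nil {
			return 0, err
		}
		if len(ids) == 0 {
			return 0, errors.New("service returned an empty batch")
		}
		c.buf = ids
	}
	id := c.buf[0]
	c.buf = c.buf[1:]
	return id, nil
}

func main() {
	var next int64
	trips := 0
	// Fake service: counts round trips and returns consecutive integers.
	fake := func(n int) ([]int64, error) {
		trips++
		ids := make([]int64, n)
		for i := range ids {
			next++
			ids[i] = next
		}
		return ids, nil
	}
	c := NewBatchClient(fake, 100)
	for i := 0; i < 1000; i++ {
		if _, err := c.Next(); err != nil {
			panic(err)
		}
	}
	fmt.Println("round trips:", trips) // prints "round trips: 10"
}
```

One caveat of buffering: IDs that are still in the buffer when a client exits are simply lost, which leaves gaps but does not cause duplicates.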
IT Architects Alliance
