Mastering Distributed ID Generation: Snowflake, Custom ID Generators, and Base62 Conversion

This article explores the challenges of generating globally unique, trend‑ordered IDs in distributed systems, compares database auto‑increment, UUID and ID‑grouping approaches, explains Twitter's Snowflake algorithm, provides a full Java implementation with Base62 conversion utilities, and introduces the Vesta ID‑generator framework.

Java Backend Technology

Introduction

In the previous article we discussed converting a long URL to a short one by using an ID generator that produces a unique integer and then converting it to base‑62. This article dives deeper into what an ID generator is, its principles, and how to implement one.

1. Starting from the Database Primary Key

1.1 Single‑node database

When traffic is low, a single database server can satisfy the demand. The primary key is usually a BIGINT UNSIGNED column with AUTO_INCREMENT. This guarantees uniqueness and monotonic increase with a fixed step size, although gaps can still appear (for example, after a rolled‑back insert).

Figure: Single‑node auto‑increment schema

However, when the system scales and we need sharding or multiple databases, this approach fails because each shard would generate overlapping IDs.

Imagine each province maintains its own database with a User table using auto‑increment IDs starting from 1. Merging all provinces into a central database would cause primary‑key collisions.

1.2 Database cluster / sharding

When a table is split across multiple machines, the auto‑increment feature can no longer guarantee global uniqueness. The following diagram shows a User table with 1 million rows distributed over two databases; each database has its own auto‑increment IDs, but there is no global ordering.

Figure: Sharding example

To solve this we consider several alternatives:

Use UUID – globally unique, but 128 bits long and (for random UUIDs) unordered, so it makes a large primary key and index inserts become inefficient.

ID grouping – assign each database a distinct auto_increment start value and a shared step. This preserves uniqueness, but IDs are no longer globally monotonic, and every node must be reconfigured manually when a node is added.
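The ID‑grouping idea can be illustrated with a small simulation. This is only a sketch: the class GroupedIdAllocator and its offset and step parameters are hypothetical names that stand in for each database's auto‑increment start value and step (in MySQL these roughly correspond to the auto_increment_offset and auto_increment_increment system variables).

```java
import java.util.ArrayList;
import java.util.List;

/** Simulates one shard's auto-increment counter (hypothetical helper, not from the article). */
public class GroupedIdAllocator {
    private final long step; // shared step, normally the number of shards
    private long next;       // next ID this shard will hand out

    public GroupedIdAllocator(long offset, long step) {
        if (step < 1 || offset < 1 || offset > step) {
            throw new IllegalArgumentException("require 1 <= offset <= step");
        }
        this.step = step;
        this.next = offset;
    }

    /** Returns the current ID and advances by the shared step. */
    public synchronized long nextId() {
        long id = next;
        next += step;
        return id;
    }

    public static void main(String[] args) {
        // Two shards: same step (2), different start values (1 and 2).
        GroupedIdAllocator shard1 = new GroupedIdAllocator(1, 2);
        GroupedIdAllocator shard2 = new GroupedIdAllocator(2, 2);
        List<Long> a = new ArrayList<>();
        List<Long> b = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            a.add(shard1.nextId());
            b.add(shard2.nextId());
        }
        System.out.println("shard1: " + a); // shard1: [1, 3, 5]
        System.out.println("shard2: " + b); // shard2: [2, 4, 6]
    }
}
```

The two sequences never overlap, but adding a third shard means changing the step on every existing shard, which is the operational cost mentioned above.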

2. Snowflake Overview

Twitter's Snowflake algorithm generates 64‑bit IDs composed of:

1 bit sign (always 0 for positive IDs)

41 bits timestamp (millisecond precision, offset from a custom epoch, lasting ~69 years)

10 bits node identifier (5 bits data‑center, 5 bits machine, supporting up to 1024 nodes)

12 bits sequence number (up to 4096 IDs per millisecond per node)

The IDs are roughly time‑ordered, and uniqueness is ensured by the combination of timestamp, data‑center, machine, and sequence.
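The bit layout above can be made concrete with a few shifts and masks. The following is a minimal sketch, assuming the 41/5/5/12 split described in this section; the class SnowflakeLayout and its method names are illustrative and not part of any published implementation.

```java
/** Packs and unpacks the Snowflake fields (illustrative sketch, names are assumptions). */
public class SnowflakeLayout {
    static final long SEQUENCE_BITS = 12L;
    static final long MACHINE_BITS = 5L;
    static final long DATA_CENTER_BITS = 5L;

    // Each field is shifted left by the total width of the fields to its right.
    static final long MACHINE_SHIFT = SEQUENCE_BITS;                          // 12
    static final long DATA_CENTER_SHIFT = SEQUENCE_BITS + MACHINE_BITS;       // 17
    static final long TIMESTAMP_SHIFT = DATA_CENTER_SHIFT + DATA_CENTER_BITS; // 22

    // Masks holding the largest value each field can store.
    static final long MAX_SEQUENCE = ~(-1L << SEQUENCE_BITS);       // 4095
    static final long MAX_MACHINE = ~(-1L << MACHINE_BITS);         // 31
    static final long MAX_DATA_CENTER = ~(-1L << DATA_CENTER_BITS); // 31

    /** Combines the four fields into one 64-bit ID; elapsedMillis is time since the custom epoch. */
    static long pack(long elapsedMillis, long dataCenterId, long machineId, long sequence) {
        return (elapsedMillis << TIMESTAMP_SHIFT)
                | (dataCenterId << DATA_CENTER_SHIFT)
                | (machineId << MACHINE_SHIFT)
                | sequence;
    }

    static long elapsedMillis(long id) { return id >>> TIMESTAMP_SHIFT; }
    static long dataCenterId(long id)  { return (id >>> DATA_CENTER_SHIFT) & MAX_DATA_CENTER; }
    static long machineId(long id)     { return (id >>> MACHINE_SHIFT) & MAX_MACHINE; }
    static long sequence(long id)      { return id & MAX_SEQUENCE; }

    public static void main(String[] args) {
        long id = pack(1L, 2L, 3L, 4L);
        System.out.println(id);                // 4468740
        System.out.println(dataCenterId(id));  // 2
        System.out.println(machineId(id));     // 3
        System.out.println(sequence(id));      // 4

        // Capacity checks for the figures quoted above.
        long millisPerYear = 1000L * 60 * 60 * 24 * 365;
        System.out.println((1L << 41) / millisPerYear);               // 69 (whole years)
        System.out.println(1L << (DATA_CENTER_BITS + MACHINE_BITS));  // 1024 nodes
        System.out.println(MAX_SEQUENCE + 1);                         // 4096 IDs per millisecond
    }
}
```

Because the timestamp occupies the most significant bits, comparing two IDs numerically compares their timestamps first, which is why the IDs come out roughly time‑ordered.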

2.1 Snowflake Implementation

public class SnowFlake {
    private static final long START_TIMESTAMP = 1480166465631L;
    private static final long SEQUENCE_BIT = 12L;
    private static final long MACHINE_BIT = 5L;
    private static final long DATA_CENTER_BIT = 5L;
    // max values, left shifts, etc.
    // constructor validates dataCenterId and machineId
    // nextId() generates the 64‑bit ID
    // getNextMill() and getNewTimeStamp() handle clock moves
}
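The outline above elides the method bodies. One way the pieces might fit together is sketched below. The constants mirror the outline, but the class name SnowflakeSketch, the helper waitForNextMillis, and the choice to throw on a backwards clock are assumptions made for illustration, not the original author's code.

```java
/** A minimal Snowflake-style generator (sketch under the assumptions stated above). */
public class SnowflakeSketch {
    private static final long START_TIMESTAMP = 1480166465631L; // custom epoch from the outline

    private static final long SEQUENCE_BIT = 12L;
    private static final long MACHINE_BIT = 5L;
    private static final long DATA_CENTER_BIT = 5L;

    private static final long MAX_SEQUENCE = ~(-1L << SEQUENCE_BIT);
    private static final long MAX_MACHINE_NUM = ~(-1L << MACHINE_BIT);
    private static final long MAX_DATA_CENTER_NUM = ~(-1L << DATA_CENTER_BIT);

    private static final long MACHINE_LEFT = SEQUENCE_BIT;
    private static final long DATA_CENTER_LEFT = SEQUENCE_BIT + MACHINE_BIT;
    private static final long TIMESTAMP_LEFT = DATA_CENTER_LEFT + DATA_CENTER_BIT;

    private final long dataCenterId;
    private final long machineId;
    private long sequence = 0L;       // counter within the current millisecond
    private long lastTimestamp = -1L; // millisecond of the most recent ID

    public SnowflakeSketch(long dataCenterId, long machineId) {
        if (dataCenterId < 0 || dataCenterId > MAX_DATA_CENTER_NUM) {
            throw new IllegalArgumentException("dataCenterId must be in [0, " + MAX_DATA_CENTER_NUM + "]");
        }
        if (machineId < 0 || machineId > MAX_MACHINE_NUM) {
            throw new IllegalArgumentException("machineId must be in [0, " + MAX_MACHINE_NUM + "]");
        }
        this.dataCenterId = dataCenterId;
        this.machineId = machineId;
    }

    /** Returns the next ID; synchronized so one instance can be shared across threads. */
    public synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            // Assumed policy: fail fast rather than risk issuing a duplicate.
            throw new IllegalStateException("Clock moved backwards; refusing to generate an ID");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0L) {
                // All 4096 values used in this millisecond: wait for the next one.
                now = waitForNextMillis(lastTimestamp);
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = now;
        return ((now - START_TIMESTAMP) << TIMESTAMP_LEFT)
                | (dataCenterId << DATA_CENTER_LEFT)
                | (machineId << MACHINE_LEFT)
                | sequence;
    }

    /** Busy-waits until the system clock passes the given millisecond. */
    private static long waitForNextMillis(long last) {
        long now = System.currentTimeMillis();
        while (now <= last) {
            now = System.currentTimeMillis();
        }
        return now;
    }

    public static void main(String[] args) {
        SnowflakeSketch generator = new SnowflakeSketch(2, 3);
        for (int i = 0; i < 3; i++) {
            System.out.println(generator.nextId());
        }
    }
}
```

Throwing when the clock moves backwards is the simplest policy; a production system might instead wait out small regressions, but that choice depends on how the hosts synchronize time.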

The implementation can be adapted: you may allocate fewer bits to the data‑center if not needed, or use all 10 bits for the machine identifier.

3. Implementing a Custom ID Generator

The following class combines Snowflake ID generation with a Base‑62 conversion to produce short URLs.

public class SnowFlakeShortUrl {
    // same constants as SnowFlake
    // constructor takes dataCenterId and machineId
    // nextId() returns a long ID
    // main() demonstrates generating IDs and converting them
}

Sample output (decimal → base‑62):

Decimal: 185894506410029056  Base‑62 short address: dJoJ1Xyo3C
Base‑62 short address: dJoJ1Xyo3C  Decimal: 185894506410029056
...

4. Base‑62 Conversion Utility

The NumericConvertUtils class provides two static methods:

toOtherNumberSystem(long number, int seed) – converts a decimal number to the specified base (up to 62).

toDecimalNumber(String number, int seed) – converts a string in the given base back to decimal.

public class NumericConvertUtils {
    private static final char[] digits = {
        '0','1','2','3','4','5','6','7','8','9',
        'a','b','c','d','e','f','g','h','i','j','k','l','m',
        'n','o','p','q','r','s','t','u','v','w','x','y','z',
        'A','B','C','D','E','F','G','H','I','J','K','L','M',
        'N','O','P','Q','R','S','T','U','V','W','X','Y','Z'};
    // conversion methods as described above
}
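Since the conversion methods are only described above, here is a minimal sketch of how a Base‑62 round trip might look. It is not the NumericConvertUtils source: the class Base62Sketch and its encode and decode methods are illustrative names, the base is fixed at 62, and the digit order (0‑9, a‑z, A‑Z) is taken from the digits table above.

```java
/** Base-62 encoding and decoding for non-negative longs (illustrative sketch). */
public class Base62Sketch {
    // Same digit order as the table above: 0-9, then a-z, then A-Z.
    private static final String DIGITS =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static final int BASE = 62;

    /** Converts a non-negative decimal number to its Base-62 string. */
    static String encode(long number) {
        if (number < 0) {
            throw new IllegalArgumentException("number must be non-negative");
        }
        if (number == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (number > 0) {
            // The remainder is the least significant digit, so digits come out reversed.
            sb.append(DIGITS.charAt((int) (number % BASE)));
            number /= BASE;
        }
        return sb.reverse().toString();
    }

    /** Converts a Base-62 string back to decimal; throws on invalid digits or overflow. */
    static long decode(String text) {
        if (text == null || text.isEmpty()) {
            throw new IllegalArgumentException("text must not be empty");
        }
        long result = 0L;
        for (int i = 0; i < text.length(); i++) {
            int digit = DIGITS.indexOf(text.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException("invalid Base-62 digit: " + text.charAt(i));
            }
            result = Math.addExact(Math.multiplyExact(result, BASE), digit);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(encode(61));   // Z
        System.out.println(encode(62));   // 10
        System.out.println(decode("ZZ")); // 3843
        long id = 185894506410029056L;
        System.out.println(decode(encode(id)) == id); // true
    }
}
```

A 64‑bit positive ID needs at most 11 Base‑62 characters, which is what makes this encoding attractive for short URLs.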

5. Vesta Framework Introduction

Vesta is a generic ID generator (often called a unified ID allocator). Its IDs are globally unique and approximately ordered. They are also reversible (an ID can be decoded back into its component fields) and manufacturable (an ID can be constructed from given field values). Vesta supports three deployment modes: embedded, central server, and REST. It offers two ID types, one tuned for peak throughput and one for finer time granularity, and is designed for high performance, high availability, and scalability.

For detailed design and usage, refer to the repositories:

Gitee: https://gitee.com/robertleepeak/vesta-id-generator

GitHub: https://github.com/cloudatee/vesta-id-generator

Further articles will dive into Vesta's architecture.
