Understanding the Snowflake Algorithm for Distributed Unique ID Generation

This article explains the background of migrating from MySQL to TiDB, introduces the Snowflake algorithm’s 64‑bit ID structure, discusses its advantages and disadvantages, provides a Python implementation, and highlights its impact on achieving globally unique, time‑ordered identifiers in distributed systems.

360 Quality & Efficiency
Background: In previous projects the team used MySQL because it is lightweight and fast, but as data volume grew MySQL could no longer meet the requirements, prompting a migration to the TiDB distributed database. TiDB's auto‑increment primary keys are unique, but they are not guaranteed to be consecutive or globally ordered across TiDB nodes, so the team chose the Snowflake algorithm for ID generation.

Algorithm Principle: Snowflake, designed by Twitter, generates 64‑bit unique IDs suitable for distributed systems. Each node can generate tens of thousands of IDs per second, and the IDs are roughly ordered by creation time.

ID Structure: The 64‑bit ID consists of a sign bit (unused), 41 bits for a millisecond‑level timestamp (allowing 69 years of range), 5 bits for datacenter ID, 5 bits for worker ID, and 12 bits for a sequence number within the same millisecond. The total length fits into a Java Long type, and the bit allocation can be customized as needed.
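The layout above can be sketched as plain bit shifts. This is a minimal illustration assuming the standard 41/5/5/12 split; `pack_id` and `unpack_id` are hypothetical helper names, not part of any library.

```python
# Standard layout: 1 unused sign bit, 41 timestamp bits,
# 5 datacenter bits, 5 worker bits, 12 sequence bits.
TIMESTAMP_BITS = 41
DATACENTER_BITS = 5
WORKER_BITS = 5
SEQUENCE_BITS = 12

# Each field is shifted left past all the fields below it.
WORKER_SHIFT = SEQUENCE_BITS
DATACENTER_SHIFT = SEQUENCE_BITS + WORKER_BITS
TIMESTAMP_SHIFT = SEQUENCE_BITS + WORKER_BITS + DATACENTER_BITS


def _check(name, value, bits):
    """Reject values that do not fit in the given number of bits."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name} must fit in {bits} bits, got {value}")


def pack_id(elapsed_ms, datacenter_id, worker_id, sequence):
    """Combine the four fields into one integer (hypothetical helper)."""
    _check("elapsed_ms", elapsed_ms, TIMESTAMP_BITS)
    _check("datacenter_id", datacenter_id, DATACENTER_BITS)
    _check("worker_id", worker_id, WORKER_BITS)
    _check("sequence", sequence, SEQUENCE_BITS)
    return (
        (elapsed_ms << TIMESTAMP_SHIFT)
        | (datacenter_id << DATACENTER_SHIFT)
        | (worker_id << WORKER_SHIFT)
        | sequence
    )


def unpack_id(snowflake_id):
    """Split an ID back into (elapsed_ms, datacenter_id, worker_id, sequence)."""
    return (
        snowflake_id >> TIMESTAMP_SHIFT,
        (snowflake_id >> DATACENTER_SHIFT) & ((1 << DATACENTER_BITS) - 1),
        (snowflake_id >> WORKER_SHIFT) & ((1 << WORKER_BITS) - 1),
        snowflake_id & ((1 << SEQUENCE_BITS) - 1),
    )


# 41 bits of milliseconds, expressed in 365-day years.
MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365
timestamp_range_years = (1 << TIMESTAMP_BITS) / MS_PER_YEAR
```

With every field at its maximum the result is exactly the largest signed 64‑bit value, which is why the ID fits into a Java Long, and `timestamp_range_years` works out to a little over 69.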

Advantages: 1. Roughly increasing: the timestamp occupies the high bits and the sequence number the low bits, so IDs sort approximately by creation time. 2. High performance with no single point of failure: IDs are generated locally, without calls to an external service. 3. Flexible: the bit widths of the individual fields can be adjusted to fit different scenarios.
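The ordering property in the first point can be checked with a few lines of Python. This sketch assumes the standard 41/5/5/12 layout; `compose` is a hypothetical helper, and the sample events are made up for illustration.

```python
def compose(elapsed_ms, datacenter_id, worker_id, sequence):
    """Pack fields using the standard shifts (22, 17, 12, 0)."""
    return (elapsed_ms << 22) | (datacenter_id << 17) | (worker_id << 12) | sequence


# Illustrative events: (elapsed_ms, datacenter_id, worker_id, sequence).
events = [
    (1000, 0, 1, 0),
    (1000, 0, 1, 1),
    (1001, 0, 0, 0),
    (1000, 1, 0, 0),
]

ids = [compose(*event) for event in events]

# Sorting by ID groups events by timestamp first, because the
# timestamp sits in the most significant bits.
ordered = sorted(zip(ids, events))
timestamps_in_id_order = [event[0] for _, event in ordered]

# A later millisecond always wins, even against the largest possible
# datacenter, worker, and sequence values of the previous millisecond.
later_beats_earlier = compose(1001, 0, 0, 0) > compose(1000, 31, 31, 4095)
```

Within a single millisecond, IDs from different nodes are ordered by datacenter and worker ID rather than by true creation time, which is why the ordering is only approximate.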

Disadvantages: 1. IDs are not consecutive; there are gaps between them. 2. Generation rules (e.g., the starting sequence) cannot be fully controlled. 3. Strong dependence on the machine clock: if the clock is rolled back, the generator may issue duplicate IDs or has to stop serving until the clock catches up.
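One common way to deal with the clock dependence is to remember the last timestamp used and refuse to generate IDs when the clock reports an earlier time. The following is a minimal sketch under that assumption, using the standard 5/5/12 layout; the class name, the exception, and the injectable `clock` parameter are all hypothetical and exist only for illustration and testing.

```python
import threading
import time


class ClockMovedBackwards(Exception):
    """Raised when the clock reports a time before the last ID's timestamp."""


class SnowflakeGenerator:
    """Illustrative stateful generator; not the article's implementation."""

    def __init__(self, datacenter_id, worker_id, epoch_ms=1586448000000, clock=None):
        if not 0 <= datacenter_id < 32 or not 0 <= worker_id < 32:
            raise ValueError("datacenter_id and worker_id must be in 0..31")
        self.datacenter_id = datacenter_id
        self.worker_id = worker_id
        self.epoch_ms = epoch_ms
        # The clock is injectable so rollback can be simulated in tests.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Failing here trades availability for uniqueness.
                raise ClockMovedBackwards(
                    f"clock is {self._last_ms - now} ms behind the last ID"
                )
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & 0xFFF
                if self._sequence == 0:
                    # 4096 IDs already issued this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - self.epoch_ms) << 22)
                | (self.datacenter_id << 17)
                | (self.worker_id << 12)
                | self._sequence
            )
```

Raising an exception is the "service disruption" side of the trade-off; waiting until the clock catches up is an alternative when the rollback is small.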

Implementation Code (Python):

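# Custom epoch in milliseconds (2020-04-10 00:00:00 UTC+8).
# It must never change once IDs have been issued.
twepoch = 1586448000000

# This example customizes the bit allocation: 5 datacenter bits,
# 15 worker bits and 2 sequence bits instead of the standard 5/5/12.
datacenter_id_bits = 5
worker_id_bits = 15
sequence_id_bits = 2

# Despite the max_ prefix, these are the number of distinct values per field
# (used as moduli below); the largest valid value is one less.
max_datacenter_id = 1 << datacenter_id_bits
max_worker_id = 1 << worker_id_bits
max_sequence_id = 1 << sequence_id_bits
# 63 rather than 64: the sign bit stays unused so the ID fits a signed 64-bit integer.
max_timestamp = 1 << (63 - datacenter_id_bits - worker_id_bits - sequence_id_bits)


def make_snowflake(timestamp_ms, datacenter_id, worker_id, sequence_id, twepoch=twepoch):
    """Generate a Twitter-Snowflake-style ID from its components.

    :param timestamp_ms: time since the UNIX epoch in milliseconds
    :param datacenter_id: exec ip
    :param worker_id: process id, max is 32767, min is 0
    :param sequence_id: thread id, max is 3, min is 0
    :param twepoch: start timestamp in milliseconds
    :return: the ID as an int
    """
    # Out-of-range values wrap around via the modulo instead of raising,
    # so callers must keep inputs in range to avoid collisions.
    sid = ((int(timestamp_ms) - twepoch) % max_timestamp) << datacenter_id_bits << worker_id_bits << sequence_id_bits
    sid += (datacenter_id % max_datacenter_id) << worker_id_bits << sequence_id_bits
    sid += (worker_id % max_worker_id) << sequence_id_bits
    sid += sequence_id % max_sequence_id
    return sid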
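
To inspect an ID produced with this customized layout, the shifts can be reversed. The sketch below is self-contained and repeats the article's epoch and bit widths as upper-case constants; `parse_snowflake` and `to_utc` are hypothetical helpers, not part of the original code.

```python
import datetime

# Same values as the implementation above, repeated so this block runs alone.
TWEPOCH = 1586448000000
DATACENTER_ID_BITS = 5
WORKER_ID_BITS = 15
SEQUENCE_ID_BITS = 2


def parse_snowflake(sid, twepoch=TWEPOCH):
    """Return (timestamp_ms, datacenter_id, worker_id, sequence_id)."""
    # Peel the fields off from the least significant end.
    sequence_id = sid & ((1 << SEQUENCE_ID_BITS) - 1)
    sid >>= SEQUENCE_ID_BITS
    worker_id = sid & ((1 << WORKER_ID_BITS) - 1)
    sid >>= WORKER_ID_BITS
    datacenter_id = sid & ((1 << DATACENTER_ID_BITS) - 1)
    sid >>= DATACENTER_ID_BITS
    # What remains is the number of milliseconds since the custom epoch.
    return sid + twepoch, datacenter_id, worker_id, sequence_id


def to_utc(timestamp_ms):
    """Convert a millisecond UNIX timestamp to an aware UTC datetime."""
    return datetime.datetime.fromtimestamp(
        timestamp_ms / 1000, tz=datetime.timezone.utc
    )


# An ID built by hand: 1000 ms after the epoch, datacenter 3,
# worker 12345, sequence 1.
example_id = (1000 << 22) | (3 << 17) | (12345 << 2) | 1
fields = parse_snowflake(example_id)
```

Decoding is mainly useful for debugging, for example to find out which process issued an ID and when.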

Effect: After adopting the Snowflake algorithm, the service’s IDs remain time‑ordered and globally unique, simplifying data migration and scaling in distributed deployments.

Conclusion: Snowflake is a widely used algorithm for generating globally unique IDs in distributed systems. Compared with UUIDs, Snowflake IDs are simpler, more compact (64 bits versus 128), and roughly time‑ordered, but developers must handle clock rollback and clock skew between machines.
