
Why ULID Beats UUID: A Deep Dive into Features, Specs, and Python Usage

This article compares UUID and ULID, explains the limitations of UUID versions, details ULID's timestamp‑based, lexicographically sortable design, presents its binary layout and encoding, and shows how to generate and manipulate ULIDs in Python with concrete code examples.

Architect

May 1, 2024

Why Not Choose UUID

RFC 4122 defines five UUID versions. Version 1 combines a timestamp with a stable MAC address, which is often unavailable and is embedded in the identifier, where it can be read or spoofed. Version 2 replaces the low 32 bits of the timestamp with a POSIX UID/GID and inherits the same MAC-address dependency. Version 3 derives the ID from an MD5 hash, so it needs a unique seed, and its randomly distributed output can fragment ordered data structures such as database indexes. Version 4 is purely random and carries no information beyond that randomness. Version 5 uses SHA-1 hashing and suffers the same fragmentation risk as version 3. Version 4 (UUID 4) is the most common, but its random values cannot be sorted by creation time and still carry a non-zero collision probability.

ULID Advantages

ULID combines a millisecond-precision timestamp with 80 bits of randomness, giving 2⁸⁰ ≈ 1.21×10²⁴ possible IDs per millisecond. This makes collisions extremely unlikely while embedding the creation time, which enables time-based sharding and ordering without a separate created_at column. The string representation uses 26 Crockford Base32 characters, shorter and more readable than the 36-character UUID.
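The per-millisecond figure is just the size of an 80-bit space. The sketch below checks that arithmetic and estimates the chance of a collision among n random IDs created in the same millisecond using the standard birthday-bound approximation; collision_probability is a hypothetical helper name, not part of any ULID library.

```python
import math

RANDOM_BITS = 80
ids_per_ms = 2 ** RANDOM_BITS  # size of the random space within one millisecond


def collision_probability(n: int, space: int = ids_per_ms) -> float:
    """Birthday-bound approximation: chance that n uniform random draws collide."""
    return -math.expm1(-n * (n - 1) / (2 * space))


print(f"{ids_per_ms:.3e}")               # 1.209e+24
print(collision_probability(1_000_000))  # on the order of 1e-13
```

Even a million IDs generated inside a single millisecond collide with a probability of roughly 4 in 10 trillion, which is small but not zero.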

ULID Features

128‑bit size, compatible with UUID storage.

1.21×10²⁴ unique IDs per millisecond.

Lexicographically sortable (dictionary order) when represented as a string.

Encoded as 26 URL‑safe, case‑insensitive characters (Crockford’s Base32).

Monotonic ordering for IDs generated within the same millisecond, provided a monotonic generator is used.

ULID Specification

As implemented by the Python library ulid-py, a ULID consists of a 48-bit timestamp (UNIX time in milliseconds) followed by an 80-bit random component. The timestamp field does not overflow until the year 10889.

Components

Timestamp

48‑bit integer representing UNIX time in milliseconds.

Valid up to year 10889, so the identifier space will not be exhausted.
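The year 10889 follows directly from the field width. As a rough check, assuming a mean Gregorian year of 365.2425 days:

```python
MAX_TIMESTAMP_MS = 2 ** 48 - 1               # largest value a 48-bit field can hold
SECONDS_PER_YEAR = 365.2425 * 24 * 60 * 60   # mean Gregorian year (an approximation)

# Years representable after the UNIX epoch, added to 1970
last_year = 1970 + MAX_TIMESTAMP_MS / 1000 / SECONDS_PER_YEAR
print(int(last_year))  # 10889
```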

Randomness

80‑bit random number.

Cryptographically secure generation (e.g., os.urandom()) is recommended.

Sorting

The leftmost characters encode the most significant bits, ensuring lexical order matches chronological order. Within a single millisecond, ordering is not guaranteed unless a monotonic ULID generator is used.
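One common way to get ordering inside a single millisecond is to reuse the previous randomness and add one to it instead of drawing fresh random bytes. The class below is a minimal standard-library sketch of that idea, not the ulid-py implementation; MonotonicUlidSource and its method names are hypothetical, and treating a clock that moves backwards like "same millisecond" is a design choice of this sketch.

```python
import os
import time


class MonotonicUlidSource:
    """Hypothetical helper that returns ULIDs as 128-bit integers in increasing order."""

    def __init__(self):
        self._last_ms = -1
        self._last_rand = 0

    def next_int(self, now_ms=None):
        # now_ms can be injected for testing; otherwise use the wall clock
        ms = int(time.time() * 1000) if now_ms is None else now_ms
        if ms <= self._last_ms:
            # Same millisecond (or clock moved backwards): keep the timestamp
            # and increment the previous randomness by one
            ms = self._last_ms
            self._last_rand += 1
            if self._last_rand >= 1 << 80:
                raise OverflowError("randomness exhausted within one millisecond")
        else:
            # New millisecond: draw 80 fresh random bits
            self._last_rand = int.from_bytes(os.urandom(10), "big")
        self._last_ms = ms
        # Timestamp occupies the top 48 bits, randomness the low 80 bits
        return (ms << 80) | self._last_rand


source = MonotonicUlidSource()
first = source.next_int(now_ms=1_700_000_000_000)
second = source.next_int(now_ms=1_700_000_000_000)
print(second - first)  # 1
```

Because the timestamp sits in the most significant bits, any ID from a later millisecond still compares greater than every ID from an earlier one.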

Encoding

ULID uses Crockford's Base32 alphabet, which omits I, L, O, and U to avoid confusion and abuse:

0123456789ABCDEFGHJKMNPQRSTVWXYZ
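To make the encoding concrete, here is a minimal sketch that converts a 128-bit integer to and from the 26-character form using the alphabet above. encode_ulid and decode_ulid are hypothetical names, and the decoder is deliberately strict: it does not apply Crockford's optional aliases for look-alike characters.

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_ulid(value: int) -> str:
    """Encode a 128-bit integer as 26 Crockford Base32 characters."""
    if not 0 <= value < 1 << 128:
        raise ValueError("value must fit in 128 bits")
    chars = []
    for _ in range(26):
        chars.append(ALPHABET[value & 0x1F])  # take the lowest 5 bits
        value >>= 5
    return "".join(reversed(chars))           # most significant character first


def decode_ulid(text: str) -> int:
    """Decode 26 Crockford Base32 characters back into a 128-bit integer."""
    if len(text) != 26:
        raise ValueError("a ULID string has exactly 26 characters")
    value = 0
    for ch in text.upper():                   # the encoding is case-insensitive
        value = (value << 5) | ALPHABET.index(ch)
    if value >= 1 << 128:
        raise ValueError("value exceeds 128 bits")
    return value


print(encode_ulid(2 ** 128 - 1))  # 7ZZZZZZZZZZZZZZZZZZZZZZZZZ
```

Twenty-six characters carry 130 bits, so the two highest bits are always zero and the largest valid ULID starts with 7.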

Binary Layout

ULID is encoded as 16 octets (bytes) in network byte order (big‑endian):

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     32_bit_uint_time_high                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     16_bit_uint_time_low      |      16_bit_uint_random       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      32_bit_uint_random                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      32_bit_uint_random                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
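The layout above maps naturally onto Python's struct module. The following sketch packs a millisecond timestamp and 10 random bytes into the 16-byte form and splits them apart again; pack_ulid and unpack_ulid are hypothetical helper names used only for illustration.

```python
import struct


def pack_ulid(timestamp_ms: int, randomness: bytes) -> bytes:
    """Lay out a ULID as 16 big-endian bytes: 48-bit time, then 80-bit randomness."""
    if not 0 <= timestamp_ms < 1 << 48:
        raise ValueError("timestamp must fit in 48 bits")
    if len(randomness) != 10:
        raise ValueError("randomness must be exactly 10 bytes")
    time_high = timestamp_ms >> 16     # 32_bit_uint_time_high
    time_low = timestamp_ms & 0xFFFF   # 16_bit_uint_time_low
    # ">" selects big-endian with no padding: I = 4 bytes, H = 2 bytes
    return struct.pack(">IH", time_high, time_low) + randomness


def unpack_ulid(raw: bytes):
    """Split 16 ULID bytes into (timestamp_ms, randomness)."""
    if len(raw) != 16:
        raise ValueError("a binary ULID is exactly 16 bytes")
    time_high, time_low = struct.unpack(">IH", raw[:6])
    return (time_high << 16) | time_low, raw[6:]


raw = pack_ulid(1_700_000_000_000, bytes(range(10)))
print(len(raw))          # 16
print(unpack_ulid(raw))
```

Because the timestamp bytes come first and are big-endian, comparing the raw bytes orders ULIDs the same way as comparing their strings.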

Application Scenarios

Replace auto‑increment primary keys in databases, removing the need for DB‑side ID generation.

In distributed systems, replace UUID with ULID to get globally unique identifiers that sort by creation time at millisecond resolution.

Use the embedded timestamp for time‑based sharding or partitioning of tables.

If millisecond precision is acceptable, sort records directly by ULID instead of a separate created_at column.
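As an illustration of the sharding and sorting scenarios, the sketch below reads the creation time straight out of a ULID string by decoding its first 10 characters, then derives a monthly partition name. The function names and the 'YYYY-MM' naming scheme are assumptions made for this example, not part of any library.

```python
from datetime import datetime, timezone

ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def ulid_timestamp_ms(ulid_str: str) -> int:
    """Decode the first 10 characters (the 48-bit timestamp) into UNIX milliseconds."""
    value = 0
    for ch in ulid_str[:10].upper():
        value = (value << 5) | ALPHABET.index(ch)
    return value


def partition_key(ulid_str: str) -> str:
    """Hypothetical monthly partition name such as '2017-06' (UTC)."""
    created = datetime.fromtimestamp(ulid_timestamp_ms(ulid_str) / 1000, tz=timezone.utc)
    return created.strftime("%Y-%m")


sample = "01BJQE4QTHMFP0S5J153XCFSP9"
print(ulid_timestamp_ms(sample))  # 1497582690129
print(partition_key(sample))      # 2017-06

# Plain string sorting already yields creation order across milliseconds
ids = ["01BJQE4QTHMFP0S5J153XCFSP9", "0000000000ZZZZZZZZZZZZZZZZ"]
print(sorted(ids)[0])             # 0000000000ZZZZZZZZZZZZZZZZ
```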

Python Usage

Install the library:

pip install ulid-py

Create a new ULID:

import ulid
ulid_obj = ulid.new()
print(ulid_obj)  # e.g., 01BJQE4QTHMFP0S5J153XCFSP9

Convert an existing UUID to ULID:

import ulid, uuid
value = uuid.uuid4()
ulid_obj = ulid.from_uuid(value)
print(ulid_obj)

Create a ULID from a specific datetime:

import datetime, ulid
ulid_obj = ulid.from_timestamp(datetime.datetime(1999, 1, 1))
print(ulid_obj)

Generate a ULID from custom randomness:

import os, ulid
randomness = os.urandom(10)
ulid_obj = ulid.from_randomness(randomness)
print(ulid_obj)

Access components of a ULID object:

import ulid
u = ulid.new()
print(u.timestamp())   # Timestamp part (first 48 bits)
print(u.randomness())  # Randomness part (last 80 bits)

Source repository for ulid-py: https://github.com/ahawker/ulid
