Risks of Auto‑Increment IDs and Distributed ID Solutions

The article explains how exposing auto‑increment primary keys can leak business information, illustrates the danger with historical examples, and evaluates alternative ID generation strategies such as encoding, UUIDs, and Snowflake‑style distributed IDs, including performance comparisons in MySQL.

Hujiang Technology

Nov 9, 2016

Inspired by the classic "German tank problem," the author points out that sequential identifiers, such as auto‑increment user IDs, order numbers, or transaction IDs, can unintentionally reveal sensitive business metrics, just as the Allies inferred German tank production from serial numbers.
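To make the inference concrete, here is a minimal sketch of the standard frequentist estimator for the German tank problem, N ≈ m + m/k − 1, where m is the largest ID observed and k is the number of IDs observed. The sample order numbers are hypothetical, and the sketch assumes IDs start at 1, increase by 1, and are sampled uniformly without replacement.

```python
def estimate_total(observed_ids):
    """Estimate the largest ID issued so far from a sample of observed IDs.

    Assumes IDs start at 1, increase by 1, and the sample is drawn
    uniformly without replacement (the German tank problem setting).
    """
    k = len(observed_ids)
    if k == 0:
        raise ValueError("need at least one observed ID")
    m = max(observed_ids)
    # Largest observation plus the average gap between observations.
    return m + m / k - 1


# Hypothetical order numbers an outsider collected from a few purchases.
sample = [1042, 377, 2210, 1890, 960]
print(round(estimate_total(sample)))  # prints 2651
```

Five observed order numbers are enough to produce a usable estimate of total order volume, which is why exposing raw sequential keys is risky.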

Real‑world incidents are cited where exposed IDs allowed journalists or competitors to estimate a company's order volume, leading to valuation drops and stock declines, underscoring the security threat of plain sequential keys.

Simply replacing primary keys is not always feasible because they are deeply referenced as foreign keys across many tables and external systems; extensive changes would incur high development costs.

Two practical mitigation approaches are discussed:

1. Encode IDs in API traffic

Instead of exposing raw IDs, the client sends an encoded value (e.g., Base64), which the server decodes before querying the database. This preserves the existing primary key structure, but the encoded values are harder for humans to read, and every request pays a small encoding/decoding cost.
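The encode/decode round trip might look like the sketch below. The function names `encode_id` and `decode_id` are hypothetical, and the fixed 8-byte width is an assumption chosen to match a 64-bit integer key.

```python
import base64


def encode_id(raw_id: int) -> str:
    """Turn a non-negative 64-bit integer key into a URL-safe token."""
    raw = raw_id.to_bytes(8, "big")
    # Strip '=' padding so the token is cleaner in URLs.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_id(token: str) -> int:
    """Recover the integer key from a token produced by encode_id."""
    padded = token + "=" * (-len(token) % 4)  # restore stripped padding
    raw = base64.urlsafe_b64decode(padded)
    if len(raw) != 8:
        raise ValueError("malformed ID token")
    return int.from_bytes(raw, "big")


token = encode_id(12345)
assert decode_id(token) == 12345
```

Plain Base64 is reversible by anyone who recognizes it, so on its own this only hides the number from casual inspection. Adding a server-side secret to the transformation would be a separate design decision.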

2. Use Distributed IDs

Many databases (especially Oracle) favor globally unique, trend‑ordered identifiers. The article reviews common distributed ID schemes—unsynchronized auto‑increments, batch services, UUID, Snowflake, and Flickr‑style IDs—focusing on UUID and Snowflake.

UUID

UUIDs provide global uniqueness and are typically stored as varchar(32), binary(16), or bigint in MySQL (a single bigint holds only 64 bits, so a full 128‑bit UUID needs two such columns or a truncated value). Performance tests on a 10‑million‑row InnoDB table show that numeric types (including binary‑converted UUIDs) outperform string‑based UUIDs, though readability suffers.
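As a rough illustration of the two storage forms, the sketch below converts a UUID between its 32-character hex form and its 16-byte binary form using Python's standard `uuid` module. The helper names are hypothetical, and the mapping to column types is an assumption about how an application would bind these values.

```python
import uuid


def to_text32(u: uuid.UUID) -> str:
    """32 hex characters without dashes, sized for a 32-character column."""
    return u.hex


def to_binary16(u: uuid.UUID) -> bytes:
    """16 raw bytes, sized for a BINARY(16) column."""
    return u.bytes


def from_binary16(raw: bytes) -> uuid.UUID:
    """Rebuild the UUID from the 16 bytes read back from storage."""
    return uuid.UUID(bytes=raw)


u = uuid.uuid4()
assert len(to_text32(u)) == 32
assert len(to_binary16(u)) == 16
assert from_binary16(to_binary16(u)) == u
```

The binary form is half the size of the hex text, which also shrinks the primary key index.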

Snowflake‑style IDs

Snowflake generates 64‑bit long IDs that embed a timestamp, machine identifier, and sequence number. Their decimal length depends on the time elapsed since the chosen epoch: the article cites roughly 9 to 17 digits, and a signed 64‑bit value can reach 19. These IDs are trend‑ordered, cheap to generate (hundreds of thousands per second), and suitable as MySQL primary keys.
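A minimal sketch of such a generator follows. The bit widths (41-bit millisecond timestamp, 10-bit machine ID, 12-bit sequence) follow the commonly cited Twitter Snowflake layout and are an assumption here, since the article does not state them. The class name and the epoch value are hypothetical.

```python
import threading
import time

EPOCH_MS = 1_400_000_000_000  # arbitrary custom epoch (assumption)
MACHINE_BITS = 10
SEQUENCE_BITS = 12
MAX_MACHINE = (1 << MACHINE_BITS) - 1
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    """Illustrative Snowflake-style ID generator, not a production design."""

    def __init__(self, machine_id: int):
        if not 0 <= machine_id <= MAX_MACHINE:
            raise ValueError("machine_id out of range")
        self._machine_id = machine_id
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    @staticmethod
    def _now_ms() -> int:
        return int(time.time() * 1000)

    def next_id(self) -> int:
        with self._lock:
            now = self._now_ms()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = self._now_ms()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                | (self._machine_id << SEQUENCE_BITS)
                | self._sequence
            )


gen = SnowflakeGenerator(machine_id=1)
print(gen.next_id())
```

Because the timestamp occupies the high bits, IDs from one generator increase over time, which keeps inserts close to the end of a B-tree primary key index.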

The discussion notes that while Snowflake IDs solve uniqueness and ordering, their low‑entropy trailing bits can cause uneven sharding when used for modulo‑based partitioning; adding random bits can mitigate this.
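The skew is easy to reproduce. The sketch below builds Snowflake-style IDs by hand, assuming the same 10-bit machine ID and 12-bit sequence layout, and counts how they fall across 16 shards under `id % 16`. The helper names are hypothetical, and the "random sequence start" variant is just one possible reading of "adding random bits."

```python
import random
from collections import Counter

MACHINE_BITS = 10   # assumed layout
SEQUENCE_BITS = 12  # assumed layout


def make_id(ts_ms: int, machine_id: int, sequence: int) -> int:
    """Pack timestamp, machine ID, and sequence into one integer."""
    return (
        (ts_ms << (MACHINE_BITS + SEQUENCE_BITS))
        | (machine_id << SEQUENCE_BITS)
        | sequence
    )


def shard_counts(ids, shards: int) -> Counter:
    """Count how many IDs land on each shard under modulo partitioning."""
    return Counter(i % shards for i in ids)


# Low traffic: at most one ID per millisecond, so the sequence is always 0
# and the low bits that the modulo looks at never change.
plain = [make_id(ts, 3, 0) for ts in range(10_000)]

# Variant: pick a random sequence value for each millisecond.
rng = random.Random(42)
jittered = [
    make_id(ts, 3, rng.randrange(1 << SEQUENCE_BITS)) for ts in range(10_000)
]

print(shard_counts(plain, 16))          # every ID lands on shard 0
print(len(shard_counts(jittered, 16)))  # number of shards that received IDs
```

A real generator would still need to keep the sequence unique within a millisecond, for example by counting upward from the random starting value.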

In conclusion, the author recommends carefully assessing the necessity of replacing auto‑increment keys and, when needed, adopting lightweight encoding or distributed ID schemes like binary‑UUID or Snowflake to balance security, performance, and operational cost.
