How to Build a High‑Performance URL Shortening Service: Architecture & Algorithms
This article explores the design of a high‑performance short‑link system, covering its benefits, redirection mechanics, hash‑based and auto‑increment generation methods, database schema, Bloom filter optimization, and a scalable OpenResty‑based architecture for handling massive traffic.
Introduction
We discuss how to design a high‑performance short‑link system, a topic often used in technical interviews, and share insights from a production service that has been running stably for over two years.
Why Use Short Links?
Short links save characters on platforms with length limits (e.g., Weibo, SMS), reduce costs, improve QR‑code readability, and ensure proper hyperlink detection on services like DingTalk.
Long URLs can be unwieldy and may not be fully recognized as clickable links.
Basic Redirection Principle
When a short URL is accessed, the server returns a 302 temporary redirect with the long URL in the Location header; the browser then follows the redirect to retrieve the final content.
302 is preferred over 301 because it allows the server to count each click, which is important for analytics.
Short‑Link Generation Methods
1. Hash‑Based Algorithm
By applying a non‑cryptographic hash (e.g., MurmurHash) to the long URL, we obtain a numeric value that can be encoded in base‑62 to produce a short string.
MurmurHash offers high speed and low collision probability; using its 32‑bit output (≈4.3 billion values) is sufficient for most services.
Converting the 32‑bit decimal value to base‑62 reduces its length dramatically (e.g., 3002604296 → 3hcCxy).
To handle hash collisions, the mapping between short and long URLs is stored in a database (MySQL example below).
<code>CREATE TABLE `short_url_map` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`lurl` varchar(160) DEFAULT NULL COMMENT '长地址',
`surl` varchar(10) DEFAULT NULL COMMENT '短地址',
`gmt_create` int(11) DEFAULT NULL COMMENT '创建时间',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;</code>Insertion workflow: hash the long URL, check if the short code exists, insert if not, otherwise re‑hash with a custom suffix until a unique code is found.
Bloom filters can pre‑filter existence checks, reducing database load; a 1‑billion‑entry filter consumes only ~125 MB of memory.
2. Auto‑Increment ID Algorithm
An incremental ID generator (e.g., MySQL AUTO_INCREMENT) provides a unique integer that is then encoded in base‑62 to form the short suffix.
Four common ID sources are discussed:
UUID (high entropy but long and unordered, may cause index fragmentation)
Redis (fast, can be sharded, but requires persistence and cost considerations)
Snowflake (time‑based, vulnerable to clock rollback)
MySQL auto‑increment (simple, easy to scale with partitioning)
To avoid a single point of contention, a “pre‑allocation” table reserves ID ranges for each server (e.g., each range contains 1 000 IDs).
When a server exhausts its range, it requests a new block from the allocation table.
For duplicate long URLs, a lookup by MD5 hash (indexed) can prevent multiple short codes for the same target.
<code>CREATE TABLE `short_url_map` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT '短链 id',
`lurl` varchar(10) DEFAULT NULL COMMENT '长链',
`md5` char(32) DEFAULT NULL COMMENT '长链md5',
`gmt_create` int(11) DEFAULT NULL COMMENT '创建时间',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;</code>High‑Performance Architecture
To handle spikes of hundreds of thousands of QPS during events, the service uses OpenResty (Nginx + Lua) which provides non‑blocking I/O, built‑in caching, and direct integration with Redis and MySQL, eliminating an extra application layer.
Conclusion
The article presents several design options for a short‑link service, including hash‑based and auto‑increment approaches, database schema considerations, Bloom filter optimization, and a scalable OpenResty deployment, encouraging readers to explore the mentioned technologies further.
macrozheng
Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.