How to Build a High‑Performance Short URL Service: Redirects, Generation Strategies, and Scaling

This article explains the core functions of a short‑URL system, compares permanent (301) and temporary (302) redirects, details four generation methods (hashing, auto‑increment IDs, random strings, and pre‑generation), and outlines a high‑concurrency architecture built on sharding, caching, and Snowflake IDs.

Senior Tony

Core Functions of a Short URL System

A short‑URL service converts long links into compact strings to improve readability and shareability in notifications, ads, and social media. It provides two main capabilities: redirecting short links to their original URLs and generating short codes for new long URLs.

Redirect Implementation

Redirects can be performed using HTTP 301 (permanent) or HTTP 302 (temporary). Browsers cache the target of a 301 redirect, so repeat visits from the same browser go straight to the original URL without contacting the short‑URL server, which reduces load. A 302 redirect is not cached by default, so each visit incurs a round trip to the server.

Status Code: 301 Moved Permanently
Location: https://origin.com/long-url?param=value
[Figure: Redirect flow diagram]
Status Code: 302 Found
Location: https://origin.com/long-url?param=value
[Figure: 301 vs 302 comparison]

In high‑concurrency scenarios, 301 is preferred to minimize server requests, while 302 is suitable when request counting or temporary redirection is required.
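The choice between the two status codes can be sketched with the JDK's built‑in `com.sun.net.httpserver` package. This is a minimal illustration, not the article's implementation: the class name `RedirectSketch`, the in‑memory `MAPPINGS` table, and the `countClicks` flag are all assumptions made for the example.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class RedirectSketch {
    // Hypothetical in-memory mapping; a real service would consult a cache or database.
    static final Map<String, String> MAPPINGS = new ConcurrentHashMap<>();

    // 301 lets browsers cache the target; 302 keeps every click visible to the server.
    static int statusFor(boolean countClicks) {
        return countClicks ? 302 : 301;
    }

    // Extracts the short code from a request path such as "/3a6bd3".
    static String codeFromPath(String path) {
        return path.startsWith("/") ? path.substring(1) : path;
    }

    // Answers one request with a redirect, or 404 if the code is unknown.
    static void handle(HttpExchange exchange, boolean countClicks) throws IOException {
        String target = MAPPINGS.get(codeFromPath(exchange.getRequestURI().getPath()));
        if (target == null) {
            exchange.sendResponseHeaders(404, -1); // -1: no response body
        } else {
            exchange.getResponseHeaders().set("Location", target);
            exchange.sendResponseHeaders(statusFor(countClicks), -1);
        }
        exchange.close();
    }

    // Builds (but does not start) a server that treats every path as a short code.
    static HttpServer newServer(int port, boolean countClicks) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", exchange -> handle(exchange, countClicks));
        return server;
    }
}
```

To try it, add an entry to `MAPPINGS` and call `RedirectSketch.newServer(8080, false).start()`. Passing `true` switches to 302, so every click reaches the server and can be counted.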

Short URL Generation Methods

The service can generate short codes using four common schemes:

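Long‑URL hash : Compute a hash of the original URL (e.g., MD5, SHA‑1, or MurmurHash), convert it to a base‑62 string, and take the first six characters (e.g., "3a6bd3"). Six base‑62 characters give 62^6, or about 56.8 billion, possible codes, which is sufficient for most use cases. Because different URLs can map to the same code, each new code must be checked against existing mappings, and a collision resolved by rehashing with added randomness (for example, a salt appended to the URL).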

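Auto‑increment sequence : Use a Snowflake algorithm or a database auto‑increment ID to obtain a unique numeric ID, then encode it in base‑62. Snowflake generates IDs without a database round trip, but it depends on the system clock, so clock rollback must be handled, and it is usually deployed as a distributed ID service in which each node has a unique worker ID. Database IDs are simpler and do not depend on clocks, but every new code costs a database round trip, and a single sequence can become a write bottleneck.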

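Random string : Generate a UUID and take the first eight characters, or generate a random number and encode it in base‑62. This approach is easy to implement with standard libraries, but eight hexadecimal characters give only 16^8 (about 4.3 billion) combinations, so each new code must still be checked for duplicates.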

Pre‑generation : Create a pool of short codes in advance and store them in a database. When a new short link is needed, assign an unused code from the pool, eliminating on‑the‑fly collisions.
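The hash scheme in the list above can be sketched as follows. This is an illustration under stated assumptions: it uses MD5, keeps the six low‑order base‑62 digits of the digest (a stand‑in for "the first six characters"), and resolves collisions with a deterministic salt counter rather than true randomness. The class name `HashShortCode` and the in‑memory `STORE` map are hypothetical.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

final class HashShortCode {
    static final String ALPHABET =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static final BigInteger BASE = BigInteger.valueOf(62);

    // Hypothetical store (code -> long URL); a real service would use a database.
    static final Map<String, String> STORE = new HashMap<>();

    // Hashes the input with MD5 and returns six base-62 digits of the digest.
    static String hashToCode(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                .digest(input.getBytes(StandardCharsets.UTF_8));
            BigInteger value = new BigInteger(1, digest); // read digest as unsigned
            StringBuilder code = new StringBuilder();
            while (code.length() < 6) {
                BigInteger[] quotientAndRemainder = value.divideAndRemainder(BASE);
                code.append(ALPHABET.charAt(quotientAndRemainder[1].intValue()));
                value = quotientAndRemainder[0];
            }
            return code.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is unavailable", e);
        }
    }

    // Returns a code for the URL, rehashing with a salt while the code is taken
    // by a different URL. The same URL always gets its existing code back.
    static String shorten(String longUrl) {
        String candidate = hashToCode(longUrl);
        int salt = 0;
        while (STORE.containsKey(candidate) && !STORE.get(candidate).equals(longUrl)) {
            salt++;
            candidate = hashToCode(longUrl + "#" + salt);
        }
        STORE.put(candidate, longUrl);
        return candidate;
    }

    public static void main(String[] args) {
        System.out.println(shorten("https://origin.com/long-url?param=value"));
    }
}
```

The lookup before each insert is the cost of this scheme: every new link needs at least one read against the mapping store to confirm the code is free.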

import java.util.UUID;
public class GenerateUUID {
    public static void main(String[] args) {
        // Generate a random UUID
        UUID uuid = UUID.randomUUID();
        // Convert to string
        String uuidString = uuid.toString();
        // Take the first eight characters
        String firstEightChars = uuidString.substring(0, 8);
        // Output the short code
        System.out.println("First eight characters: " + firstEightChars);
    }
}
[Figure: Hash generation example]
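Several of the schemes above end with a base‑62 encoding step. The sketch below shows one way to encode and decode a numeric ID. The digit order (0‑9, a‑z, A‑Z) is an assumption, and the `AtomicLong` counter is only a stand‑in for a database auto‑increment column or a distributed ID service.

```java
import java.util.concurrent.atomic.AtomicLong;

final class SequenceShortCode {
    static final String ALPHABET =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Stand-in for a database auto-increment ID or a Snowflake-style generator.
    static final AtomicLong SEQUENCE = new AtomicLong(0);

    // Encodes a non-negative ID as a base-62 string, most significant digit first.
    static String encode(long id) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be non-negative");
        }
        if (id == 0) {
            return "0";
        }
        StringBuilder code = new StringBuilder();
        while (id > 0) {
            code.append(ALPHABET.charAt((int) (id % 62)));
            id /= 62;
        }
        return code.reverse().toString();
    }

    // Decodes a base-62 string back to the numeric ID.
    static long decode(String code) {
        long id = 0;
        for (char c : code.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) {
                throw new IllegalArgumentException("invalid character: " + c);
            }
            id = id * 62 + digit;
        }
        return id;
    }

    // Takes the next ID from the sequence and returns its short code.
    static String nextCode() {
        return encode(SEQUENCE.incrementAndGet());
    }

    public static void main(String[] args) {
        System.out.println(encode(62));   // prints "10"
        System.out.println(decode("10")); // prints 62
    }
}
```

Because the encoding is reversible, the short code can be decoded back to the numeric ID and used directly as a lookup key. The trade‑off is that sequential IDs produce guessable codes.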

Designing for High Concurrency

Large e‑commerce and short‑video platforms generate billions of short links daily, with peak QPS in the millions. To sustain such load, the architecture includes:

Prefer HTTP 301 redirects to reduce repeat traffic.

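Use the Snowflake algorithm, which can generate up to 4,096,000 unique IDs per second per node (4,096 per millisecond with the standard 12‑bit sequence), then encode the IDs in base‑62. Because the IDs are unique by construction, no duplicate check is needed.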

Employ a Redis + Caffeine double‑cache: Redis holds hot short links centrally, while Caffeine provides fast in‑process caching.

Shard data by short‑code hash: dbId = shortCodeHash % dbCount and tableId = (shortCodeHash / dbCount) % tableCount, distributing writes across many databases and tables.

Archive cold mapping data (e.g., after three months) to reduce storage pressure.
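The Snowflake step in the list above can be sketched as follows, assuming the common layout of a 41‑bit millisecond timestamp, a 10‑bit worker ID, and a 12‑bit sequence. The class name `SnowflakeSketch` and the custom epoch are assumptions for illustration. Production generators also need a policy for clock rollback, which this sketch simply rejects.

```java
final class SnowflakeSketch {
    static final long EPOCH = 1704067200000L; // assumed custom epoch: 2024-01-01T00:00:00Z
    static final int WORKER_BITS = 10;
    static final int SEQUENCE_BITS = 12;
    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1; // 4095

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long workerId) {
        if (workerId < 0 || workerId >= (1L << WORKER_BITS)) {
            throw new IllegalArgumentException("workerId must be in [0, 1023]");
        }
        this.workerId = workerId;
    }

    // Returns the next ID: timestamp | worker ID | per-millisecond sequence.
    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            // The algorithm relies on a monotonic clock; this sketch refuses to continue.
            throw new IllegalStateException(
                "Clock moved backwards by " + (lastTimestamp - now) + " ms");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // 4,096 IDs already issued in this millisecond: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastTimestamp) {
                    Thread.onSpinWait();
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << (WORKER_BITS + SEQUENCE_BITS))
            | (workerId << SEQUENCE_BITS)
            | sequence;
    }

    public static void main(String[] args) {
        SnowflakeSketch generator = new SnowflakeSketch(1);
        System.out.println(generator.nextId());
    }
}
```

The 12‑bit sequence is where the 4,096‑per‑millisecond ceiling comes from. Each node must be given a distinct worker ID, otherwise two nodes can emit the same value.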

[Figure: High‑concurrency architecture diagram]
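The read path through the two cache layers and the sharding formulas can be sketched together. In this illustration plain maps stand in for Caffeine and Redis, the `database` function is a hypothetical loader, and `String.hashCode` is an assumed choice for `shortCodeHash`. None of these come from the article.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class LookupSketch {
    // Stand-ins: a real deployment would use Caffeine (in-process) and Redis (shared).
    final Map<String, String> localCache = new ConcurrentHashMap<>();
    final Map<String, String> sharedCache = new ConcurrentHashMap<>();
    final Function<String, String> database; // hypothetical loader; returns null if unknown
    final int dbCount;
    final int tableCount;

    LookupSketch(int dbCount, int tableCount, Function<String, String> database) {
        this.dbCount = dbCount;
        this.tableCount = tableCount;
        this.database = database;
    }

    // Non-negative hash of the short code; any stable hash function would do.
    static int hashOf(String shortCode) {
        return shortCode.hashCode() & Integer.MAX_VALUE;
    }

    // dbId = shortCodeHash % dbCount
    int dbId(String shortCode) {
        return hashOf(shortCode) % dbCount;
    }

    // tableId = (shortCodeHash / dbCount) % tableCount
    int tableId(String shortCode) {
        return (hashOf(shortCode) / dbCount) % tableCount;
    }

    // Checks the in-process cache, then the shared cache, then the sharded store,
    // filling the faster layers on the way back.
    Optional<String> resolve(String shortCode) {
        String url = localCache.get(shortCode);
        if (url == null) {
            url = sharedCache.get(shortCode);
            if (url == null) {
                // In practice this query would target the shard at dbId/tableId.
                url = database.apply(shortCode);
                if (url != null) {
                    sharedCache.put(shortCode, url);
                }
            }
            if (url != null) {
                localCache.put(shortCode, url);
            }
        }
        return Optional.ofNullable(url);
    }
}
```

Dividing by `dbCount` before taking the table modulus keeps the table index from simply mirroring the database index when the two counts share factors. The sketch omits cache expiry and negative caching for unknown codes, which a real deployment would need.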

By combining efficient redirect handling, collision‑free ID generation, layered caching, and sharding, the short‑URL service can achieve high throughput and low latency even under extreme traffic spikes.
