Backend Development · 11 min read

How to Build a Million‑QPS Short‑URL Service: Architecture & Code

This article walks through the challenges of serving millions of requests per second from a short‑URL service and presents a complete backend solution: ID generation, Base62 encoding, cache‑layer design, Nginx redirect optimization, disaster‑recovery strategies, sharding, and performance test results, with Java code snippets and design principles for high‑throughput, resilient systems.

Su San Talks Tech

Introduction

At 2 a.m. the monitoring screen turned red: the short‑URL service's QPS had surged past 800,000, the database connection pool was exhausted, and Redis latency exceeded 500 ms. This is a real scenario from a major e‑commerce promotion.

Core Challenges

When millions of requests per second hit a short‑URL service, three critical problems appear:

ID generation bottleneck: traditional auto‑increment IDs cannot keep up.

Redirect performance black hole: the TCP connection cost of serving 302 redirects.

Cache avalanche risk: hot short URLs can overwhelm Redis.

Short‑URL Generation

Segment ID Generator (Java)

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

public class SegmentIDGen {
    private static final long STEP = 10_000;          // IDs allocated per segment
    private final AtomicLong currentId = new AtomicLong(0);
    private volatile long maxId;
    private final ExecutorService loader = Executors.newSingleThreadExecutor();

    public void init() {
        loadSegment();
        loader.submit(this::daemonLoad);
    }

    private synchronized void loadSegment() {
        // SELECT max_id FROM alloc WHERE biz_tag = 'short_url' FOR UPDATE;
        long dbMaxId = db.fetchMaxId("short_url");    // DAO call, elided here
        this.maxId = dbMaxId + STEP;                  // allocate 10k IDs
        currentId.set(dbMaxId);
        db.updateMaxId("short_url", this.maxId);      // persist the new high-water mark
    }

    private void daemonLoad() {
        while (!Thread.currentThread().isInterrupted()) {
            // preload the next segment once 80% of the current one is used
            if (currentId.get() >= maxId - STEP / 5) {
                loadSegment();
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                return;
            }
        }
    }

    public long nextId() {
        long id = currentId.incrementAndGet();        // increment first to avoid a check-then-act race
        if (id > maxId) throw new BusyException();
        return id;
    }
}

Key optimizations:

Double‑buffer asynchronous loading to avoid blocking.

Monitor segment usage and adjust the step size dynamically.

Instance‑level segment isolation via biz_tag.
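The double‑buffer point deserves a sketch: the generator above refills a single segment in place, while a double‑buffer design keeps a standby segment preloaded so the swap never waits on the database. The following is a minimal single‑process sketch, not the production design; the in‑memory allocate() stands in for the alloc table, and in production the preload at 80% usage would run on a background thread rather than inline.

```java
public class DoubleBufferIDGen {
    static final long STEP = 10_000;       // IDs per segment
    private long nextSegmentStart = 0;     // stand-in for the alloc table's max_id
    private long current, currentMax;      // active segment: [current, currentMax)
    private long standbyStart = -1;        // preloaded next segment; -1 = none

    public DoubleBufferIDGen() {
        current = allocate();
        currentMax = current + STEP;
    }

    // simulates UPDATE alloc SET max_id = max_id + STEP
    private long allocate() {
        long s = nextSegmentStart;
        nextSegmentStart += STEP;
        return s;
    }

    public synchronized long nextId() {
        if (current >= currentMax) {       // active segment exhausted: swap buffers
            if (standbyStart < 0) standbyStart = allocate(); // fallback; normally preloaded
            current = standbyStart;
            currentMax = current + STEP;
            standbyStart = -1;
        }
        long id = current++;
        // preload the standby segment once 80% of the current one is consumed
        if (standbyStart < 0 && id >= currentMax - STEP / 5) {
            standbyStart = allocate();     // production: submit to a background executor
        }
        return id;
    }
}
```

Because the standby segment is already in memory when the active one runs out, callers never observe the database round trip, only a pointer swap.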

Short‑Code Mapping (Base62)

Long numeric IDs are encoded as strings over a 62‑character alphabet (0–9, A–Z, a–z). For example:

2,000,000,000 = 2×62⁵ + 11×62⁴ + 21×62³ + 49×62² + 22×62 + 32 = "2BLnMW"

Java encoder:

public class Base62Encoder {
    private static final String BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    public static String encode(long id) {
        if (id == 0) return "0";                  // guard: the loop below never runs for 0
        StringBuilder sb = new StringBuilder();
        while (id > 0) {
            sb.append(BASE62.charAt((int) (id % 62)));
            id /= 62;
        }
        return sb.reverse().toString();           // digits were emitted least-significant first
    }

    public static void main(String[] args) {
        long id = 1_000_000_000L;
        System.out.println(encode(id)); // 15ftgG
    }
}
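The redirect path needs the reverse mapping as well. A decoder mirroring the encoder above is straightforward: Horner's rule over the same alphabet.

```java
public class Base62Decoder {
    private static final String BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    public static long decode(String code) {
        long id = 0;
        for (char c : code.toCharArray()) {
            int digit = BASE62.indexOf(c);        // position in the alphabet = digit value
            if (digit < 0) throw new IllegalArgumentException("invalid character: " + c);
            id = id * 62 + digit;                 // Horner's rule
        }
        return id;
    }

    public static void main(String[] args) {
        System.out.println(decode("2BLnMW")); // 2000000000
    }
}
```

Round‑tripping encode/decode in a unit test is a cheap way to catch alphabet‑ordering bugs, which otherwise surface only as broken redirects.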

Six‑character codes provide 62⁶ ≈ 5.68 × 10¹⁰ combinations; eight characters reach 62⁸ ≈ 2.18 × 10¹⁴.
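These capacity figures are easy to verify, and the same loop answers the question for any other code length:

```java
public class Base62Capacity {
    public static void main(String[] args) {
        long six = 1, eight = 1;
        for (int i = 0; i < 6; i++) six *= 62;    // 62^6
        for (int i = 0; i < 8; i++) eight *= 62;  // 62^8
        System.out.println(six);    // 56800235584      ≈ 5.68 × 10^10
        System.out.println(eight);  // 218340105584896  ≈ 2.18 × 10^14
    }
}
```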

Storage Architecture

The data model separates the mapping, cache, and persistence layers.

Data model diagram

Cache Layer

Cache hierarchy

Cache‑breakdown protection combines a Bloom filter, a distributed lock, a double cache check, and back‑off retries.

public String getLongUrl(String shortCode) throws InterruptedException {
    // 1. Bloom filter pre-check: codes that definitely don't exist never hit Redis or the DB
    if (!bloomFilter.mightContain(shortCode)) return null;

    // 2. Redis lookup
    String cacheKey = "url:" + shortCode;
    String longUrl = redis.get(cacheKey);
    if (longUrl != null) return longUrl;

    // 3. Distributed lock (SET NX with a 10 s expiry)
    String lockKey = "lock:" + shortCode;
    if (redis.setnx(lockKey, "1", 10)) {
        try {
            // 4. Double-check the cache after acquiring the lock
            longUrl = redis.get(cacheKey);
            if (longUrl != null) return longUrl;

            // 5. Query the DB and backfill the cache
            longUrl = db.queryLongUrl(shortCode);
            if (longUrl != null) redis.setex(cacheKey, 3600, longUrl);
            return longUrl;
        } finally {
            redis.del(lockKey);
        }
    } else {
        // 6. Another instance holds the lock: back off briefly, then retry
        Thread.sleep(50);
        return getLongUrl(shortCode);
    }
}

Redirect Optimization

After a Redis hit, Nginx can return the 302 directly and bypass the backend entirely. The configuration below is illustrative: it assumes a Redis‑aware Nginx module, and in practice this pattern is commonly built with OpenResty and lua‑resty‑redis.

server {
    listen 80;
    server_name s.domain.com;

    location ~ ^/([a-zA-Z0-9]{6,8})$ {
        set $short_code $1;
        redis_pass redis_cluster;
        redis_query GET url:$short_code;
        if ($redis_value != "") {
            add_header Cache-Control "private, max-age=86400";
            return 302 $redis_value;
        }
        proxy_pass http://backend;
    }
}

Connection‑pool optimization with Netty reduces TCP handshake overhead.

public class HttpConnectionPool {
    private final EventLoopGroup group = new NioEventLoopGroup();
    private final Bootstrap bootstrap = new Bootstrap();

    public HttpConnectionPool() {
        bootstrap.group(group)
                 .channel(NioSocketChannel.class)
                 .option(ChannelOption.SO_KEEPALIVE, true)
                 .handler(new HttpClientInitializer());
    }

    public Channel getChannel(String host, int port) throws InterruptedException {
        // Simplified: this opens a new connection per call. A real pool would
        // reuse channels per host, e.g. via Netty's FixedChannelPool.
        return bootstrap.connect(host, port).sync().channel();
    }
}

Disaster Recovery

Rate‑limit & Circuit‑breaker (Sentinel)

@GetMapping("/{shortCode}")
@SentinelResource(value = "redirectService",
                  fallback = "fallbackRedirect",
                  blockHandler = "blockRedirect")
public ResponseEntity<String> redirect(@PathVariable String shortCode) {
    // redirect logic …
}

// fallback: same parameters as the guarded method, plus an optional Throwable
public ResponseEntity<String> fallbackRedirect(String shortCode, Throwable ex) {
    return ResponseEntity.status(503).body("Service temporarily unavailable");
}

// blockHandler: same parameters, plus a trailing BlockException
public ResponseEntity<String> blockRedirect(String shortCode, BlockException ex) {
    return ResponseEntity.status(429).body("Too many requests");
}

Sharding Strategy

public int determineDbShard(String shortCode) {
    // route by the first character of the code
    int ascii = shortCode.charAt(0);
    return ascii % 16; // 16 databases
}

public int determineTableShard(String shortCode) {
    CRC32 crc32 = new CRC32();
    crc32.update(shortCode.getBytes(StandardCharsets.UTF_8)); // explicit charset
    return (int) (crc32.getValue() % 1024); // 1024 tables per DB
}
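Putting both routing functions into one runnable sketch (the sample codes below are arbitrary) shows the two‑level lookup; a given code always lands on the same database/table pair:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class ShardRouterDemo {
    static int dbShard(String shortCode) {
        return shortCode.charAt(0) % 16;          // 16 databases
    }

    static int tableShard(String shortCode) {
        CRC32 crc32 = new CRC32();
        crc32.update(shortCode.getBytes(StandardCharsets.UTF_8));
        return (int) (crc32.getValue() % 1024);   // 1024 tables per DB
    }

    public static void main(String[] args) {
        // deterministic routing: re-running yields the same pairs
        for (String code : new String[] {"2BLnMW", "15ftgG", "aZ9x0Q"}) {
            System.out.println(code + " -> db " + dbShard(code)
                    + ", table " + tableShard(code));
        }
    }
}
```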

Performance Test Results

Optimization              QPS         Gain
Baseline                  12,000      1×
+ Redis cache             120,000     10×
+ Nginx direct redirect   350,000     2.9×
+ Connection pool         780,000     2.2×
+ Bloom filter            1,200,000   1.5×

Test environment: 10 × 32‑core 64 GB servers, 1 Gbps network.

Conclusion

The architecture hinges on four principles:

Stateless design: the redirect service holds no per‑request state, so it scales horizontally.

Read‑heavy optimization: push read performance to the limit with layered caching.

Divide and conquer: data sharding spreads traffic across databases and tables.

Graceful degradation: prefer partial degradation over a total outage.

Design principles diagram
Tags: Java · performance optimization · system architecture · Redis · high concurrency · short URL
Written by Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
