
Design and Implementation of a High‑Performance URL Shortening Platform

This article details the architecture, core algorithms, security measures, and performance optimizations of a URL shortener platform, covering hash functions, distributed ID generation, Base62 encoding, caching, database indexing, sharding, and monitoring to achieve efficient and secure link redirection.

Zhuanzhuan Tech

1 Background

Zhuanzhuan is a leading second‑hand trading platform in China, where links are essential for user interaction and information exchange.

2 Working Principle

2.1 Short‑Link Generation and Storage

When a long URL is received, the platform first checks for an existing mapping using an MD5 hash; if none exists, it generates a unique ID via a segment‑allocation mode, encodes it with Base62, and persists the mapping for later lookup.
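
The generation flow above can be sketched roughly as follows. This is a minimal in-memory illustration, not the platform's implementation: the class name `ShortLinkService`, the two maps (standing in for database tables), and the `AtomicLong` (standing in for the segment-mode ID allocator) are all assumptions made for the example.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: maps stand in for the database, AtomicLong for the ID allocator.
class ShortLinkService {
    private static final String ALPHABET =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    private final Map<String, String> codeByMd5 = new ConcurrentHashMap<>(); // MD5 -> short code
    private final Map<String, String> urlByCode = new ConcurrentHashMap<>(); // short code -> long URL
    private final AtomicLong nextId = new AtomicLong(1);

    String shorten(String longUrl) {
        // computeIfAbsent makes "check for an existing mapping, else create" atomic per MD5 key.
        return codeByMd5.computeIfAbsent(md5Hex(longUrl), md5 -> {
            String code = base62(nextId.getAndIncrement());
            urlByCode.put(code, longUrl);
            return code;
        });
    }

    // Returns the long URL for a short code, or null if the code is unknown.
    String resolve(String code) {
        return urlByCode.get(code);
    }

    private static String md5Hex(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    private static String base62(long num) {
        StringBuilder sb = new StringBuilder();
        do {
            sb.append(ALPHABET.charAt((int) (num % 62)));
            num /= 62;
        } while (num != 0);
        return sb.reverse().toString();
    }
}
```

Calling `shorten` twice with the same long URL returns the same code in this sketch, which mirrors the idempotent behavior described for the platform.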

2.2 Short‑Link Return and Distribution

The generated short link is returned to the business side, which can embed it in webpages, SMS, or social media for user access.

2.3 User Click and Redirection

When a user clicks a short link, the platform looks up the short link, retrieves the original long URL, and redirects the user to it. Because this lookup sits on the path of every click, both retrieval and redirection need to be fast.

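Browsers may cache an HTTP 301 (permanent) redirect and skip the short‑link service on later clicks, which makes click statistics inaccurate. A 302 (temporary) redirect is not cached by default, so every click reaches the service: statistics stay accurate, at the cost of higher load.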
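
As a rough illustration of this trade-off, the sketch below picks the status code and headers for a redirect. The class `RedirectResponse`, the `countEveryClick` flag, and the explicit `Cache-Control: no-store` header are assumptions for the example, not details taken from the platform.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical value object describing the redirect to send back.
class RedirectResponse {
    final int status;
    final Map<String, String> headers;

    private RedirectResponse(int status, Map<String, String> headers) {
        this.status = status;
        this.headers = headers;
    }

    static RedirectResponse forTarget(String longUrl, boolean countEveryClick) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Location", longUrl);
        if (countEveryClick) {
            // 302 is not cached by default; no-store makes that intent explicit.
            headers.put("Cache-Control", "no-store");
            return new RedirectResponse(302, headers);
        }
        // 301 may be cached, so repeat clicks can bypass the service entirely.
        return new RedirectResponse(301, headers);
    }
}
```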

3 Core Algorithms

3.1 Hash Algorithms

3.1.1 MD5

MD5 produces a 128‑bit hash used as a basic fingerprint for long URLs.

3.1.2 SHA‑256

SHA‑256 offers stronger security but yields longer hashes, affecting short‑link length.
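
To make the length difference concrete, the following sketch computes both digests with the JDK's `MessageDigest`. The class and method names (`UrlFingerprint`, `hexDigest`) are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class UrlFingerprint {
    // Returns the lowercase hex digest of the input for the given JDK algorithm name.
    static String hexDigest(String algorithm, String input) {
        try {
            byte[] digest = MessageDigest.getInstance(algorithm)
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(algorithm + " is not available", e);
        }
    }
}
```

`hexDigest("MD5", url)` yields 32 hex characters (128 bits), while `hexDigest("SHA-256", url)` yields 64 (256 bits). Either is far longer than a typical short code, which is one reason the hash serves as a fingerprint rather than as the short code itself.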

3.2 Distributed ID

Hashes can collide and are too long to serve as short codes directly, so the platform generates a unique numeric ID for each link instead.

3.2.1 Global Auto‑Increment

Auto‑increment IDs (e.g., MySQL primary key or Redis INCR) provide a simple, efficient way to generate unique IDs.

3.2.2 Segment Mode

Each node receives a range of IDs; the node increments locally until the segment is exhausted, then requests a new segment, ensuring global uniqueness.
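
A minimal sketch of segment mode is shown below. `SegmentStore` is a hypothetical in-memory stand-in for the shared store (for example, a database row updated atomically) that hands out ranges; the names and the starting ID of 1 are assumptions.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the shared store that hands out ID ranges.
class SegmentStore {
    private final AtomicLong nextStart = new AtomicLong(1);

    // Atomically reserves [start, start + size) and returns start.
    long reserve(int size) {
        return nextStart.getAndAdd(size);
    }
}

// One instance per node: increments locally, refills from the store when exhausted.
class SegmentIdGenerator {
    private final SegmentStore store;
    private final int segmentSize;
    private long current = 0; // next ID to hand out
    private long end = 0;     // exclusive upper bound of the local segment

    SegmentIdGenerator(SegmentStore store, int segmentSize) {
        this.store = store;
        this.segmentSize = segmentSize;
    }

    synchronized long nextId() {
        if (current >= end) {
            current = store.reserve(segmentSize);
            end = current + segmentSize;
        }
        return current++;
    }
}
```

Two generators sharing one store never overlap, because each range is reserved exactly once. The price is that IDs are no longer globally ordered across nodes.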

3.2.3 SnowFlake

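SnowFlake splits a 64‑bit integer into a timestamp, a data‑center ID, a machine ID, and a sequence number. IDs are unique and roughly time‑ordered, but the scheme depends on the system clock: if the clock rolls back, a node can reissue IDs it has already handed out unless the rollback is detected.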
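
The sketch below follows the commonly cited layout of 41 timestamp bits, 5 data-center bits, 5 machine bits, and 12 sequence bits. The epoch constant is an arbitrary assumption, and refusing to issue IDs on clock rollback is just one possible policy, not necessarily what the platform does.

```java
class SnowflakeIdGenerator {
    private static final long EPOCH = 1704067200000L; // arbitrary custom epoch in ms (assumption)
    private static final int MACHINE_BITS = 5;
    private static final int DATACENTER_BITS = 5;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    private final long datacenterId;
    private final long machineId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeIdGenerator(long datacenterId, long machineId) {
        if (datacenterId < 0 || datacenterId >= (1L << DATACENTER_BITS)
                || machineId < 0 || machineId >= (1L << MACHINE_BITS)) {
            throw new IllegalArgumentException("datacenterId and machineId must be in [0, 31]");
        }
        this.datacenterId = datacenterId;
        this.machineId = machineId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            // Clock rollback: issuing an ID now could duplicate an earlier one.
            throw new IllegalStateException(
                    "Clock moved backwards by " + (lastTimestamp - now) + " ms");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Sequence exhausted within this millisecond: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastTimestamp) {
                    Thread.onSpinWait();
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << (DATACENTER_BITS + MACHINE_BITS + SEQUENCE_BITS))
                | (datacenterId << (MACHINE_BITS + SEQUENCE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }
}
```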

3.3 Base62 Encoding

Base62 uses 62 characters (0‑9, A‑Z, a‑z) to produce compact, readable strings; a 6‑character Base62 string can represent 62^6 = 56,800,235,584 (about 56.8 billion) distinct values.

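public class Base62Encoder {
    // Digits, then uppercase, then lowercase; a character's position is its value.
    private static final String BASE62_CHARACTERS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    public static String encode(long num) {
        // A negative input would produce a negative remainder and an invalid index.
        if (num < 0) {
            throw new IllegalArgumentException("num must be non-negative: " + num);
        }
        StringBuilder sb = new StringBuilder();
        do {
            int remainder = (int) (num % 62);
            sb.append(BASE62_CHARACTERS.charAt(remainder));
            num /= 62;
        } while (num != 0);
        // Digits were appended least-significant first, so reverse them.
        return sb.reverse().toString();
    }
}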
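
Redirection needs the reverse direction, turning a short code back into the numeric ID. The article does not show a decoder, so the following is a sketch under the assumption that it uses the same alphabet as the encoder above; `Base62Decoder` is a hypothetical name.

```java
class Base62Decoder {
    private static final String ALPHABET =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    static long decode(String code) {
        if (code == null || code.isEmpty()) {
            throw new IllegalArgumentException("code must not be empty");
        }
        long value = 0;
        for (int i = 0; i < code.length(); i++) {
            int digit = ALPHABET.indexOf(code.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException("not a Base62 character: " + code.charAt(i));
            }
            // The exact-arithmetic helpers throw ArithmeticException instead of overflowing silently.
            value = Math.addExact(Math.multiplyExact(value, 62L), digit);
        }
        return value;
    }
}
```

Rejecting characters outside the alphabet also gives a cheap first validity check before any storage lookup.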

4 Security and Protection

4.1 Long‑Link Legitimacy Validation

Before shortening, the platform validates the original URL’s domain against a whitelist and checks query‑parameter domains to prevent malicious links.
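
A host whitelist check might look roughly like the sketch below. The class name, the restriction to `http`/`https`, and the rule that subdomains of a whitelisted domain are accepted are assumptions; the query-parameter check mentioned above is not covered here.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;

class UrlWhitelistValidator {
    private final Set<String> allowedDomains; // lowercase, e.g. "example.com"

    UrlWhitelistValidator(Set<String> allowedDomains) {
        this.allowedDomains = Set.copyOf(allowedDomains);
    }

    boolean isAllowed(String url) {
        URI uri;
        try {
            uri = new URI(url);
        } catch (URISyntaxException e) {
            return false;
        }
        String scheme = uri.getScheme();
        String host = uri.getHost();
        if (scheme == null || host == null) {
            return false;
        }
        if (!scheme.equalsIgnoreCase("http") && !scheme.equalsIgnoreCase("https")) {
            return false;
        }
        String normalized = host.toLowerCase(Locale.ROOT);
        for (String domain : allowedDomains) {
            // Match the domain itself or a true subdomain; a bare suffix match
            // would wrongly accept "evilexample.com".
            if (normalized.equals(domain) || normalized.endsWith("." + domain)) {
                return true;
            }
        }
        return false;
    }
}
```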

4.2 Duplicate Short‑Link Prevention

By using the MD5 of the long URL in an idempotent design, repeated requests produce the same short link, avoiding waste and confusion.

4.3 Short‑Link Validity Verification

The service quickly checks the database to confirm whether a short link exists; if not, it returns an error response.

5 System Performance Optimization

5.1 Database Indexing

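The unique ID serves as the primary key, so validity checks and redirection resolve a short link with a single primary‑key lookup. The MD5 hash of the long URL is indexed as well, so the duplicate check during generation does not require a full table scan.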

5.2 Cache Utilization

Redis is used as a distributed cache to store short‑link mappings, reducing database load and improving response time under high concurrency.
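
The read path can be sketched with the cache-aside pattern. To keep the example self-contained, a `ConcurrentHashMap` stands in for Redis and a `Function` stands in for the database query; both, along with the class name, are assumptions.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class CachedLinkResolver {
    private final Map<String, String> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    private final Function<String, Optional<String>> database;           // hypothetical DB lookup

    CachedLinkResolver(Function<String, Optional<String>> database) {
        this.database = database;
    }

    Optional<String> resolve(String code) {
        String cached = cache.get(code);
        if (cached != null) {
            return Optional.of(cached); // cache hit: the database is not touched
        }
        Optional<String> loaded = database.apply(code);
        // Populate the cache on a miss so later clicks are served from memory.
        loaded.ifPresent(url -> cache.put(code, url));
        return loaded;
    }
}
```

In this sketch, lookups for nonexistent codes always fall through to the database. A production setup would typically also set expiry times and guard against repeated misses, which is omitted here.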

5.3 Segment‑Mode Optimization

A monitoring thread pre‑allocates new ID segments when usage crosses a threshold, preventing bottlenecks during peak traffic.
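
The idea can be illustrated with a standby segment that is reserved once usage of the current segment crosses a threshold. For determinism, this sketch reserves the standby segment inline rather than on a separate monitoring thread; the class name, the `LongSupplier` that reserves a segment, and the threshold parameter are all assumptions.

```java
import java.util.function.LongSupplier;

class PrefetchingSegmentGenerator {
    private final LongSupplier reserveSegment; // hypothetical: returns the first ID of a new segment
    private final int segmentSize;
    private final int prefetchAfter;           // IDs used before the standby segment is reserved
    private long start;                        // first ID of the current segment
    private long next;                         // next ID to hand out
    private long standbyStart = -1;            // -1 means no standby segment is ready

    PrefetchingSegmentGenerator(LongSupplier reserveSegment, int segmentSize, double threshold) {
        this.reserveSegment = reserveSegment;
        this.segmentSize = segmentSize;
        this.prefetchAfter = Math.max(1, (int) (segmentSize * threshold));
        this.start = reserveSegment.getAsLong();
        this.next = start;
    }

    synchronized long nextId() {
        if (next == start + segmentSize) {
            // Current segment exhausted: switch to the standby one without waiting,
            // or reserve on the spot if none was prepared.
            start = (standbyStart >= 0) ? standbyStart : reserveSegment.getAsLong();
            standbyStart = -1;
            next = start;
        }
        long id = next++;
        if (standbyStart < 0 && next - start >= prefetchAfter) {
            // Usage crossed the threshold: reserve the next segment ahead of time.
            standbyStart = reserveSegment.getAsLong();
        }
        return id;
    }
}
```

With a segment size of 10 and a threshold of 0.8, the standby segment is reserved when the eighth ID is issued, so the switch at the eleventh ID needs no round trip to the store.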

5.4 Table Sharding

Link records are sharded into 64 tables based on ID modulo 64, distributing load and enhancing scalability.
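
Routing an ID to its table is a one-line computation, sketched below. The table-name prefix `short_link_` and the two-digit suffix are hypothetical; only the modulo-64 rule comes from the description above.

```java
class LinkTableRouter {
    static final int TABLE_COUNT = 64;

    // Maps a link ID to the name of the table that stores it.
    static String tableFor(long id) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be non-negative: " + id);
        }
        int shard = (int) (id % TABLE_COUNT);
        return String.format("short_link_%02d", shard);
    }
}
```

Because a short code decodes to its ID, the redirect path can compute the target table directly, without a routing lookup.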

5.5 Business Monitoring

Prometheus collects metrics such as request rates for short‑link generation and retrieval, as well as security‑check statistics, providing real‑time insight for operations.

6 Conclusion

Through extensive research and practice, Zhuanzhuan’s short‑link platform delivers efficient and secure link services, and will continue to innovate to meet evolving user needs.
