
Designing a Scalable Backend for a Nationwide ID Query Service

The article outlines a simple yet scalable backend architecture that can handle 20 million daily ID queries by partitioning a billion‑record dataset across multiple 16 GB virtual machines, using direct‑index lookups, modest bandwidth, and basic redundancy mechanisms to achieve ample performance headroom.

Laravel Tech Community

Assuming all 20 million daily requests arrive within a single peak hour, the load is roughly 5,600 requests per second, well under 10,000 concurrent connections.

With a dataset of one billion records, each requiring 16 bytes (e.g., a compressed 48‑bit ID plus additional fields), the total size is about 16 GB, which fits comfortably in the memory of a 32 GB PC, let alone a server with 256 GB–1 TB of RAM.

By sharding data based on the first 3–6 digits of the ID (regional code), the workload can be distributed across multiple servers; each server stores roughly 100–200 million records, so twenty 16 GB VM instances provide ample capacity.

On startup, each server loads its own subset of records (200–400 million rows) from the database, after which the service begins handling requests.

Even in the worst-case scenario of all 20 million queries arriving within one hour, the service must handle about 5,556 requests per second; with roughly 2 KB returned per query, that is less than 12 MB/s of traffic.

The query flow is a short TCP connection: client sends the ID, server returns the data, and the connection closes.

The classic C10k problem poses no obstacle here; with epoll on Linux or IOCP on Windows, a 100 Mbps link per server is sufficient.

Searching by ID can be O(1) with a direct index; if a direct map is not used, binary search (O(log N)) requires at most ~30 comparisons for a billion entries, which is negligible compared to network latency.

Thus, twenty 16 GB VMs with 100 Mbps outbound bandwidth each can comfortably support the load, leaving significant performance surplus.

With a total of 1 Gbps bandwidth, the system could serve up to 200 million users per hour.

For reliability, additional backup VMs can be managed via Zookeeper; if one node fails, another takes over instantly.

Data updates (e.g., nucleic‑acid test results) can be propagated using database triggers that notify peripheral nodes to refresh their caches.

Overall, the solution relies on straightforward logic and a few thousand lines of C and JavaScript code, demonstrating that a seemingly massive system can be built with modest resources.

Tags: backend, distributed systems, performance, scalability, database, redundancy
Written by

Laravel Tech Community

Specializing in Laravel development, we continuously publish fresh content and grow alongside the elegant, stable Laravel framework.
