Designing a Scalable Backend for Nationwide Health Data Queries
The article outlines a simple, cost-effective backend architecture for handling up to 20 million health-status queries per hour across a billion users. It covers data storage, sharding by ID, memory requirements, load handling, and redundancy strategies, along with practical limitations and a promotional note.
The author proposes a backend system to serve nationwide health data queries (e.g., COVID test results) for up to 20 million requests per hour, focusing on simplicity and cost efficiency.
Data size is estimated at one billion records, each requiring 16 bytes, which fits into roughly 16 GB of memory; modern servers with 256 GB to 1 TB RAM are commonplace.
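The memory estimate is simple multiplication. A minimal sketch using the article's figures (one billion records, 16 bytes each); the function name is illustrative only:

```python
RECORDS = 1_000_000_000   # one billion records (figure from the article)
BYTES_PER_RECORD = 16     # per-record size (figure from the article)

def dataset_bytes(records: int, bytes_per_record: int) -> int:
    """Raw payload size, ignoring any index or allocator overhead."""
    return records * bytes_per_record

total = dataset_bytes(RECORDS, BYTES_PER_RECORD)
# Prints: 16.0 GB (decimal), 14.9 GiB (binary)
print(f"{total / 10**9:.1f} GB (decimal), {total / 2**30:.1f} GiB (binary)")
```

The figure covers raw record data only; any lookup structure or runtime overhead comes on top of it.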
Sharding partitions records by the first 3–6 digits of the ID (the regional code), spreading the load across multiple servers. Each node stores about 100–200 million records (1.6–3.2 GB at 16 bytes per record), so 20 virtual machines with 16 GB of RAM each can handle the workload.
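One way to route a query to its shard is a longest-prefix match on the regional code. The sketch below is an assumption about how such routing could look, not the author's implementation; the prefixes, shard numbers, and the `shard_for` name are made up for illustration.

```python
from typing import Optional

# Hypothetical routing table: regional prefix (3 to 6 digits) -> shard number.
ROUTES = {
    "110": 0,
    "310": 1,
    "440": 2,
    "440301": 3,  # a busy region split out onto its own shard
}

def shard_for(id_number: str) -> Optional[int]:
    """Return the shard for an ID via longest-prefix match, or None if unrouted."""
    for length in range(6, 2, -1):  # try 6, 5, 4, then 3 digits
        shard = ROUTES.get(id_number[:length])
        if shard is not None:
            return shard
    return None
```

Matching the longest prefix first lets a densely populated region be carved out of its parent prefix without renumbering the other shards.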
On startup, each node loads its local shard from the database (200–400 million rows) and then begins serving requests. The aggregate load is about 5,556 requests per second (20 million per hour divided by 3,600); at roughly 2 KB per response, that is less than 12 MB/s of outbound traffic.
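The load figures can be checked with a few lines of arithmetic. This sketch takes 2 KB as 2,048 bytes and assumes traffic is spread evenly over the 20 VMs mentioned earlier; both are assumptions, not statements from the article.

```python
REQUESTS_PER_HOUR = 20_000_000   # peak load (figure from the article)
RESPONSE_BYTES = 2 * 1024        # "roughly 2 KB" per response, taken as 2,048 bytes
VM_COUNT = 20                    # VM count from the article; even spread is assumed

rps = REQUESTS_PER_HOUR / 3600                     # aggregate requests per second
outbound_mb_per_s = rps * RESPONSE_BYTES / 10**6   # aggregate MB/s, decimal megabytes
per_vm_mbit = outbound_mb_per_s * 8 / VM_COUNT     # Mbit/s per VM

print(f"{rps:.0f} requests/s")
print(f"{outbound_mb_per_s:.2f} MB/s aggregate outbound")
print(f"{per_vm_mbit:.2f} Mbit/s per VM")
```

Even the aggregate outbound rate stays below 100 Mbit/s under these assumptions, so a 100 Mbps link per VM leaves ample headroom.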
The network protocol uses short-lived connections, one request per connection: TCP handshake, send the ID, receive the data, close the connection. A 100 Mbps outbound link per VM is sufficient, so complex C10k solutions are unnecessary.
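The request cycle (connect, send ID, receive data, close) can be sketched with Python's standard `socketserver` module. The newline-terminated framing, the sample record, and the names `LookupHandler`, `query`, and `start_server` are assumptions for illustration; the article does not specify a wire format.

```python
import socket
import socketserver
import threading

# Hypothetical in-memory shard: ID -> payload. The entry is made-up sample data.
RECORDS = {b"110101199001011234": b"negative"}

class LookupHandler(socketserver.StreamRequestHandler):
    def handle(self) -> None:
        # One request per connection: read a single ID line, reply, then return.
        id_number = self.rfile.readline().strip()
        self.wfile.write(RECORDS.get(id_number, b"not found") + b"\n")
        # Returning from handle() lets the server close the connection.

def query(host: str, port: int, id_number: bytes) -> bytes:
    """Open a connection, send one ID, read one reply line, close."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(id_number + b"\n")
        with conn.makefile("rb") as reader:
            return reader.readline().strip()

def start_server() -> socketserver.ThreadingTCPServer:
    """Start the lookup server on a free loopback port in a background thread."""
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), LookupHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

if __name__ == "__main__":
    server = start_server()
    host, port = server.server_address
    print(query(host, port, b"110101199001011234"))
    server.shutdown()
    server.server_close()
```

A thread per connection is enough at a few hundred requests per second per node, which is the point the article makes about not needing C10k techniques.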
Lookup performance can be O(1) with a direct ID‑to‑offset array; even a binary search (O(log N), about 30 steps for a billion entries) is negligible compared to network latency.
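The binary-search variant can be sketched with the standard `array` and `bisect` modules. The layout is an assumption: IDs are treated as purely numeric and stored as 8-byte unsigned integers alongside an 8-byte value, which happens to match the 16 bytes per record quoted above. The `SortedShard` class is hypothetical.

```python
import math
from array import array
from bisect import bisect_left
from typing import Dict, Optional

class SortedShard:
    """Two parallel packed arrays, sorted by ID, searched with binary search."""

    def __init__(self, records: Dict[int, int]) -> None:
        ordered = sorted(records.items())
        self.ids = array("Q", (key for key, _ in ordered))        # 8 bytes per ID
        self.values = array("Q", (value for _, value in ordered))  # 8 bytes per value

    def get(self, id_number: int) -> Optional[int]:
        """Return the stored value for id_number, or None if it is absent."""
        i = bisect_left(self.ids, id_number)
        if i < len(self.ids) and self.ids[i] == id_number:
            return self.values[i]
        return None

# Worst-case comparisons for one billion entries: ceil(log2(1e9)) = 30.
print(math.ceil(math.log2(1_000_000_000)))
```

Packed arrays avoid per-object overhead, so memory use stays close to the raw record size.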
Redundancy is handled by additional backup VMs coordinated via ZooKeeper; if a node fails, a backup takes over its shard immediately.
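The takeover rule can be illustrated without a real coordination service. The sketch below is an in-process simulation only and makes no ZooKeeper calls: each node joining a shard gets an increasing sequence number, and the live node with the lowest number is the primary, which mirrors the idea behind leader election with ephemeral sequential nodes. The `ShardMembership` class and node names are hypothetical.

```python
from typing import Dict, Optional

class ShardMembership:
    """In-process stand-in for a coordination service; not a ZooKeeper client."""

    def __init__(self) -> None:
        self._next_seq = 0
        self._live: Dict[int, str] = {}  # sequence number -> node name

    def join(self, node: str) -> int:
        """Register a node for the shard and return its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._live[seq] = node
        return seq

    def leave(self, seq: int) -> None:
        """Remove a node, for example after it crashes or its session expires."""
        self._live.pop(seq, None)

    def primary(self) -> Optional[str]:
        """The live node with the lowest sequence number serves the shard."""
        if not self._live:
            return None
        return self._live[min(self._live)]
```

In a real deployment a failed node is only detected once its session times out, so takeover is bounded by that timeout.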
Data updates are managed with database triggers that notify cache nodes of changes; a latency of up to one hour is acceptable for test‑result data.
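On the cache side, applying change notifications can be sketched as draining a queue of events. The transport between the database trigger and the node is not described in the article, so a plain in-process `queue.Queue` stands in for it here; the event shape `(id_number, value, version)` and the `CacheNode` class are assumptions for illustration.

```python
import queue
from typing import Dict, Tuple

# Hypothetical change event: (id_number, new_value, version).
ChangeEvent = Tuple[int, int, int]

class CacheNode:
    def __init__(self) -> None:
        self.values: Dict[int, int] = {}
        self.versions: Dict[int, int] = {}

    def apply(self, event: ChangeEvent) -> bool:
        """Apply an event unless an equal or newer version is already cached."""
        id_number, value, version = event
        if version <= self.versions.get(id_number, -1):
            return False  # stale or duplicate notification
        self.values[id_number] = value
        self.versions[id_number] = version
        return True

    def drain(self, events: "queue.Queue[ChangeEvent]") -> int:
        """Apply every queued event and return how many changed the cache."""
        applied = 0
        while True:
            try:
                event = events.get_nowait()
            except queue.Empty:
                return applied
            if self.apply(event):
                applied += 1
```

Because up to an hour of staleness is acceptable, a node could run `drain` on a timer instead of reacting to each notification as it arrives.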
The author notes real‑world concerns such as server stability, code quality, and client expectations, and includes a promotional note for a JetBrains license.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Selected Java Interview Questions
