How to Build Scalable, Stateless Architecture Without Magic
This article explains practical principles for designing scalable, stateless backend systems: choose the right tools, run multiple servers, add caching and rate limiting, divide responsibilities so the backend does not do work the database handles better, and plan for large data volumes. It closes with concrete example architectures for projects of different sizes.
Scalability is often seen as a mysterious problem solved only by expensive, specialized tools, but it is simply ordinary code written in ordinary languages.
First, select the appropriate tool for the job. Benchmarks show that some languages excel in certain areas; some databases are read-optimized, others write-optimized. Even with the right stack, a single server eventually becomes insufficient, so understanding how to build scalable setups on bare metal is valuable.
Basic Principles
Choose the Right Tool
Different programming languages suit different tasks.
For example, Python offers rich syntactic sugar and expressive code, but its standard implementation is interpreted, which generally makes it slower than compiled languages such as Go or C.
NodeJS has one of the richest ecosystems of external packages, but it runs JavaScript on a single thread; to use every core of a multi-core machine you need a process manager like PM2, which in turn requires stateless code.
http://pm2.keymetrics.io/
Databases are similar. SQL databases offer very expressive queries (with recursive queries, SQL is even Turing-complete), but that flexibility comes at a cost, so for simple key-based access a SQL database is often slower than a NoSQL store.
Databases are often read-oriented or write-oriented. Write-heavy workloads benefit from write-optimized databases like Cassandra, while read-heavy workloads like news feeds are a better fit for a read-oriented store such as MongoDB. If both are needed, run two databases side by side.
Multiple Servers
When one machine is insufficient, add another; add a third when two are insufficient, and so on.
Scaling from 1 to 2 servers is much harder than from 2 to 3 or 10 to 20.
To use multiple machines, the backend must be stateless: all data lives in a database and the server process itself keeps no state between requests. This is one reason functional languages, which discourage hidden mutable state, are popular for backends.
All servers should behave identically; stateful servers can return different responses for the same input, which is undesirable.
Implement statelessness early; if using NodeJS with PM2, the code must remain stateless for load balancing.
A load balancer routes each request to one of the servers, typically the least busy one, so every server must return the same response for the same request. For NodeJS, PM2 is a good option; otherwise use Nginx.
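A least-busy routing policy can be sketched in a few lines. This is only an illustration of the idea, not how PM2 or Nginx are implemented; the `Backend` and `LeastBusyBalancer` names and the server names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """One application server, tracked by its number of in-flight requests."""
    name: str
    active: int = 0


class LeastBusyBalancer:
    """Hypothetical balancer: always hands out the server with the fewest active requests."""

    def __init__(self, backends):
        self.backends = list(backends)

    def acquire(self) -> Backend:
        # min() returns the first backend with the lowest count, so ties go to list order.
        backend = min(self.backends, key=lambda b: b.active)
        backend.active += 1
        return backend

    def release(self, backend: Backend) -> None:
        # Called when a request finishes, so the server counts as less busy again.
        backend.active -= 1


balancer = LeastBusyBalancer([Backend("app-1"), Backend("app-2"), Backend("app-3")])
first = balancer.acquire()
second = balancer.acquire()
third = balancer.acquire()
balancer.release(second)      # app-2 finishes its request
fourth = balancer.acquire()   # app-2 is now the least busy server again
```

Because any server may receive any request, this only works when all servers are interchangeable, which is exactly what statelessness provides.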
Store sessions in Redis so all servers can access them.
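As a minimal sketch of that idea, the following uses a plain in-memory class as a stand-in for Redis so that it runs without any external service. `SessionStore`, `AppServer`, and `handle_visit` are hypothetical names chosen for illustration; in a real deployment the store would live in a separate process that every server connects to.

```python
class SessionStore:
    """Stand-in for an external session store such as Redis (here just a dict)."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        # Return a copy so callers cannot mutate stored state by accident.
        return dict(self._data.get(session_id, {}))

    def set(self, session_id, session):
        self._data[session_id] = dict(session)


class AppServer:
    """A stateless server: it holds a reference to the shared store, but no user data."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle_visit(self, session_id):
        # Read, update, write back. With a real shared store this read-modify-write
        # is not atomic; an atomic increment on the store side would be safer.
        session = self.store.get(session_id)
        session["visits"] = session.get("visits", 0) + 1
        self.store.set(session_id, session)
        return session["visits"]


store = SessionStore()
server_a = AppServer("app-1", store)
server_b = AppServer("app-2", store)

# The same user is served by different servers, yet the count stays consistent.
counts = [
    server_a.handle_visit("alice"),
    server_b.handle_visit("alice"),
    server_a.handle_visit("alice"),
]
```

Either server can be removed or replaced at any time without losing a session, because neither of them owns one.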
Cache and Rate Limiting
Imagine performing the same computation every 100 ms for every user; with enough users this produces a Slashdot effect, which is essentially an unintentional DDoS.
Introduce a caching layer so only the first request triggers a DB query; subsequent users receive data from RAM.
Cached entries should expire: configure a TTL (time to live) so that stale data is periodically refreshed from the database.
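A cache with a TTL can be sketched as follows. This is an illustrative in-process version, not Varnish or Redis; `TTLCache`, `get_or_compute`, and `load_front_page` are hypothetical names, and the clock is injectable only so the example can simulate the passage of time.

```python
import time


class TTLCache:
    """Hypothetical in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # still fresh: served from RAM
        value = compute()            # missing or expired: do the expensive work once
        self._entries[key] = (now + self.ttl, value)
        return value


db_calls = []


def load_front_page():
    db_calls.append("query")         # stands in for an expensive database query
    return ["story-1", "story-2"]


fake_now = [0.0]                     # simulated clock, in seconds
cache = TTLCache(ttl_seconds=60, clock=lambda: fake_now[0])

cache.get_or_compute("front_page", load_front_page)
cache.get_or_compute("front_page", load_front_page)
queries_while_fresh = len(db_calls)  # only the first request reached the database

fake_now[0] = 61.0                   # the entry has now outlived its TTL
cache.get_or_compute("front_page", load_front_page)
queries_after_expiry = len(db_calls)
```

The TTL is the trade-off knob: a longer value means fewer database queries but staler data.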
Never cache user input—only cache server output.
Varnish is a solid HTTP response cache for any backend.
https://varnish-cache.org/
Even with caching, rapid requests can overload the server, so a rate limiter is needed to reject requests that arrive too quickly.
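One common way to build such a limiter is a token bucket; the article does not prescribe a specific algorithm, so treat this as one possible sketch. `TokenBucket` and its parameters are hypothetical, and the injectable clock exists only to make the example deterministic.

```python
import time


class TokenBucket:
    """Allows bursts of up to `capacity` requests, refilled at `rate_per_second`."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated_at = clock()

    def allow(self):
        now = self.clock()
        # Refill for the time elapsed since the last call, never above capacity.
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True              # request may proceed
        return False                 # request arrived too quickly: reject it


fake_now = [0.0]                     # simulated clock, in seconds
bucket = TokenBucket(rate_per_second=1, capacity=3, clock=lambda: fake_now[0])

burst = [bucket.allow() for _ in range(4)]   # fourth request in the burst is rejected
fake_now[0] = 2.0                            # two seconds later, two tokens are back
later = [bucket.allow() for _ in range(3)]
```

In practice a server would keep one bucket per client (for example keyed by IP address or API key), and with several stateless servers those buckets would need to live in a shared store.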
Divide Responsibilities
If using an SQL database, let the DB resolve foreign-key relationships with joins instead of stitching related rows together in the backend.
Backend responsibilities should include hashing, rendering pages from data and templates, and session management.
Move data‑model logic into stored procedures or queries.
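To illustrate the split, the sketch below uses SQLite from the Python standard library; the `authors` and `posts` tables are hypothetical. The database resolves the foreign key and does the counting in one query, so the backend receives finished rows instead of looping over two result sets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors(id),
        title TEXT NOT NULL
    );
    INSERT INTO authors (id, name) VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO posts (author_id, title)
        VALUES (1, 'Caching'), (1, 'Sharding'), (2, 'Kernels');
""")

# The join and the aggregation both run inside the database engine.
rows = conn.execute("""
    SELECT a.name, COUNT(p.id) AS post_count
    FROM authors AS a
    LEFT JOIN posts AS p ON p.author_id = a.id
    GROUP BY a.id, a.name
    ORDER BY a.name
""").fetchall()
```

The backend's remaining job is to take `rows` and render them, which matches the responsibilities listed above.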
Large Data Volumes
Even with clustered databases, capacity is limited by server hardware; unlimited growth requires distributed databases that store data across many servers.
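The core idea of spreading data across many servers can be sketched with hash-based sharding. This is a simplified illustration, not how any particular distributed database is implemented; `ShardedStore` is a hypothetical name and each "server" is just a dict.

```python
import hashlib


class ShardedStore:
    """Hypothetical store that places each key on one of several shards by hash."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]  # each dict stands in for a server

    def _shard_for(self, key):
        # A stable hash, so every backend instance maps the same key to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shards)
        return self.shards[index]

    def put(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key):
        return self._shard_for(key).get(key)


store = ShardedStore(shard_count=4)
for n in range(1000):
    store.put(f"user:{n}", {"id": n})

sizes = [len(shard) for shard in store.shards]  # each shard holds only part of the data
```

Total capacity now grows with the number of shards. Note that plain modulo placement moves most keys when the shard count changes, which is why real systems tend to use consistent hashing or a similar scheme.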
Master-slave replication can double the database's read capacity and provide load balancing, but every replica stores a full copy of the data, so it does not give unlimited growth.
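The replication trade-off can be seen in a small sketch. `ReplicatedDB` is a hypothetical name, the master is called `primary` and the slaves `replicas`, and replication is done synchronously here for simplicity (real replication is usually asynchronous and replicas may lag).

```python
import itertools


class ReplicatedDB:
    """Hypothetical primary/replica setup: writes go to one node, reads are spread out."""

    def __init__(self, replica_count):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))
        self.reads_per_replica = [0] * replica_count

    def write(self, key, value):
        self.primary[key] = value
        # Assumption for the sketch: changes are copied to every replica immediately.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Round-robin over replicas so read load is shared between them.
        index = next(self._next_replica)
        self.reads_per_replica[index] += 1
        return self.replicas[index].get(key)


db = ReplicatedDB(replica_count=2)
db.write("post:1", "Hello")
db.write("post:2", "World")
values = [db.read("post:1") for _ in range(4)]
```

Reads are split evenly, but each node still holds every record, so storage is bounded by the smallest server.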
Potential Bottlenecks
Single‑threaded, stateful servers – code must be stateless for load balancing.
Backend performing database work – move data work to the DB.
Single DB instance – use clustering for load balancing.
Confusing read‑ vs write‑optimized databases – choose based on workload.
Clients far from servers – use a CDN.
Example Setups
Kitten
This is a basic LAMP stack built in one night; it is stateful, storing sessions in memory, and therefore not scalable, but suitable for small weekend projects.
Data: GB‑scale
Users: thousands
Bottleneck: availability – single server vulnerable to Slashdot effect
Tools: conventional LAMP stack
Cat
Added caching improves speed, but the architecture remains stateful and unscalable. When user count grows, this setup should be upgraded.
Data: GB‑scale
Users: tens of thousands
Bottleneck: stateful server despite cache
Tools: MongoDB, Express as rate limiter and in‑memory cache
Cheetah
This setup is scalable: you can add as many servers as you need and handle request volumes that would crash the "Cat" setup, though the database remains a single instance and may become a bottleneck.
Data: TB‑scale
Users: hundreds of thousands
Bottleneck: single DB instance
Tools: Go, Redis cache, MongoDB
Tiger
This architecture is fast and scalable, but distant users may experience high latency due to geographic distance.
Data: hundreds of TB
Users: millions
Bottleneck: geographic distance
Tools: Go, Redis + Cassandra + MongoDB
Lion
This is a CDN-based setup with servers worldwide acting as primary nodes; the data still lives in a single replicated database cluster, which limits capacity.
Data: hundreds of TB
Users: tens of millions
Bottleneck: large data volume – master‑slave replication limits capacity
Tools: same as above, MongoDB in a cluster
Gear Tiger
This ultimate form uses a distributed database like Riak, which removes the storage limit and makes it suitable for Google- or Facebook-scale applications.
Data: unlimited
Users: global
Bottleneck: price – costs comparable to space projects
Tools: Go, Riak
Conclusion
We reviewed common configurations for projects of various sizes; you don’t have to follow any single setup—design according to your needs, always choosing the right tool for the job.
Ensure scalability by keeping everything stateless!
Original link: https://mvoloskov.hashnode.dev/scalable-architecture-without-magic-and-how-to-build-it-if-youre-not-google
