How to Build Scalable, Stateless Architecture Without Magic

This article explains practical principles for designing scalable, stateless backend systems—choosing the right tools, using multiple servers, caching, rate limiting, dividing responsibilities, handling large data volumes, and providing concrete example architectures for projects of any size.

Java High-Performance Architecture
This article presents several principles for achieving scalable architecture: use the right tools, keep the backend stateless, and avoid doing database work in the backend, where it is usually slower.

Scalability is often seen as a mysterious problem solved only by expensive, specialized tools, but it can be achieved with ordinary code written in ordinary languages.

First, select the appropriate tool for the job. Benchmarks show that some languages excel in certain areas; some databases are read‑optimized, others write‑optimized. Even with the right stack, a single server is insufficient, and understanding how to build scalable setups on bare metal is valuable.

Basic Principles

Choose the Right Tool

Different programming languages suit different tasks.

For example, Python offers rich syntactic sugar and expressive code, but its standard implementation is interpreted, which usually makes it slower than compiled languages such as Go or C.

NodeJS has one of the richest ecosystems of external packages, but it runs JavaScript on a single thread per process. To use every core of a multi-core machine you run several processes under a manager like PM2, and that only works if the code is stateless.

http://pm2.keymetrics.io/

Databases are similar. SQL databases offer very expressive queries (joins, aggregations, constraints), and that flexibility has a cost: for simple lookups by key they are often slower than a NoSQL store built for that access pattern.

Databases are often read‑oriented or write‑oriented. Write‑heavy workloads benefit from write‑optimized databases like Cassandra, while for read‑heavy workloads such as news feeds a read‑oriented store such as MongoDB is a common choice. If both are needed, run two databases side by side.

Multiple Servers

When one machine is insufficient, add another; add a third when two are insufficient, and so on.

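Scaling from 1 to 2 servers is much harder than from 2 to 3 or 10 to 20, because the first step is where the code has to become stateless and a load balancer has to be introduced. After that, adding a server is mostly repetition.

To use multiple machines, the backend must be stateless, storing all data in a database and keeping no state in the server itself. This is one reason a functional style, which avoids shared mutable state, is popular for backends.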

All servers should behave identically; stateful servers can return different responses for the same input, which is undesirable.

Implement statelessness early; if using NodeJS with PM2, the code must remain stateless for load balancing.

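A load balancer spreads requests across servers, for example by sending each one to the least‑busy server, so any server may receive any request and identical responses are essential. For NodeJS processes on one machine, PM2 is a good option; otherwise use Nginx.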
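
To make the least‑busy strategy concrete, here is a minimal sketch in Python. The `Backend` and `LeastBusyBalancer` names are hypothetical and exist only for this illustration; a real balancer such as Nginx also handles health checks, timeouts, and connection management.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Backend:
    """One application server behind the balancer (hypothetical model)."""
    name: str
    active_requests: int = 0


class LeastBusyBalancer:
    """Hands each request to the backend with the fewest in-flight requests."""

    def __init__(self, backends: list[Backend]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self.backends = backends

    def acquire(self) -> Backend:
        # min() returns the first backend with the lowest count,
        # so ties go to the backend listed earliest.
        backend = min(self.backends, key=lambda b: b.active_requests)
        backend.active_requests += 1
        return backend

    def release(self, backend: Backend) -> None:
        # Call this when the request has finished.
        backend.active_requests -= 1
```

Because the balancer may pick a different backend for every request, a user's second request can land on a server that never saw the first one. That is why the servers cannot keep per-user state locally.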

Store sessions in Redis so all servers can access them.
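
The idea can be sketched as follows. This is an illustration under assumptions, not a production setup: `FakeSharedStore` is an in-memory stand-in for a shared store such as Redis, and `AppServer` with its `login` and `whoami` methods are hypothetical names. The point is only that the server object holds no session data itself.

```python
from __future__ import annotations

import json
from typing import Protocol


class SessionStore(Protocol):
    """Minimal interface a shared session store needs for this sketch."""

    def get(self, key: str) -> str | None: ...
    def set(self, key: str, value: str) -> None: ...


class FakeSharedStore:
    """In-memory stand-in for a shared store such as Redis (assumption)."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def get(self, key: str) -> str | None:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class AppServer:
    """A server instance that keeps no session data of its own."""

    def __init__(self, name: str, store: SessionStore) -> None:
        self.name = name
        self.store = store

    def login(self, session_id: str, user: str) -> None:
        # The session lives in the shared store, not in this process.
        self.store.set(f"session:{session_id}", json.dumps({"user": user}))

    def whoami(self, session_id: str) -> str | None:
        raw = self.store.get(f"session:{session_id}")
        return json.loads(raw)["user"] if raw is not None else None
```

If two `AppServer` instances share one store, a session created through the first is visible through the second, which is the property a load balancer relies on.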

Cache and Rate Limiting

Imagine every user triggering the same computation every 100 ms. A sudden surge in traffic (the Slashdot effect) can then overwhelm the server in much the same way as a DDoS attack.

Introduce a caching layer so only the first request triggers a DB query; subsequent users receive data from RAM.

Cached entries should expire: configure a TTL (time to live) so that stale data is eventually refreshed from the database.

Never cache user input; cache only server output.
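
A minimal sketch of a cache with a TTL, assuming a single process and a hypothetical `TTLCache` class. Only the first request for a key runs the expensive computation, and later requests are served from memory until the entry expires. The `clock` parameter exists only so the expiry logic can be exercised without waiting.

```python
from __future__ import annotations

import time
from typing import Any, Callable


class TTLCache:
    """Caches computed values in memory for a fixed number of seconds."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        # key -> (expiry time, cached value)
        self._entries: dict[str, tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh: no database query
        value = compute()  # missing or expired: do the real work once
        self._entries[key] = (now + self.ttl, value)
        return value
```

An in-process cache like this is itself server state, so with several servers each one warms its own copy. A shared cache avoids that duplication.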

Varnish is a solid HTTP response cache for any backend.

https://varnish-cache.org/

Even with caching, rapid requests can overload the server, so a rate limiter is needed to reject requests that arrive too quickly.
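
One common way to build such a limiter is a token bucket. The article does not prescribe an algorithm, so treat this as one possible sketch. The `TokenBucket` class and its parameters are hypothetical; in practice you would keep one bucket per client (for example per IP address or API key), and with several servers the counters would live in a shared store.

```python
from __future__ import annotations

import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `rate_per_second`."""

    def __init__(
        self,
        rate_per_second: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # start with a full bucket
        self.updated_at = clock()

    def allow(self) -> bool:
        now = self.clock()
        elapsed = now - self.updated_at
        self.updated_at = now
        # Refill for the time that has passed, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1  # spend one token on this request
            return True
        return False  # too fast: the caller should reject the request
```

A rejected request would typically be answered with HTTP status 429 (Too Many Requests).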

Divide Responsibilities

If using an SQL database, let the database resolve foreign‑key relationships with joins instead of looking up related rows one by one in backend code.

Backend responsibilities should include hashing, rendering pages from data and templates, and session management.

Move data‑model logic into stored procedures or queries.
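
The difference can be shown with a small sketch that uses Python's built-in `sqlite3` module. The `authors` and `posts` tables and both function names are invented for this example. The first function resolves the foreign key in backend code with one extra query per row; the second lets the database do it in a single join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(id)
    );
    INSERT INTO authors (id, name) VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO posts (id, title, author_id) VALUES
        (1, 'Caching', 1), (2, 'Sharding', 2), (3, 'Sessions', 1);
""")


def posts_with_authors_in_backend(db: sqlite3.Connection) -> list:
    """Backend does the work: one query for posts, then one per post."""
    posts = db.execute(
        "SELECT title, author_id FROM posts ORDER BY id"
    ).fetchall()
    result = []
    for title, author_id in posts:
        (name,) = db.execute(
            "SELECT name FROM authors WHERE id = ?", (author_id,)
        ).fetchone()
        result.append((title, name))
    return result


def posts_with_authors_in_db(db: sqlite3.Connection) -> list:
    """Database does the work: a single query with a join."""
    return db.execute(
        "SELECT p.title, a.name FROM posts AS p "
        "JOIN authors AS a ON a.id = p.author_id ORDER BY p.id"
    ).fetchall()
```

Both functions return the same rows, but the first issues one query per post plus one. Against a networked database each of those queries is a round trip, so its cost grows with the size of the result.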

Large Data Volumes

Even with clustered databases, capacity is limited by server hardware; unlimited growth requires distributed databases that store data across many servers.

Master‑slave replication can roughly double read capacity and provide load balancing, but every node still holds the full data set, so it does not give unlimited growth.
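
Storing data across many servers means each record needs a deterministic home. A minimal sketch of hash-based routing follows; `shard_for` and `ShardedStore` are hypothetical names, and the in-memory dictionaries stand in for separate database servers.

```python
from __future__ import annotations

import hashlib
from typing import Any


def shard_for(key: str, shard_count: int) -> int:
    """Maps a key to a shard index using a stable hash."""
    # Python's built-in hash() is randomized per process for strings,
    # so a cryptographic digest is used to get the same answer everywhere.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Spreads keys over several shards (dicts stand in for servers)."""

    def __init__(self, shard_count: int) -> None:
        self.shards: list[dict[str, Any]] = [{} for _ in range(shard_count)]

    def put(self, key: str, value: Any) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str) -> Any:
        return self.shards[shard_for(key, len(self.shards))].get(key)
```

Simple modulo routing moves most keys to a different shard when the shard count changes. Distributed databases typically use schemes such as consistent hashing to limit that reshuffling.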

Potential Bottlenecks

Single‑threaded, stateful servers – code must be stateless for load balancing.

Backend performing database work – move data work to the DB.

Single DB instance – use clustering for load balancing.

Confusing read‑ vs write‑optimized databases – choose based on workload.

Clients far from servers – use a CDN.

Example Setups

Kitten

This is a basic LAMP stack built in one night; it is stateful, storing sessions in memory, and therefore not scalable, but suitable for small weekend projects.

Data: GB‑scale

Users: thousands

Bottleneck: availability – single server vulnerable to Slashdot effect

Tools: conventional LAMP stack

Cat

Added caching improves speed, but the architecture remains stateful and unscalable. When user count grows, this setup should be upgraded.

Data: GB‑scale

Users: tens of thousands

Bottleneck: stateful server despite cache

Tools: MongoDB, Express as rate limiter and in‑memory cache

Cheetah

This setup is scalable: because the backend is stateless, you can keep adding servers and handle traffic that would crash the "Cat" setup, though the database remains a single instance and may become a bottleneck.

Data: TB‑scale

Users: hundreds of thousands

Bottleneck: single DB instance

Tools: Go, Redis cache, MongoDB

Tiger

This architecture is fast and scalable, but distant users may experience high latency due to geographic distance.

Data: hundreds of TB

Users: millions

Bottleneck: geographic distance

Tools: Go, Redis + Cassandra + MongoDB

Lion

This is a CDN‑based setup with servers worldwide acting as primary nodes; it still relies on a single replicated database cluster, which limits capacity.

Data: hundreds of TB

Users: tens of millions

Bottleneck: large data volume – master‑slave replication limits capacity

Tools: same as above, MongoDB in a cluster

Sabertooth Tiger

This ultimate form uses a distributed database like Riak, which spreads data across many servers and so removes the storage limit of a single cluster; it is suitable for Google‑ or Facebook‑scale applications.

Data: unlimited

Users: global

Bottleneck: price – costs comparable to space projects

Tools: Go, Riak

Conclusion

We reviewed common configurations for projects of various sizes; you don’t have to follow any single setup—design according to your needs, always choosing the right tool for the job.

Ensure scalability by keeping everything stateless!

Original link: https://mvoloskov.hashnode.dev/scalable-architecture-without-magic-and-how-to-build-it-if-youre-not-google