Designing Scalable Stateless Architecture: Sessions, Caching, Sharding & Monitoring

The article explains how to achieve horizontal scalability by making applications stateless, using client‑side cookies for session data, applying various caching layers, splitting services and databases with sharding, adopting asynchronous messaging, storing unstructured data, and integrating monitoring with alerting.

Art of Distributed System Architecture Design

1. Stateless Session Design

When client state is kept in server-side sessions, a server crash loses that state unless it can be recovered from elsewhere in the cluster. Traditional session replication (Tomcat's broadcast replication, JBoss's paired replication) adds significant overhead and limits horizontal scaling, because inter-node traffic grows with the node count. Taobao's stateless session framework instead stores the state in client cookies, so every application node is identical and nodes can be added or removed freely.

Cookie limits constrain how much data can be stored this way: browsers typically accept about 4 KB per cookie, and older browsers allowed only around 20 cookies per domain (modern browsers allow more). Taobao's "multi-value cookie" packs multiple key-value pairs into a single cookie, reducing the number of cookies while preserving the same information.
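
The packing idea can be sketched as follows. This is a minimal illustration, not Taobao's actual format, which the article does not describe: the `&`/`=` separators, the function names `pack_cookie` and `unpack_cookie`, and the 4096-byte limit are all assumptions made for the example.

```python
from urllib.parse import quote, unquote

# Assumed per-cookie budget for this sketch; real limits vary by browser.
MAX_COOKIE_BYTES = 4096


def pack_cookie(values: dict[str, str]) -> str:
    """Join several key-value pairs into one cookie value (hypothetical format)."""
    # Percent-encode keys and values so '&' and '=' inside them cannot be
    # mistaken for the separators.
    packed = "&".join(
        f"{quote(key, safe='')}={quote(value, safe='')}"
        for key, value in values.items()
    )
    if len(packed.encode("utf-8")) > MAX_COOKIE_BYTES:
        raise ValueError("packed cookie exceeds the size limit")
    return packed


def unpack_cookie(packed: str) -> dict[str, str]:
    """Recover the original key-value pairs from a packed cookie value."""
    result: dict[str, str] = {}
    for pair in packed.split("&"):
        if not pair:
            continue  # an empty cookie value yields an empty dict
        key, _, value = pair.partition("=")
        result[unquote(key)] = unquote(value)
    return result
```

Because the state travels with the request, any node can call `unpack_cookie` on it without consulting its peers. State held on the client should also be signed or encrypted before it is trusted; that step is left out of this sketch.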

2. Effective Caching (Tair)

Several cache layers (browser cache, reverse-proxy cache, page cache, fragment cache, object cache) are used mainly for data that is read far more often than it is written and that tolerates some staleness. Caching mostly static shop information (e.g., shop description, service terms, product details) reduces database load. Tair, named in the heading, is Taobao's distributed key-value cache.
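
An object cache of this kind is often used in a "cache-aside" fashion: read from the cache, fall back to the database on a miss, and let entries expire after a time-to-live. The sketch below assumes an in-process dictionary in place of a distributed cache such as Tair; the class `TTLCache` and its method names are hypothetical.

```python
import time
from typing import Callable


class TTLCache:
    """In-memory cache-aside helper with per-entry expiry (illustrative only)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._entries: dict[str, tuple[float, object]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], object]) -> object:
        """Return the cached value, calling loader(key) on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit: the database is not touched
        value = loader(key)  # miss: read from the backing store
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry, for example after the underlying row is updated."""
        self._entries.pop(key, None)
```

The TTL bounds how stale a shop description can become, which is why this pattern fits data with modest consistency requirements; writes that must be visible immediately would call `invalidate` as well.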

3. Application Splitting (HSF)

Splitting a monolithic system into loosely coupled subsystems along business boundaries improves scalability and maintainability. Each subsystem can be scaled independently, and a failure in one subsystem is less likely to take down the whole system. Subsystems communicate either synchronously or asynchronously, so a high-performance remote-call framework (HSF at Taobao) becomes essential.

4. Database Sharding (TDDL)

Beyond application-level splitting, storage must also be partitioned. Read-heavy workloads are first addressed with master-slave replication, which spreads reads across replicas. When the master itself becomes the bottleneck, vertical partitioning moves product, user, and transaction data into separate databases. Horizontal partitioning (sharding) then distributes individual large tables (e.g., friend relationships, shop configurations) across multiple servers.

Taobao developed the TDDL framework to abstract sharding and master‑slave management, providing transparent data access across heterogeneous databases.
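
To illustrate what such a layer hides from the application, the sketch below routes a key to a shard by hashing and sends writes to the master and reads to a replica. It is not TDDL's API: the class `ShardRouter`, the shard dictionary layout, and the hash-modulo rule are assumptions chosen for brevity.

```python
import hashlib
import itertools


class ShardRouter:
    """Pick a database node for a key: hash to a shard, then master or replica."""

    def __init__(self, shards: list[dict]):
        # Each shard is assumed to look like:
        #   {"master": "db0-master", "replicas": ["db0-r1", "db0-r2"]}
        self._shards = shards
        # Round-robin over replicas; fall back to the master if there are none.
        self._replica_cycles = [
            itertools.cycle(shard["replicas"] or [shard["master"]])
            for shard in shards
        ]

    def shard_index(self, key: str) -> int:
        """Map a sharding key to a shard index with a stable hash."""
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._shards)

    def route(self, key: str, write: bool) -> str:
        """Return the node name that should serve this operation."""
        index = self.shard_index(key)
        if write:
            return self._shards[index]["master"]
        return next(self._replica_cycles[index])
```

A stable hash (rather than Python's built-in `hash`, which is randomized per process for strings) keeps the mapping identical on every application node. Plain modulo hashing does mean most keys move when the shard count changes, so real deployments usually plan shard counts in advance or use a lookup table.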

5. Asynchronous Communication (Notify)

Message middleware (Notify at Taobao) enables asynchronous communication, which improves scalability and decouples subsystems: the sender does not wait for, or even need to know about, the receivers. Asynchronous patterns suit loosely coupled interactions, while tightly coupled business processes may still require synchronous calls.
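
The decoupling can be shown with a small in-process stand-in for a message broker. This is only a sketch: `MessageBus`, its methods, and the topic name in the usage example are hypothetical, and a real broker such as Notify would add persistence, retries, and delivery across machines.

```python
import queue
import threading
from typing import Callable


class MessageBus:
    """In-process publish/subscribe with a background delivery thread."""

    def __init__(self) -> None:
        self._queue: queue.Queue = queue.Queue()
        self._handlers: dict[str, list[Callable[[object], None]]] = {}
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def subscribe(self, topic: str, handler: Callable[[object], None]) -> None:
        """Register a handler; call this before publishing to the topic."""
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, payload: object) -> None:
        """Enqueue a message and return immediately, without waiting for handlers."""
        self._queue.put((topic, payload))

    def _run(self) -> None:
        while True:
            topic, payload = self._queue.get()
            try:
                for handler in self._handlers.get(topic, []):
                    try:
                        handler(payload)
                    except Exception:
                        # A real broker would retry or dead-letter the message.
                        pass
            finally:
                self._queue.task_done()

    def drain(self) -> None:
        """Block until every published message has been handled."""
        self._queue.join()
```

A publisher might call `bus.publish("order.created", {"order_id": 1})` and carry on, while an inventory or notification subsystem that subscribed to that topic processes the event on its own schedule.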

6. Unstructured Data Storage (TFS, NoSQL)

Not all data fits the relational model. Configuration files, transaction snapshots, and other dynamic data are better kept in key-value stores. Large static assets (product images, descriptions) are offloaded to a distributed file system (TFS at Taobao) so that they do not degrade RDBMS performance.

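Since around 2008, NoSQL systems such as Cassandra and HBase (the latter modeled on Google's Bigtable) have been adopted for their horizontal scalability. Several of them, Cassandra included, relax strict consistency in favor of availability and settle for eventual consistency; this is the trade-off described by the CAP theorem, and many internet-scale services choose availability.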

7. Monitoring and Alerting

Large distributed systems consist of numerous devices (network switches, PCs, NICs, disks, memory). Monitoring at coarse granularity tracks overall metrics such as network traffic, memory usage, I/O, CPU load, request volume, and response time. Fine‑grained monitoring observes per‑URL traffic, page views, bandwidth consumption, and rendering times.

Integrating alerts with monitoring allows automatic detection of abnormal CPU or memory spikes, sudden traffic surges, or high request loss, enabling rapid response to maintain system stability and availability.
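
One simple way to detect such spikes is to compare each new sample with the average of a recent window. The sketch below assumes this moving-average rule; the class `SpikeDetector` and its `window` and `factor` parameters are hypothetical, and production systems typically use richer rules and thresholds tuned per metric.

```python
from collections import deque
from statistics import mean


class SpikeDetector:
    """Flag a sample that exceeds `factor` times the mean of the last `window` samples."""

    def __init__(self, window: int, factor: float):
        self._samples: deque[float] = deque(maxlen=window)
        self._factor = factor

    def observe(self, value: float) -> bool:
        """Record a sample and return True if it should raise an alert."""
        # Only judge once the window is full, so start-up noise is ignored.
        window_full = len(self._samples) == self._samples.maxlen
        is_spike = window_full and value > self._factor * mean(self._samples)
        self._samples.append(value)
        return is_spike
```

Fed with per-minute CPU load or request counts, `observe` returning `True` would be the point at which the monitoring system notifies an operator.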
