5 Essential Architecture Principles and Static‑Dynamic Separation for High‑Traffic Systems
This article outlines five key architecture principles, explains how to separate static and dynamic data, discusses hotspot data handling, traffic shaping, high‑availability design, and cache‑related issues, providing practical guidance for building resilient, high‑performance systems under heavy load.
Five Architecture Principles
1. Minimize Data
2. Minimize Requests
3. Short Paths
4. Minimize Dependencies
5. High Availability
The first principle advises that the amount of data exchanged with the user should be as small as possible, both for uploads and responses. The second principle stresses reducing the number of auxiliary requests such as CSS, JavaScript, images, and Ajax calls. The third principle recommends keeping the request path short, i.e., minimizing intermediate nodes. The fourth principle calls for reducing strong dependencies on other services. The fifth principle emphasizes eliminating single points of failure to achieve high availability.
Static‑Dynamic Separation
Static data (e.g., article content that does not change per user) can be cached close to the user, while dynamic data (personalized content) must be generated per request. Caching static data can be done in the browser, CDN, or server‑side cache.
Static‑dynamic transformation also includes caching the entire HTTP response at the proxy level, allowing the proxy to return the cached response without re‑parsing the HTTP request.
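As a rough sketch of this idea, the hypothetical `ResponseCache` below keys fully rendered responses by URL, so a repeat request is served from memory without regenerating the page. The class name, the in-process map, and the `render` callback are illustrative assumptions; a real deployment would cache at the proxy or CDN layer and add TTL-based expiry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch: cache whole rendered responses by URL so repeat requests
// skip regeneration, analogous to a proxy returning a cached HTTP
// response without re-parsing the request. (Illustrative only.)
public class ResponseCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // 'render' stands in for the expensive dynamic-generation step.
    // computeIfAbsent runs it at most once per missing URL.
    public String get(String url, Function<String, String> render) {
        return cache.computeIfAbsent(url, render);
    }

    public int size() {
        return cache.size();
    }
}
```

The second lookup for the same URL returns the stored response; the render function is not invoked again.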
Choosing the appropriate cache layer (browser, CDN, or server) depends on the technology stack; for Java services, placing the cache at the web server (Nginx, Apache, Varnish) often yields better performance.
Hotspot Data in Flash‑Sale Systems
Hotspots are divided into hotspot operations (massive reads/writes such as page refreshes or order submissions) and hotspot data (the data behind those operations). Hotspot data can be static (predictable, identified via pre‑sale registration or data analysis) or dynamic (unpredictable, e.g., sudden viral popularity).
Optimizing hotspot data primarily relies on caching. Static hotspot data can be long‑term cached, while dynamic hotspot data is usually cached temporarily using short‑lived queues with LRU eviction.
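A minimal sketch of such a short-lived hotspot cache, built on `LinkedHashMap`'s access-order mode so the least recently used entry is evicted once capacity is reached (the class name and capacity are assumptions for illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a bounded cache with LRU eviction for dynamic hotspot
// data: once 'capacity' entries are held, the least recently
// accessed entry is dropped on the next insert.
public class HotspotLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public HotspotLruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder=true enables LRU ordering
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

With capacity 2, touching an entry keeps it alive while an untouched one is evicted first.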
Additional techniques include limiting (hash‑based sharding of product IDs), isolating hotspot traffic (business, system, and data isolation), and using queues to smooth bursts.
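The hash-based sharding mentioned above can be sketched as a simple router that maps each product ID to one of N buckets, so traffic for one hot product lands on a single, isolatable shard (the class name and shard count are illustrative assumptions):

```java
// Sketch of hash-based sharding of product IDs: each ID is routed to
// one of 'shards' buckets, keeping hotspot traffic confined to a
// known subset of nodes instead of spreading load everywhere.
public class ProductShardRouter {
    private final int shards;

    public ProductShardRouter(int shards) {
        this.shards = shards;
    }

    public int shardFor(long productId) {
        // Math.floorMod keeps the result non-negative for any hash value.
        return Math.floorMod(Long.hashCode(productId), shards);
    }
}
```

The mapping is deterministic, so every request for the same product ID hits the same shard.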
Traffic Shaping and Peak‑Cutting
Similar to city traffic management, peak‑cutting spreads request bursts over time, reducing server load. Common approaches include message queues, thread‑pool locking, FIFO/LIFO memory queues, and persisting requests to files for later processing.
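As a sketch of the memory-queue approach, the hypothetical `PeakShaver` below absorbs bursts into a bounded FIFO buffer that a consumer drains at its own pace; requests arriving when the buffer is full are rejected immediately rather than overloading the backend (names and capacity are assumptions):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of peak-cutting with a bounded in-memory FIFO queue: bursts
// are buffered up to 'capacity' and drained later; overflow is shed.
public class PeakShaver {
    private final BlockingQueue<String> queue;

    public PeakShaver(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Producer side: returns false when the buffer is full (request shed).
    public boolean submit(String request) {
        return queue.offer(request);
    }

    // Consumer side: take the next buffered request, or null if none.
    public String drainOne() {
        return queue.poll();
    }
}
```

In practice the consumer would run in its own thread pool; the single-threaded calls below just show the buffering and shedding behavior.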
Performance Optimizations
Reduce Encoding – character encoding and decoding are comparatively expensive in Java, so hot paths should work with bytes directly and avoid repeated encode/decode round trips.
Reduce Serialization – serialization happens mostly at RPC boundaries; deploying tightly coupled services in the same process removes those boundaries.
Java‑Specific Tuning – for very simple, very hot requests, bypass heavyweight framework layers and write the response as directly as possible.
Concurrent Read Optimization – serve extremely hot reads from an application‑local cache instead of funneling every read through a centralized cache.
Inventory Reduction Logic
Three common inventory‑deduction methods are:
Deduct on Order – immediate reduction via DB transaction (precise but may lock inventory for unpaid orders).
Deduct on Payment – reduction only after payment completes (unpaid orders can no longer lock inventory, but more orders than stock may be accepted, so some buyers who ordered successfully will fail at payment under high concurrency).
Pre‑hold – reserve inventory for a short period (e.g., 10 minutes) before payment, releasing if payment does not occur.
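The pre-hold method can be sketched with a compare-and-set loop over an atomic counter: a unit is reserved before payment and released back if payment never arrives. The class name is an assumption, and the timeout (the "e.g., 10 minutes" window) would be driven by a scheduler that calls `release()`, which is omitted here:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of pre-hold inventory deduction: reserve one unit atomically
// at order time, and release it if payment does not occur in time.
public class InventoryHolder {
    private final AtomicInteger available;

    public InventoryHolder(int initialStock) {
        this.available = new AtomicInteger(initialStock);
    }

    // Atomically reserve one unit; false means sold out.
    public boolean hold() {
        int current;
        do {
            current = available.get();
            if (current <= 0) {
                return false;
            }
        } while (!available.compareAndSet(current, current - 1));
        return true;
    }

    // Called when the payment window expires without payment.
    public void release() {
        available.incrementAndGet();
    }

    public int available() {
        return available.get();
    }
}
```

The CAS loop guarantees the counter never goes negative even under concurrent holds, which is the property that prevents over-selling.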
High‑Availability Construction
High availability spans the entire system lifecycle: architecture (multi‑datacenter deployment), coding (robust error handling, timeout settings), testing (comprehensive test cases), release (rollback mechanisms), operation (real‑time monitoring), and fault handling (quick isolation and recovery).
Degradation and Rate Limiting
When capacity is reached, non‑core features can be degraded to preserve core services. Rate limiting can be applied on the client side (reducing request generation) or server side (protecting resources). Common algorithms include fixed‑window counters, sliding windows, leaky bucket, token bucket, and outright request rejection when thresholds are exceeded.
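Of the algorithms listed, the token bucket is sketched below. Time is passed in explicitly (in nanoseconds) to keep the example deterministic; production code would read `System.nanoTime()`. Names and parameters are illustrative assumptions:

```java
// Sketch of a token-bucket rate limiter: tokens refill continuously
// at 'tokensPerSecond' up to 'capacity'; each admitted request
// consumes one token, and requests without a token are rejected.
public class TokenBucket {
    private final long capacity;
    private final double tokensPerSecond;
    private double tokens;
    private long lastRefillNanos;

    public TokenBucket(long capacity, double tokensPerSecond, long nowNanos) {
        this.capacity = capacity;
        this.tokensPerSecond = tokensPerSecond;
        this.tokens = capacity; // start full so initial bursts pass
        this.lastRefillNanos = nowNanos;
    }

    public synchronized boolean tryAcquire(long nowNanos) {
        double refill = (nowNanos - lastRefillNanos) / 1_000_000_000.0 * tokensPerSecond;
        tokens = Math.min(capacity, tokens + refill);
        lastRefillNanos = nowNanos;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

Unlike a leaky bucket, this admits short bursts up to the bucket capacity while still enforcing the long-run rate.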
Cache Issues and Solutions
Cache avalanche occurs when many cached items expire simultaneously, flooding the database. Preventive measures include high‑availability cache clusters, cache‑level degradation (e.g., Hystrix), and rapid cache warm‑up.
Cache breakdown happens when hot keys expire and many threads query the database simultaneously; solutions involve keeping hot keys alive or using mutexes.
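The mutex approach can be sketched with a per-key lock and a double-check: when a hot key is missing, only one caller performs the database load while the rest wait and then reuse the cached result. The class name and the `dbLoad` callback are illustrative assumptions:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Sketch of the mutex defense against cache breakdown: concurrent
// misses on the same key serialize on a per-key lock, so the
// database sees exactly one load per expired key.
public class BreakdownGuardCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    private final AtomicInteger dbLoads = new AtomicInteger();

    public String get(String key, Function<String, String> dbLoad) {
        String value = cache.get(key);
        if (value != null) {
            return value;
        }
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        try {
            // Double-check: another thread may have loaded it while we waited.
            value = cache.get(key);
            if (value == null) {
                dbLoads.incrementAndGet();
                value = dbLoad.apply(key);
                cache.put(key, value);
            }
            return value;
        } finally {
            lock.unlock();
        }
    }

    public int databaseLoads() {
        return dbLoads.get();
    }
}
```

The `databaseLoads` counter exists only to make the single-load guarantee observable in the example.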
Cache penetration (queries for non‑existent data) can be mitigated with Bloom filters or caching null values for a short TTL.
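A minimal Bloom-filter sketch for the penetration case: known-valid IDs are added up front, and lookups for IDs the filter has never seen are rejected before touching the cache or database. The two hash functions and the bit-array size are simplifying assumptions; real deployments use more hash functions sized for a target false-positive rate:

```java
import java.util.BitSet;

// Sketch of a Bloom filter against cache penetration: 'false' from
// mightContain means the ID definitely does not exist, so the request
// can be rejected without a cache or database lookup. 'true' means
// "probably present" (false positives are possible, negatives are not).
public class IdBloomFilter {
    private final BitSet bits;
    private final int size;

    public IdBloomFilter(int size) {
        this.size = size;
        this.bits = new BitSet(size);
    }

    private int h1(long id) { return Math.floorMod(Long.hashCode(id), size); }
    private int h2(long id) { return Math.floorMod(Long.hashCode(id * 31 + 17), size); }

    public void add(long id) {
        bits.set(h1(id));
        bits.set(h2(id));
    }

    public boolean mightContain(long id) {
        return bits.get(h1(id)) && bits.get(h2(id));
    }
}
```

Because deletions are not supported and false positives exist, the filter complements (rather than replaces) short-TTL caching of null values.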
Efficient Ops
This public account is maintained by Xiaotianguo and friends and regularly publishes widely read original technical articles. We focus on operations transformation and aim to accompany you throughout your operations career.