Understanding System Load Capacity and Optimization Techniques for High-Concurrency Backend Applications
This article explains how to measure system load capacity using requests per second, identifies key factors such as bandwidth, hardware, Linux limits, and application server configurations, and provides practical optimization tips for Nginx, Tomcat, MySQL, Redis, and load‑balancing architectures.
In the Internet era, high concurrency is a common concern; the ability to handle peak requests is a key performance metric for web sites and apps.
1. Measurement Metrics
The primary metric is Requests per Second (RPS), representing the number of successful responses per second. As concurrent users increase, RPS rises until a saturation point where further concurrency reduces RPS and increases latency; this point indicates the system’s maximum load capacity.
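As a back-of-the-envelope illustration (numbers are hypothetical, not a measurement), RPS is simply completed requests divided by elapsed time; a load-testing tool such as ab or wrk reports this figure directly:

```shell
# RPS = completed requests / elapsed seconds (illustrative values; a tool
# like `ab -n 10000 -c 100 URL` measures and reports this for you)
requests=10000
elapsed=8
awk -v r="$requests" -v t="$elapsed" 'BEGIN { printf "%.0f req/s\n", r / t }'
```

Sweeping the concurrency level upward while recording this number is how the saturation point described above is located in practice.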
2. Influencing Factors
Key factors include bandwidth, hardware configuration, system configuration, application server settings, program logic, and overall architecture. Bandwidth and hardware are decisive; the article focuses on maximizing load within given bandwidth and hardware.
2.1 Bandwidth
Bandwidth (Mbps) determines data transmission speed, analogous to pipe size.
2.2 Hardware Configuration
CPU frequency/cores, memory size and speed, and disk speed (SSD vs HDD) directly affect load capacity.
2.3 System Configuration (Linux)
File descriptor limits, process/thread limits, and TCP kernel parameters are critical. Example commands:
The system-wide file descriptor limit can be raised at runtime and persisted in /etc/sysctl.conf:

echo 1000000 > /proc/sys/fs/file-max    # runtime (as root)
fs.file-max = 1000000                   # persistent, in /etc/sysctl.conf

Per-process limits can be viewed and changed with ulimit -n (open files) and ulimit -u (processes). TCP connection counts by state can be inspected with:

netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'

Important TCP tuning parameters include:
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1   # caution: breaks connections from clients behind NAT; removed in Linux 4.12
net.ipv4.tcp_fin_timeout = 30
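Current values can be inspected without root before tuning (changing them requires root, via sysctl -w key=value or an entry in /etc/sysctl.conf applied with sysctl -p), and ss from iproute2, the modern replacement for netstat, produces the same per-state tally:

```shell
# read the current kernel settings (no root needed)
cat /proc/sys/net/ipv4/tcp_fin_timeout
cat /proc/sys/fs/file-max

# the same per-state connection count as the netstat pipeline above, using
# ss; column 1 of `ss -tan` is the TCP state, and NR>1 skips the header row
ss -tan | awk 'NR>1 {++s[$1]} END {for (k in s) print k, s[k]}'
```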
2.4 Application Server Configuration
Common models: multi‑process, prefork, worker, and master/worker (event‑driven). Nginx/Tengine uses the master/worker model, suitable for I/O‑bound workloads, while Apache/Tomcat use process or thread models for CPU‑bound tasks.
2.4.1 Nginx/Tengine
Key settings: match the worker count to the number of CPU cores, set an appropriate keepalive timeout, increase worker_rlimit_nofile, and enable HTTP/1.1 keepalive connections to upstreams.
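A minimal nginx.conf sketch of these settings (values are illustrative starting points, not tuned recommendations, and the upstream name is hypothetical):

```nginx
worker_processes     auto;            # one worker per CPU core
worker_rlimit_nofile 65535;           # per-worker open-file limit

events {
    worker_connections 10240;
}

http {
    keepalive_timeout 30s;

    upstream backend {                # hypothetical upstream name
        server 127.0.0.1:8080;
        keepalive 64;                 # idle keepalive connections to upstream
    }

    server {
        listen 80;
        location / {
            proxy_pass http://backend;
            proxy_http_version 1.1;   # required for upstream keepalive
            proxy_set_header Connection "";
        }
    }
}
```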
2.4.2 Tomcat
JVM options (-Xms, -Xmx, -Xmn, -Xss) and connector settings (protocol, connectionTimeout, maxThreads, minSpareThreads, acceptCount, maxConnections) must be tuned. Over-sizing the thread pool wastes CPU on context switching and memory on per-thread stacks.
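A hedged server.xml Connector sketch (all values are illustrative starting points to be adjusted against load tests, and the JVM would be started with flags along the lines of -Xms4g -Xmx4g -Xss256k):

```xml
<!-- illustrative values only; tune against measured load -->
<Connector port="8080"
           protocol="org.apache.coyote.http11.Http11NioProtocol"
           connectionTimeout="20000"
           maxThreads="500"
           minSpareThreads="50"
           acceptCount="200"
           maxConnections="10000" />
```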
2.5 System Architecture
Typical Java backend stack: LVS + Nginx + Tomcat + MySQL/DB cluster + Redis/Codis, with load balancing, synchronous/asynchronous processing, and the 80/20 principle (roughly 20% of features generate 80% of the traffic).
2.5.1 Load Balancing
Hardware load balancers (e.g., F5) vs software solutions. Software includes layer-4 (LVS) and layer-7 (Nginx) balancers; LVS supports NAT, DR (direct routing), and TUN (IP tunneling) modes.
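As one concrete software option, an LVS virtual service in DR mode is commonly managed through keepalived; a minimal keepalived.conf sketch with purely illustrative addresses:

```conf
# hypothetical addresses; DR mode also requires the VIP configured on a
# loopback alias of each real server with ARP suppressed
virtual_server 192.168.0.100 80 {
    delay_loop 6
    lb_algo rr          # round-robin scheduling
    lb_kind DR          # direct routing
    protocol TCP

    real_server 192.168.0.11 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
        }
    }

    real_server 192.168.0.12 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
        }
    }
}
```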
2.5.2 Sync vs Async
Deferring non-critical operations to asynchronous processing keeps them off the critical path and shortens response time for the caller.
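The idea can be sketched in shell terms (both tasks here are hypothetical stand-ins): the critical step runs synchronously, while the side effect is backgrounded so the response does not wait on it.

```shell
# keep the critical path synchronous; push the non-critical side task
# (hypothetical email notification) into the background
critical_work() { echo "order saved"; }    # must finish before responding
notify_user()   { echo "email queued"; }   # outcome not needed for the response
critical_work
notify_user &      # asynchronous: the response does not block on this
wait               # only so this demo script does not exit early
```

In a real Java backend the same split is typically done with a message queue or a worker thread pool rather than process backgrounding.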
2.5.3 The 80/20 Principle
Design effort should focus on the 20% of features that handle the majority of load.