How to Maximize System Load Capacity: Metrics, Bottlenecks, and Tuning Strategies
This article explains how to measure a system's load capacity, identifies key factors such as bandwidth, hardware, and OS settings, and provides practical optimization techniques for Linux, Nginx, Tomcat, and databases to achieve higher concurrency and better performance.
1. Measurement Metrics
The primary metric for load capacity is Requests per Second (RPS), counting only requests that receive a successful response. As the number of concurrent users grows, RPS rises until it reaches a tipping point; beyond that point, adding users makes RPS fall and response time climb. The RPS at the tipping point is the system's maximum load.
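The tipping point can be read off load-test results by finding the concurrency level with the highest successful RPS. The following is a minimal sketch; the function name find_max_load and the sample figures are hypothetical and are not measurements from the original article.

```python
from typing import List, Tuple


def find_max_load(samples: List[Tuple[int, float]]) -> Tuple[int, float]:
    """Return the (concurrent_users, successful_rps) pair with the highest RPS.

    samples: one pair per load-test run, in any order.
    """
    if not samples:
        raise ValueError("need at least one sample")
    return max(samples, key=lambda s: s[1])


# Hypothetical load-test results: RPS climbs, peaks, then degrades.
samples = [(50, 900.0), (100, 1700.0), (200, 2400.0), (400, 2100.0), (800, 1500.0)]
users, rps = find_max_load(samples)
print(f"peak at {users} concurrent users: {rps:.0f} RPS")
```

In practice the runs would come from a load-testing tool, and response time should be checked alongside RPS, since a peak reached with unacceptable latency is not a usable capacity figure.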
2. Related Factors
Key factors influencing concurrency include bandwidth, hardware configuration, system configuration, application server configuration, and program logic. Bandwidth and hardware set the upper bound, while the other factors determine how close to that bound the system can operate.
2.1 Bandwidth
Bandwidth, measured in Mbps, determines the maximum data transmission speed, analogous to the width of a water pipe.
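As a rough worked example, the bandwidth ceiling on RPS is the link speed in bytes per second divided by the average response size. The sketch below assumes 1 Mbps = 1,000,000 bits per second and ignores protocol overhead and request traffic; the function name and the example numbers are illustrative only.

```python
def bandwidth_bound_rps(bandwidth_mbps: float, avg_response_bytes: int) -> float:
    """Upper bound on RPS imposed by link bandwidth alone."""
    if avg_response_bytes <= 0:
        raise ValueError("avg_response_bytes must be positive")
    bytes_per_second = bandwidth_mbps * 1_000_000 / 8  # bits -> bytes
    return bytes_per_second / avg_response_bytes


# A 100 Mbps link serving responses that average 50,000 bytes.
print(bandwidth_bound_rps(100, 50_000))  # 250.0
```

No amount of server tuning can push sustained RPS above this figure, which is why bandwidth is checked before anything else.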
2.2 Hardware Configuration
Critical hardware parameters are CPU frequency/cores, memory size and speed, and disk speed (SSD vs. HDD). Upgrading these components directly raises the load ceiling.
2.3 System Configuration
On Linux, several kernel and limit settings affect load capacity:
File descriptor limits: /proc/sys/fs/file-max and per‑process limits via ulimit -n or /etc/security/limits.conf.
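Process/thread limits: ulimit -u for the per-user process limit (on Linux this count also includes threads), /proc/sys/kernel/threads-max for the system-wide thread limit, and PTHREAD_THREADS_MAX for a per-process thread cap where the threading library defines one.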
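TCP kernel parameters: set in /etc/sysctl.conf to improve connection handling. net.ipv4.tcp_tw_reuse and net.ipv4.tcp_fin_timeout reduce the buildup of connections in TIME_WAIT and FIN_WAIT_2, while net.ipv4.tcp_syncookies protects the listen queue against SYN floods. net.ipv4.tcp_tw_recycle appears in older tuning guides but breaks clients behind NAT and was removed in Linux 4.12, so it should not be used on current kernels.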
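Before changing any of these settings it helps to see what the current values are. The sketch below reads the limits of the current process with Python's resource module and a few kernel values from /proc. It assumes a Linux host; on other systems the /proc entries are simply reported as None. The helper names are hypothetical.

```python
import resource
from pathlib import Path
from typing import Dict, Optional


def read_proc_int(path: str) -> Optional[int]:
    """Read the first integer from a /proc file, or None if it is unavailable."""
    try:
        return int(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def current_limits() -> Dict[str, Optional[int]]:
    """Collect per-process limits and a few system-wide kernel settings."""
    nofile_soft, nofile_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    nproc_soft, nproc_hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        # Same values that `ulimit -n` and `ulimit -u` report;
        # resource.RLIM_INFINITY means "unlimited".
        "nofile_soft": nofile_soft,
        "nofile_hard": nofile_hard,
        "nproc_soft": nproc_soft,
        "nproc_hard": nproc_hard,
        # System-wide values (Linux only).
        "file_max": read_proc_int("/proc/sys/fs/file-max"),
        "threads_max": read_proc_int("/proc/sys/kernel/threads-max"),
        "tcp_fin_timeout": read_proc_int("/proc/sys/net/ipv4/tcp_fin_timeout"),
        "tcp_tw_reuse": read_proc_int("/proc/sys/net/ipv4/tcp_tw_reuse"),
    }


for name, value in current_limits().items():
    print(f"{name}: {value}")
```

The values shown are those of the Python process itself; a server started by an init system may run with different limits, so the check is best made in the same environment as the server.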
2.4 Application Server Configuration
Application servers use different concurrency models:
Multi‑process (one process per request)
Prefork (process pool)
Worker (one thread per request)
Master/worker (event‑driven, non‑blocking I/O, used by Nginx)
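To illustrate the event-driven, non-blocking model, the sketch below serves many connections from a single thread using Python's asyncio. It is an analogy for how one event loop multiplexes connections, not a description of Nginx internals; the echo protocol, the function names, and the client count are assumptions made for the example.

```python
import asyncio
from typing import List


async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Serve one connection; every await yields control back to the event loop."""
    data = await reader.readline()
    writer.write(b"echo: " + data)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def client(port: int, i: int) -> bytes:
    """Send one line to the server and return the reply."""
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(f"req {i}\n".encode())
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    return reply


async def demo(n_clients: int = 50) -> List[bytes]:
    """Start a server on a free local port and hit it with concurrent clients."""
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        return await asyncio.gather(*(client(port, i) for i in range(n_clients)))


replies = asyncio.run(demo())
print(len(replies), replies[0])
```

All 50 connections are handled without creating a process or thread per request, which is why this model keeps memory use and context switching low at high concurrency.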
2.4.1 Nginx/Tengine
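Key settings include matching worker_processes to the number of CPU cores, choosing an appropriate keepalive_timeout, raising worker_rlimit_nofile so that workers do not run out of file descriptors, and enabling HTTP/1.1 keep-alive for upstream connections (proxy_http_version 1.1 with a cleared Connection header and a keepalive pool in the upstream block).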
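These settings interact: each worker can hold at most worker_connections connections, that count includes connections to upstream servers, and it cannot exceed the worker's open-file limit. A rough capacity estimate can be sketched as follows. The function name and the example values are hypothetical, and the estimate ignores descriptors used for log files, listening sockets, and cached files.

```python
def nginx_client_capacity(
    worker_processes: int,
    worker_connections: int,
    worker_rlimit_nofile: int,
    proxying: bool = True,
) -> int:
    """Rough upper bound on simultaneous clients for an Nginx instance."""
    # A worker cannot hold more connections than it has file descriptors.
    per_worker = min(worker_connections, worker_rlimit_nofile)
    # When proxying, each client typically uses two connections:
    # one from the client and one to the upstream server.
    conns_per_client = 2 if proxying else 1
    return worker_processes * per_worker // conns_per_client


# 4 workers, 10240 connections each, 65535 descriptors per worker.
print(nginx_client_capacity(4, 10240, 65535))                  # 20480
print(nginx_client_capacity(4, 10240, 65535, proxying=False))  # 40960
```

If worker_rlimit_nofile is lower than worker_connections, the descriptor limit becomes the effective cap, which is the reason for raising it.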
2.4.2 Tomcat
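Tomcat tuning involves JVM options (initial and maximum heap size -Xms and -Xmx, young generation size -Xmn, thread stack size -Xss) and connector parameters such as protocol, connectionTimeout, maxThreads, minSpareThreads, acceptCount, and maxConnections. The original recommendation is to prefer the apr protocol; recent Tomcat releases have dropped the APR connector, and NIO is the usual choice there. A single Tomcat instance with a very large heap can suffer long GC pauses, whereas a cluster of smaller instances scales better and tolerates the failure of any one node.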
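One way to pick a starting value for maxThreads is Little's law: the average number of requests in flight equals throughput multiplied by average response time. The sketch below applies that rule and also estimates the stack memory the threads would reserve under a given -Xss. It is a back-of-the-envelope aid with hypothetical function names and example numbers, not a Tomcat formula; real values should be confirmed by load testing.

```python
import math


def threads_needed(target_rps: int, avg_response_ms: int) -> int:
    """Little's law: requests in flight = throughput x response time."""
    return math.ceil(target_rps * avg_response_ms / 1000)


def thread_stack_mb(max_threads: int, xss_kb: int) -> float:
    """Stack memory reserved if every thread uses the full -Xss size."""
    return max_threads * xss_kb / 1024


threads = threads_needed(target_rps=1500, avg_response_ms=250)
print(threads)                        # 375
print(thread_stack_mb(threads, 512))  # 187.5
```

Stack memory sits outside the Java heap, so it has to be budgeted in addition to -Xmx when sizing an instance.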
2.4.3 Database
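MySQL is a common choice of relational database, but it can become the bottleneck under high load. Strategies for relieving it include vertical and horizontal sharding, placing Redis in front of it as a cache layer, and separating reads from writes with master/slave replication, so that writes go to the master and reads are spread across the slaves.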
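The cache layer is often implemented with the cache-aside pattern: read from the cache first, fall back to the database on a miss, and invalidate the cached entry on writes. The sketch below shows the pattern with a plain dictionary standing in for Redis and a lookup function standing in for a MySQL query; the class and its names are hypothetical and use no real client library.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """Cache-aside reads: a dict stands in for Redis, a callable for MySQL."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]) -> None:
        self._cache: Dict[str, str] = {}      # stand-in for Redis
        self._load_from_db = load_from_db     # stand-in for a MySQL query
        self.db_reads = 0                     # counts how often the database is hit

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:                # cache hit: no database work
            return self._cache[key]
        self.db_reads += 1                    # cache miss: query the database
        value = self._load_from_db(key)
        if value is not None:
            self._cache[key] = value          # populate the cache for next time
        return value

    def invalidate(self, key: str) -> None:
        """Call after a write so that stale data is not served."""
        self._cache.pop(key, None)


fake_db = {"user:1": "alice"}
store = CacheAside(fake_db.get)
store.get("user:1")
store.get("user:1")
print(store.db_reads)  # 1
```

A production version would also set an expiry on cached entries and decide how to handle missing keys, both of which are left out here for brevity.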
3. Typical Architecture
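A common web stack is LVS + Nginx + Tomcat + MySQL + Redis: LVS balances traffic at the transport layer, Nginx acts as the reverse proxy, Tomcat runs the application, MySQL stores the data, and Redis serves as the cache.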
21CTO
