Mastering System Load: Metrics, Bottlenecks, and Optimization Strategies

This article explains how to measure a system's load capacity, identifies key factors such as bandwidth, hardware, OS and application configurations, and provides practical optimization techniques for Linux, Tomcat, Nginx, MySQL, and Redis to handle high concurrency efficiently.

21CTO

In the Internet era, high concurrency is a classic challenge; the number of requests a web site or app can handle at peak is a key performance indicator, as demonstrated by Alibaba’s Double‑11 traffic.

1. Measurement Metrics

The common metric is requests per second (RPS), counting only requests that receive a successful response. As the number of concurrent users increases, RPS rises until the server saturates; beyond that point RPS falls and response time grows. The peak RPS marks the system’s maximum load capacity.
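To make the saturation point concrete, here is a minimal Python sketch that picks the peak out of a series of load-test runs. The sample numbers and the function names (`rps`, `find_saturation_point`) are hypothetical, invented for illustration; they are not measurements from any real system.

```python
from typing import List, Tuple

# Each sample: (concurrent_users, successful_requests, elapsed_seconds).
Sample = Tuple[int, int, float]


def rps(successful_requests: int, elapsed_seconds: float) -> float:
    """Requests per second, counting only successful responses."""
    return successful_requests / elapsed_seconds


def find_saturation_point(samples: List[Sample]) -> Tuple[int, float]:
    """Return (concurrency, RPS) for the run with the highest RPS."""
    best_users, best_rps = 0, 0.0
    for users, ok, seconds in samples:
        current = rps(ok, seconds)
        if current > best_rps:
            best_users, best_rps = users, current
    return best_users, best_rps


# Hypothetical results: RPS climbs, peaks, then falls past saturation.
samples = [
    (50, 30_000, 60.0),
    (100, 57_000, 60.0),
    (200, 90_000, 60.0),
    (400, 84_000, 60.0),
]

peak_users, peak_rps = find_saturation_point(samples)
print(f"peak of {peak_rps:.0f} RPS at {peak_users} concurrent users")
```

In practice the response-time curve should be read alongside RPS, since a run can reach a high RPS only by letting latency grow beyond what users tolerate.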

2. Related Factors

Bandwidth

Hardware configuration

System configuration

Application server configuration

Program logic

System architecture

2.1 Bandwidth

Bandwidth (Mbps) determines the data transmission speed and directly limits the system’s load capacity, similar to the diameter of a water pipe.
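As a rough worked example of that limit, the sketch below converts a link speed and an average response size into an upper bound on responses per second. The 100 Mbps link and 50 KB response are assumed figures, and protocol overhead is ignored, so a real ceiling would be lower.

```python
def max_rps_for_bandwidth(bandwidth_mbps: float, response_kb: float) -> float:
    """Upper bound on responses per second that a link can carry.

    bandwidth_mbps is in megabits per second (1 Mbps = 1,000,000 bits/s).
    response_kb is the average response size in kilobytes (1 KB = 1024 bytes).
    TCP/IP headers, TLS, and retransmissions are not accounted for.
    """
    bytes_per_second = bandwidth_mbps * 1_000_000 / 8
    return bytes_per_second / (response_kb * 1024)


# Assumed figures: a 100 Mbps uplink serving 50 KB responses.
ceiling = max_rps_for_bandwidth(100, 50)
print(f"at most {ceiling:.0f} responses per second")
```

If the measured RPS plateaus near this figure while CPU and memory are idle, the pipe rather than the server is likely the bottleneck.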

2.2 Hardware Configuration

Key hardware components include:

CPU frequency and core count – affect processing speed and thread scheduling.

Memory size and speed – larger and faster memory improves data handling.

Disk speed – SSDs provide much faster I/O than traditional HDDs.

2.3 System Configuration

Linux‑based back‑ends typically consider:

File descriptor limits – total system limit (/proc/sys/fs/file-max) and per‑process limit (ulimit -n or nofile in /etc/security/limits.conf).

Process/thread limits – max processes (ulimit -u, nproc) and max threads (kernel.threads-max, PTHREAD_THREADS_MAX).

TCP kernel parameters – tuning /etc/sysctl.conf to improve connection handling.

2.3.1 File Descriptor Limits

# temporary (lost on reboot)
echo 1000000 > /proc/sys/fs/file-max
# permanent: add to /etc/sysctl.conf
fs.file-max = 1000000

When adjusting file descriptor limits, ensure:

Total open descriptors ≤ /proc/sys/fs/file-max.

Descriptors open in any single process ≤ the nofile soft limit.

Soft limit ≤ hard limit, and hard limit ≤ /proc/sys/fs/nr_open.
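The ordering rules above can be checked mechanically. The following is a minimal sketch, assuming a Linux host; the function names and the sample values are hypothetical. `load_current_values` reads the live settings through the standard `resource` module and `/proc`, while `check_fd_limits` is a pure function so it can be exercised with made-up numbers.

```python
import os
import resource
from pathlib import Path


def read_first_int(path: str) -> int:
    """Read the first integer from a /proc file."""
    return int(Path(path).read_text().split()[0])


def check_fd_limits(open_total, file_max, open_in_process, soft, hard, nr_open):
    """Return a list of violated rules; an empty list means consistent."""
    checks = [
        (open_total <= file_max, "total open descriptors exceed fs.file-max"),
        (open_in_process <= soft, "process descriptors exceed nofile soft limit"),
        (soft <= hard, "nofile soft limit exceeds hard limit"),
        (hard <= nr_open, "nofile hard limit exceeds fs.nr_open"),
    ]
    return [message for ok, message in checks if not ok]


def load_current_values() -> dict:
    """Collect live values on Linux (these /proc paths do not exist elsewhere)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return {
        # First field of file-nr is the number of allocated file handles.
        "open_total": read_first_int("/proc/sys/fs/file-nr"),
        "file_max": read_first_int("/proc/sys/fs/file-max"),
        "open_in_process": len(os.listdir("/proc/self/fd")),
        "soft": soft,
        "hard": hard,
        "nr_open": read_first_int("/proc/sys/fs/nr_open"),
    }


# Hypothetical values: the hard limit was raised above fs.nr_open.
problems = check_fd_limits(
    open_total=12_000, file_max=1_000_000, open_in_process=900,
    soft=65_535, hard=2_000_000, nr_open=1_048_576,
)
print(problems)
```

On a real host, `check_fd_limits(**load_current_values())` runs the same checks against the current settings.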

2.3.2 Process/Thread Limits

Maximum processes per user: ulimit -u or nproc in limits.conf.

Maximum threads: view with cat /proc/sys/kernel/threads-max; per‑process limit tied to PTHREAD_THREADS_MAX.

Thread stack size (ulimit -s) influences how many threads can be created: the smaller each stack, the more threads fit in the same amount of memory.
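A back-of-the-envelope calculation shows the effect of stack size. This is only a sketch: the 4096 MB budget is an assumed figure, and because stacks are reserved as virtual memory and committed lazily, the result is a rough ceiling rather than a guarantee.

```python
def max_threads_by_stack(stack_budget_mb: int, stack_size_kb: int) -> int:
    """Estimate how many thread stacks fit in a given memory budget.

    stack_budget_mb: memory set aside for thread stacks, in MB (assumed).
    stack_size_kb:   per-thread stack size in KB, as reported by `ulimit -s`.
    """
    return (stack_budget_mb * 1024) // stack_size_kb


# Assumed budget of 4096 MB, compared across two stack sizes.
print(max_threads_by_stack(4096, 8192))  # 8 MB stacks
print(max_threads_by_stack(4096, 1024))  # 1 MB stacks
```

Shrinking the stack from 8 MB to 1 MB raises the estimate eightfold, which is why stack size is worth checking before raising thread limits.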

2.3.3 TCP Kernel Parameters

Typical tuning includes:

net.ipv4.tcp_syncookies = 1 – enable SYN cookies.

net.ipv4.tcp_tw_reuse = 1 – allow reuse of TIME‑WAIT sockets for new outbound connections.

net.ipv4.tcp_tw_recycle = 1 – fast recycling of TIME‑WAIT sockets. Use with caution behind NAT/LVS; the option was removed in Linux 4.12, so it does not apply to current kernels.

net.ipv4.tcp_fin_timeout = 30 – shorten the FIN timeout.

Additional adjustments for port range and connection limits:

net.ipv4.tcp_keepalive_time = 1200
net.ipv4.ip_local_port_range = 10000 65000
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_max_tw_buckets = 5000
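To see what the port range buys, the sketch below counts the local ports in `ip_local_port_range` and estimates how fast new outbound connections to a single destination can be opened if every closed connection lingers in TIME‑WAIT. The 60-second TIME‑WAIT period is an assumption made for the arithmetic, and the function names are hypothetical.

```python
def parse_port_range(value: str) -> tuple:
    """Parse a sysctl-style 'low high' pair into two integers."""
    low, high = (int(part) for part in value.split())
    return low, high


def ephemeral_port_count(low: int, high: int) -> int:
    """Number of local ports available for outbound connections (inclusive)."""
    return high - low + 1


def max_new_connections_per_second(port_count: int, time_wait_s: float = 60.0) -> float:
    """Sustained connection rate to one destination when no port is reused
    until its TIME-WAIT period (assumed 60 s) has expired."""
    return port_count / time_wait_s


low, high = parse_port_range("10000 65000")
ports = ephemeral_port_count(low, high)
rate = max_new_connections_per_second(ports)
print(ports)  # 55001
print(f"about {rate:.0f} new connections per second to one backend")
```

This bound matters mostly for proxies that open short-lived connections to a few backends; keepalive connections and tcp_tw_reuse both relax it.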

2.4 Application Server Configuration

Common concurrency models:

Multi‑process – one process per request.

Prefork – pre‑forked process pool.

Worker – one thread per request.

Master/Worker – non‑blocking I/O with event‑driven workers (used by Nginx).
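The event-driven model in the last item can be illustrated with Python's `asyncio`. This is a sketch of the general idea only, not of how Nginx is implemented: the "requests" are simulated with a sleep standing in for network I/O, and the request count and delay are arbitrary.

```python
import asyncio
import time


async def handle_request(request_id: int, io_delay: float) -> int:
    """Simulate a request that spends its time waiting on I/O."""
    # Awaiting yields control to the event loop instead of blocking the thread.
    await asyncio.sleep(io_delay)
    return request_id


async def serve(n_requests: int, io_delay: float) -> list:
    """Run all requests concurrently on one thread via the event loop."""
    tasks = [handle_request(i, io_delay) for i in range(n_requests)]
    return await asyncio.gather(*tasks)


start = time.perf_counter()
results = asyncio.run(serve(200, 0.05))
elapsed = time.perf_counter() - start
print(f"{len(results)} requests in {elapsed:.2f}s on a single thread")
```

With one thread per request, the same workload would need 200 threads; here a single thread interleaves all of them because none is ever blocked while waiting.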

2.4.1 Nginx/Tengine

Key tuning points:

Set worker processes to match CPU cores.

Adjust keepalive timeout appropriately.

Increase worker_rlimit_nofile for more file descriptors.

Enable HTTP/1.1 keepalive in upstream.
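As an illustration of those four points, the sketch below renders an Nginx configuration fragment from Python. The directives are standard Nginx ones, but every value (descriptor limit, timeouts, `worker_connections`, the upstream name `app_backend`, and the backend address) is an assumed placeholder to adapt, not a recommendation from the article.

```python
import os


def render_nginx_tuning(cpu_cores: int, nofile: int, worker_connections: int,
                        keepalive_timeout_s: int, upstream_keepalive: int,
                        backend: str) -> str:
    """Build an nginx.conf fragment covering the tuning points above."""
    return f"""worker_processes {cpu_cores};
worker_rlimit_nofile {nofile};

events {{
    worker_connections {worker_connections};
}}

http {{
    keepalive_timeout {keepalive_timeout_s};

    upstream app_backend {{
        server {backend};
        keepalive {upstream_keepalive};
    }}

    server {{
        listen 80;
        location / {{
            proxy_pass http://app_backend;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }}
    }}
}}
"""


config = render_nginx_tuning(
    cpu_cores=os.cpu_count() or 1,
    nofile=65535,
    worker_connections=10240,
    keepalive_timeout_s=60,
    upstream_keepalive=32,
    backend="127.0.0.1:8080",
)
print(config)
```

Upstream keepalive only takes effect when the proxied request uses HTTP/1.1 and the `Connection` header is cleared, which is why those two `proxy_` lines accompany the `keepalive` directive.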

2.4.2 Tomcat

Two main configuration areas:

JVM parameters – -Xms, -Xmx, -Xmn, -XX:PermSize, -XX:MaxPermSize (Java 7 and earlier; Java 8 replaced the permanent generation with Metaspace), -Xss or -XX:ThreadStackSize.

Connector parameters – protocol (bio/nio/apr, prefer apr), connectionTimeout, maxThreads, minSpareThreads, acceptCount, maxConnections (for nio/apr).

Increasing threads or connections indiscriminately can degrade performance due to CPU scheduling overhead and memory consumption; a balanced setting is essential.
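One way to find a starting point for maxThreads, rather than guessing, is Little's law: the number of requests in flight equals the arrival rate multiplied by the time each request spends in the system. The sketch below applies it; the target load, response time, and the configured value of 1000 threads are assumed figures for illustration.

```python
import math


def threads_needed(target_rps: float, avg_response_ms: float) -> int:
    """Little's law: requests in flight = arrival rate x time in system."""
    return math.ceil(target_rps * avg_response_ms / 1000)


def headroom_factor(configured_threads: int, needed: int) -> float:
    """How many times larger the configured pool is than the estimate."""
    return configured_threads / needed


# Assumed workload: 2000 requests/s with a 100 ms average response time.
needed = threads_needed(2000, 100)
print(needed)                          # 200
print(headroom_factor(1000, needed))   # 5.0
```

A pool several times larger than the estimate adds scheduling and memory cost without adding throughput, so the estimate is better treated as a baseline to refine with load tests.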

2.5 System Architecture

2.5.1 Load Balancing

Two categories:

Hardware load balancers (e.g., F5) – high performance but costly.

Software load balancers – layer‑4 (LVS) and layer‑7 (Nginx, HAProxy). LVS modes: NAT, DR, IP‑TUNNEL.
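For readers new to the idea, the core job of any of these balancers is to spread incoming requests over a pool of backends. The sketch below shows the simplest policy, round robin, purely as an illustration; it is not how LVS, Nginx, or HAProxy is implemented, and the class name and backend addresses are hypothetical.

```python
import itertools
from typing import List


class RoundRobinBalancer:
    """Hand out backends in a fixed rotation."""

    def __init__(self, backends: List[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._rotation = itertools.cycle(backends)

    def pick(self) -> str:
        """Return the backend that should receive the next request."""
        return next(self._rotation)


backends = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
balancer = RoundRobinBalancer(backends)
assignments = [balancer.pick() for _ in range(6)]
print(assignments)
```

Production balancers layer health checks, weights, and session affinity on top of a rotation like this.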

2.5.2 Sync vs Async

Choosing synchronous or asynchronous processing can trade off response time against consistency; asynchronous handling of non‑critical steps often improves throughput.
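A minimal sketch of that trade-off, using only the standard library: the critical step runs inline so the caller gets a consistent answer, while a non-critical step is pushed onto a queue and handled by a background thread. The order/notification scenario and all names are hypothetical.

```python
import queue
import threading

notifications = []        # stands in for a slow, non-critical side effect
pending = queue.Queue()   # work deferred out of the request path


def background_worker() -> None:
    """Drain the queue until a None sentinel arrives."""
    while True:
        order_id = pending.get()
        if order_id is None:
            pending.task_done()
            break
        notifications.append(f"notified user for order {order_id}")
        pending.task_done()


def place_order(order_id: int) -> str:
    # Critical step: done synchronously before responding.
    confirmation = f"order {order_id} accepted"
    # Non-critical step: queued, so the response is not delayed by it.
    pending.put(order_id)
    return confirmation


worker = threading.Thread(target=background_worker, daemon=True)
worker.start()

responses = [place_order(i) for i in range(3)]

pending.join()     # demo only: wait so the side effects can be inspected
pending.put(None)  # tell the worker to stop
worker.join()

print(responses)
print(notifications)
```

The cost is that a notification may lag behind its order, or be lost if the process dies first, which is the consistency side of the trade-off; a durable message queue is the usual remedy.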

2.5.3 The 80/20 Principle

Roughly 20% of features generate 80% of traffic; focus optimization on the high‑traffic 20% while keeping the rest simple.
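Finding the high-traffic 20% is a small calculation once per-endpoint request counts are available, for example from access logs. The sketch below sorts endpoints by traffic and returns the smallest set that covers a given share; the endpoint names and counts are made up for illustration.

```python
from typing import Dict, List


def hot_endpoints(traffic: Dict[str, int], share: float = 0.8) -> List[str]:
    """Return the busiest endpoints that together cover `share` of all hits."""
    total = sum(traffic.values())
    covered = 0
    hot = []
    ranked = sorted(traffic.items(), key=lambda item: item[1], reverse=True)
    for endpoint, hits in ranked:
        if covered >= share * total:
            break
        hot.append(endpoint)
        covered += hits
    return hot


# Hypothetical request counts per endpoint.
traffic = {
    "/home": 5000, "/search": 3200, "/cart": 600, "/profile": 400,
    "/settings": 300, "/help": 200, "/about": 150, "/terms": 100,
    "/careers": 30, "/press": 20,
}

hot = hot_endpoints(traffic)
print(hot)
print(f"{len(hot)} of {len(traffic)} endpoints carry at least 80% of traffic")
```

In this made-up data, two of ten endpoints account for 82% of requests, so caching or tuning those two would pay off first.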

3. Typical Architecture

A common Java backend stack looks like:

Backend architecture diagram

The dashed part represents the database layer, usually a master‑slave setup, which can be replaced by Redis clusters, MySQL clusters (Cobar, etc.), or other distributed solutions.

Author: 飒然Hang, architect and backend engineer at 中华万年历. Source: http://superhj1987.github.com