
Understanding System Load Capacity and Optimization Techniques for High-Concurrency Backend Applications

This article explains how to measure system load capacity using requests per second, identifies key factors such as bandwidth, hardware, Linux limits, and application server configurations, and provides practical optimization tips for Nginx, Tomcat, MySQL, Redis, and load‑balancing architectures.


In the Internet era, high concurrency is a common concern; the ability to handle peak requests is a key performance metric for web sites and apps.

1. Measurement Metrics

The primary metric is Requests per Second (RPS), representing the number of successful responses per second. As concurrent users increase, RPS rises until a saturation point where further concurrency reduces RPS and increases latency; this point indicates the system’s maximum load capacity.
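As a concrete sketch (the numbers are illustrative, not from the article): a load generator such as ApacheBench or wrk reports RPS as completed requests divided by elapsed wall-clock time:

```shell
# RPS = successful requests / elapsed seconds (figures made up for illustration)
requests=10000
elapsed=12.5
rps=$(awk -v r="$requests" -v t="$elapsed" 'BEGIN { printf "%.0f", r / t }')
echo "$rps req/s"   # 800 req/s
```

Plotting RPS against concurrency level and looking for the knee where RPS flattens or drops is the usual way to locate the saturation point described above.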

2. Influencing Factors

Key factors include bandwidth, hardware configuration, system configuration, application server settings, program logic, and overall architecture. Bandwidth and hardware are decisive; the article focuses on maximizing load within given bandwidth and hardware.

2.1 Bandwidth

Bandwidth (Mbps) determines data transmission speed, analogous to pipe size.
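A back-of-envelope example (the link speed and response size are assumptions, not from the article): bandwidth alone places a hard ceiling on throughput.

```shell
# assume a 100 Mbps link and an average 50 KB response (illustrative values)
# 100 Mbps / 8 = 12.5 MB/s of payload, ignoring TCP/HTTP overhead
max_rps=$(awk 'BEGIN { printf "%.0f", (100 / 8 * 1000 * 1000) / (50 * 1000) }')
echo "bandwidth ceiling: $max_rps req/s"   # 250
```

No amount of application tuning pushes RPS past this ceiling; only more bandwidth or smaller responses do.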

2.2 Hardware Configuration

CPU frequency/cores, memory size and speed, and disk speed (SSD vs HDD) directly affect load capacity.
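These factors can be inspected with standard tools (a quick sketch; exact commands vary by distribution):

```shell
# count CPU cores; getconf is a portable fallback where nproc is absent
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
echo "CPU cores: $cores"
# total RAM on Linux (the file does not exist elsewhere, hence the guard)
grep -m1 MemTotal /proc/meminfo 2>/dev/null || true
```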

2.3 System Configuration (Linux)

File descriptor limits, process/thread limits, and TCP kernel parameters are critical. For example, the system-wide open-file ceiling can be raised at runtime through procfs, with the equivalent entry in /etc/sysctl.conf to persist it across reboots:

echo 1000000 > /proc/sys/fs/file-max        # takes effect immediately
fs.file-max = 1000000                        # persistent /etc/sysctl.conf entry

Per-process limits can be viewed and changed with ulimit -n (open files) and ulimit -u (processes). TCP connections can be counted by state with:
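Note that ulimit changes apply only to the current shell session; to persist per-user limits across logins they are commonly placed in /etc/security/limits.conf (a sketch; the values are illustrative, not recommendations):

```
# /etc/security/limits.conf
*   soft   nofile   1000000    # open files
*   hard   nofile   1000000
*   soft   nproc    65535      # processes/threads
*   hard   nproc    65535
```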

netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
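netstat is deprecated on many modern distributions; the same per-state count works with ss (in practice you would pipe `ss -tan` into the awk stage). Below the counting logic is driven by a canned sample so its behavior is visible:

```shell
# same counting idea as the netstat one-liner, against fixed ss-style output
sample='State Recv-Q Send-Q
ESTAB 0 0
ESTAB 0 0
TIME-WAIT 0 0'
counts=$(printf '%s\n' "$sample" |
  awk 'NR > 1 { ++s[$1] } END { for (k in s) print k, s[k] }' | sort)
echo "$counts"
```

A sudden pile-up of TIME-WAIT or SYN-RECV entries in this count is often the first visible symptom of the tuning problems addressed below.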

Important TCP tuning parameters include:

net.ipv4.tcp_syncookies = 1     # SYN cookies, mitigates SYN-flood attacks
net.ipv4.tcp_tw_reuse = 1       # reuse TIME-WAIT sockets for new outbound connections
net.ipv4.tcp_fin_timeout = 30   # shorten the FIN-WAIT-2 timeout

net.ipv4.tcp_tw_recycle = 1 is often recommended alongside these, but it breaks connections from clients behind NAT and was removed entirely in Linux 4.12; avoid it on modern kernels.
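Current values can be read from procfs before changing anything (a small sketch; read_sysctl is a helper defined here, not a standard command):

```shell
# read the current value of an ipv4 tunable, or "n/a" off Linux
read_sysctl() { cat "/proc/sys/net/ipv4/$1" 2>/dev/null || echo "n/a"; }
echo "tcp_syncookies  = $(read_sysctl tcp_syncookies)"
echo "tcp_tw_reuse    = $(read_sysctl tcp_tw_reuse)"
echo "tcp_fin_timeout = $(read_sysctl tcp_fin_timeout)"
```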

2.4 Application Server Configuration

Common models: multi‑process, prefork, worker, and master/worker (event‑driven). Nginx/Tengine uses the event‑driven master/worker model, well suited to I/O‑bound workloads; Apache and Tomcat use process‑ or thread‑per‑connection models, which fit CPU‑bound work better.

2.4.1 Nginx/Tengine

Key settings: match the worker process count to the number of CPU cores, set a sensible keepalive_timeout, raise worker_rlimit_nofile, and enable HTTP/1.1 keepalive to upstreams.
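Put together, these settings might look like the following nginx.conf fragment (values and the upstream address are illustrative, not recommendations):

```nginx
worker_processes      auto;       # one worker per CPU core
worker_rlimit_nofile  100000;     # open-file limit per worker

events {
    worker_connections 10240;
}

http {
    keepalive_timeout 30s;

    upstream backend {
        server 127.0.0.1:8080;    # example upstream
        keepalive 64;             # idle keepalive connections to upstream
    }

    server {
        listen 80;
        location / {
            proxy_http_version 1.1;           # needed for upstream keepalive
            proxy_set_header Connection "";   # drop "Connection: close"
            proxy_pass http://backend;
        }
    }
}
```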

2.4.2 Tomcat

JVM options (-Xms, -Xmx, -Xmn, -Xss) and connector settings (protocol, connectionTimeout, maxThreads, minSpareThreads, acceptCount, maxConnections) must be tuned. Over‑provisioning threads wastes CPU and memory without increasing throughput.
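A hedged sketch of what these knobs look like in practice (all sizes and counts are illustrative, not recommendations):

```
# bin/setenv.sh — JVM sizing
JAVA_OPTS="-Xms2g -Xmx2g -Xmn768m -Xss256k"
```

```xml
<!-- conf/server.xml -->
<Connector port="8080"
           protocol="org.apache.coyote.http11.Http11NioProtocol"
           connectionTimeout="20000"
           maxThreads="400"
           minSpareThreads="50"
           acceptCount="200"
           maxConnections="10000" />
```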

2.5 System Architecture

Typical Java backend stack: LVS + Nginx + Tomcat + MySQL/DB cluster + Redis/Codis, combined with load balancing, synchronous/asynchronous processing, and the “28 principle” (the 80/20 rule: roughly 20% of features generate 80% of the traffic).

2.5.1 Load Balancing

Hardware balancers (e.g., F5) versus software solutions. Software includes layer‑4 (LVS) and layer‑7 (Nginx) balancers; LVS supports NAT, DR (direct routing), and TUN (IP tunneling) modes.
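As a layer‑7 sketch, an Nginx upstream block balancing two Tomcat instances (the addresses and weights are hypothetical):

```nginx
upstream tomcat_pool {
    least_conn;                      # or ip_hash; default is round-robin
    server 10.0.0.11:8080 weight=2;  # hypothetical backends
    server 10.0.0.12:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://tomcat_pool;
    }
}
```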

2.5.2 Sync vs Async

Choosing synchronous or asynchronous processing can improve response time for non‑critical operations.
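A toy shell sketch of the idea (the function name and the audit step are invented for illustration): the response is produced on the critical path while non‑critical work runs in the background.

```shell
# reply immediately; defer the non-critical audit write to a background job
logfile=$(mktemp)
handle_request() {
  echo "response sent"                        # critical path
  { echo "audit entry" >> "$logfile"; } &     # deferred work
}
handle_request
wait                                          # only so the sketch can inspect the log
cat "$logfile"
```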

2.5.3 28 Principle

Design effort should focus on the 20% of features that handle the majority of load.

Tags: Backend · Performance Tuning · Linux · MySQL · Nginx · Tomcat · System Load
Written by Architecture Digest — focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.