How to Tame High Concurrency: Cutting Web Server Memory and CPU Usage
This article explains why modern web systems face exploding concurrent connections, how richer page interactions and higher browser limits increase server load, and presents front‑end caching, request merging, Apache/Nginx memory‑saving modes, sendfile, and epoll techniques to reduce both memory and CPU consumption.
1. Increasing number of concurrent connections
Web systems now face rapidly growing numbers of concurrent connections, and high concurrency has become the norm. Simply adding servers or upgrading hardware is costly, so technical optimizations that let each server handle more connections are usually more cost‑effective.
This growth is driven less by the size of the user base than by richer, more complex web pages and interactions.
Page elements increase, interactions become complex
Modern pages pull in hundreds of resources; for example, a single refresh of www.qq.com issues about 244 requests, and the page keeps sending periodic queries afterwards.
Persistent (keep‑alive) HTTP connections reduce the cost of repeatedly opening and closing connections, but they continue to occupy server resources while idle. Some technologies, such as WebSocket, require long‑lived connections by design.
Browser connection limits rise
Browsers have raised the number of parallel connections they open per domain from 2 to around 6, which speeds up page loads but also increases backend load, especially during traffic peaks.
2. Front‑end optimizations to relieve server pressure
Reducing the number of HTTP requests and caching resources on the client (Expires or Cache‑Control: max‑age headers, LocalStorage) can eliminate many server hits. Caching does not help first‑time visitors, however, and is unsuitable for data that must be real‑time.
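As a rough illustration of the header‑based caching described above, the following Python sketch builds the two response headers for a static resource. The function name cache_headers and the one‑hour lifetime are assumptions made for this example, not part of any particular server.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(max_age_seconds, now=None):
    """Build Cache-Control and Expires headers for a cacheable resource.

    `cache_headers` is a hypothetical helper for illustration only.
    """
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        # Modern clients prefer max-age over Expires when both are present.
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # HTTP dates are always expressed in GMT.
        "Expires": format_datetime(expires, usegmt=True),
    }


if __name__ == "__main__":
    fixed_now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(cache_headers(3600, now=fixed_now))
```

Until the lifetime expires, the browser can reuse its cached copy without contacting the server at all.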
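Conditional requests let the browser revalidate a cached copy: it sends If‑Modified‑Since (paired with Last‑Modified) or If‑None‑Match (paired with ETag), and if the resource is unchanged the server responds with 304 Not Modified and no body, avoiding a full data transfer.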
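The ETag variant of this exchange can be sketched in a few lines of Python. The helpers make_etag and respond are hypothetical names, and deriving the tag from a hash of the body is just one possible choice; real servers often use file size and modification time instead.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the body (an illustrative scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes):
    """Return (status, headers, payload) for a GET of `body`."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still valid: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Content-Length": str(len(body))}, body


if __name__ == "__main__":
    status, headers, _ = respond({}, b"hello")
    print(status)  # first visit: full response
    status, _, payload = respond({"If-None-Match": headers["ETag"]}, b"hello")
    print(status, len(payload))  # revalidation: empty 304
```

The server still handles a request in the 304 case, so this saves bandwidth rather than connections.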
Merging page requests
Older sites that generated static pages avoided many Ajax calls. On mobile networks in particular, merging requests reduces the request count and improves performance: embed CSS and JS in the page, batch Ajax calls, and combine images into CSS sprites.
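One way to batch Ajax calls is to let the page send several logical queries in a single request and have the server answer them together. The sketch below assumes a made‑up JSON format (a list of objects with name and params) and two placeholder handlers; none of these names come from the original article.

```python
import json

# Hypothetical handlers standing in for real application queries.
HANDLERS = {
    "profile": lambda params: {"user": params["id"]},
    "unread": lambda params: {"count": 0},
}


def handle_batch(raw_body: str) -> str:
    """Answer several logical calls carried in one HTTP request body."""
    calls = json.loads(raw_body)
    results = []
    for call in calls:
        handler = HANDLERS.get(call["name"])
        if handler is None:
            results.append({"error": "unknown call"})
        else:
            results.append({"result": handler(call.get("params", {}))})
    return json.dumps(results)


if __name__ == "__main__":
    body = json.dumps([
        {"name": "profile", "params": {"id": 7}},
        {"name": "unread"},
    ])
    print(handle_batch(body))
```

Two Ajax calls become one connection and one round trip, at the cost of coupling their response times.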
3. Saving server memory
Memory is a critical resource for a web server, and the evolution of Apache's multi‑processing modules (MPMs) illustrates several memory‑saving strategies.
prefork MPM
Each connection is handled by a separate process with its own memory space, so memory use grows with the number of connections and limits scalability under high concurrency.
worker MPM
Combines a few processes with many threads per process. Threads share their process's memory, which reduces the footprint, but the shared state introduces thread‑safety concerns.
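A back‑of‑the‑envelope comparison shows why sharing memory between threads matters. The model and every number below (20 MB per process, 1 MB per thread, 25 threads per process) are illustrative assumptions, not measurements of Apache.

```python
def prefork_memory_mb(connections, per_process_mb):
    """prefork: one process per connection."""
    return connections * per_process_mb


def worker_memory_mb(connections, threads_per_process,
                     per_process_mb, per_thread_mb):
    """worker: a few processes, each running many threads."""
    # Ceiling division: enough processes to hold all the threads.
    processes = -(-connections // threads_per_process)
    return processes * per_process_mb + connections * per_thread_mb


if __name__ == "__main__":
    # All figures are hypothetical, chosen only to show the shape of the gap.
    print(prefork_memory_mb(1000, per_process_mb=20))
    print(worker_memory_mb(1000, threads_per_process=25,
                           per_process_mb=20, per_thread_mb=1))
```

Under these assumed figures, 1000 connections cost 20000 MB with one process each, versus 1800 MB with 40 processes of 25 threads.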
event MPM
Uses a dedicated thread to manage idle keep‑alive connections, freeing worker threads to serve real requests. Of the three MPMs, it uses the least memory.
Lightweight Nginx
Nginx serves many connections from a small number of worker processes, each handling its connections in an event loop, and uses far less memory than Apache.
sendfile
sendfile lets the kernel send a file's data directly to a socket without first copying it into a user‑space buffer, reducing both memory and CPU overhead.
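Python exposes this mechanism through socket.sendfile(), which uses os.sendfile() where the platform supports it and otherwise falls back to ordinary send() calls. The sketch below is a minimal demonstration over a local socket pair; send_file and roundtrip_demo are names invented for this example.

```python
import os
import socket
import tempfile


def send_file(sock: socket.socket, path: str) -> int:
    """Send the whole file at `path` over `sock`; return bytes sent."""
    with open(path, "rb") as f:      # the file must be opened in binary mode
        return sock.sendfile(f)      # kernel-side copy where available


def roundtrip_demo(payload: bytes = b"hello sendfile") -> bytes:
    """Write payload to a temp file, send it over a socket pair, read it back."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    sender, receiver = socket.socketpair()
    try:
        sent = send_file(sender, path)
        sender.close()
        data = b""
        while len(data) < sent:
            chunk = receiver.recv(4096)
            if not chunk:
                break
            data += chunk
        return data
    finally:
        sender.close()
        receiver.close()
        os.unlink(path)


if __name__ == "__main__":
    print(roundtrip_demo())
```

The application never reads the file contents into its own buffers; it only hands the kernel a file and a socket.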
4. Saving server CPU
Context switches and I/O multiplexing both consume CPU. Early Apache used select/poll, which must scan every monitored descriptor on each call and therefore scales poorly as the number of connections grows.
Epoll
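epoll lets the server register the sockets it is interested in once; the kernel then reports only the sockets that are ready, which greatly reduces CPU work when most connections are idle.
Creating threads or processes and switching between them also adds overhead; Nginx mitigates this with one master process and a small, fixed set of worker processes.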
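The effect can be seen with Python's selectors module, whose DefaultSelector picks epoll on Linux (and another mechanism such as kqueue or select elsewhere). In this sketch, 50 local socket pairs stand in for mostly idle client connections; the function name and the counts are arbitrary choices for the example.

```python
import selectors
import socket


def ready_connections(total=50, active_index=7):
    """Register `total` sockets, make one readable, return what is reported."""
    sel = selectors.DefaultSelector()   # epoll-backed on Linux
    pairs = [socket.socketpair() for _ in range(total)]
    try:
        for i, (server_side, _client_side) in enumerate(pairs):
            # Register interest once; `data` tags the connection with its index.
            sel.register(server_side, selectors.EVENT_READ, data=i)
        # Only one "client" actually sends something.
        pairs[active_index][1].sendall(b"x")
        # The selector returns only the sockets that are ready.
        return [key.data for key, _events in sel.select(timeout=1.0)]
    finally:
        sel.close()
        for server_side, client_side in pairs:
            server_side.close()
            client_side.close()


if __name__ == "__main__":
    print(ready_connections())
```

The server loop touches a single socket here, even though 50 are registered.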
Locks introduced for thread safety add CPU overhead and complexity; PHP‑FPM avoids them by using multiple processes that do not share memory.
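The cost of shared state is easy to see in miniature. In the Python sketch below, several threads update one counter, so every increment must acquire a lock. The function name and the thread and iteration counts are arbitrary; this shows the general pattern only and does not model PHP‑FPM itself.

```python
import threading


def count_with_lock(threads=4, increments=1000):
    """Increment one shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def work():
        nonlocal counter
        for _ in range(increments):
            # Every update pays for acquiring and releasing the lock.
            with lock:
                counter += 1

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return counter


if __name__ == "__main__":
    print(count_with_lock())
```

Separate processes would each keep a private counter and need no lock, but they would also have to combine their results through some other channel.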
5. Summary
While Nginx + PHP‑FPM often appears most resource‑efficient, the optimal stack depends on specific business requirements. Ongoing web‑server evolution strives to handle more requests with fewer resources, offering valuable techniques for developers.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
