How to Tame High Concurrency: Front‑End Tricks and Server Optimizations

This article examines why modern web systems face exploding concurrent connections, explains how richer pages and browser limits increase load, and presents front‑end caching, request merging, Apache/Nginx memory and CPU optimizations, and practical guidelines for reducing server resource consumption.

21CTO

1. Increasing Number of Concurrent Connections

Web systems have seen a steep rise in concurrent connections in recent years. High concurrency is now the norm, and it poses significant challenges. Simply adding more servers or upgrading hardware is costly; technical optimization is usually more cost‑effective.

The growth is not driven by user base size but by richer, more interactive pages. Modern pages contain many resources; for example, a single refresh of www.qq.com generates about 244 requests, plus periodic background queries.

Long‑lived keep‑alive connections reduce the overhead of repeatedly creating sockets, but idle connections still hold server resources. Some applications, such as those built on WebSockets, inherently require persistent connections.

Browsers have also raised their per‑host connection limits from 1‑2 to 2‑6, which adds to the pressure on back‑end servers, especially during peak traffic when many open connections consume CPU and memory.

2. Front‑End Optimizations to Reduce Server Load

Effective mitigation of high concurrency requires collaboration between front‑end and back‑end.

2.1 Reduce Web Requests – Use HTTP cache headers (Expires, Cache-Control: max-age) to store static content in the browser, and use HTML5 LocalStorage for additional caching. This eliminates many requests and lowers server load considerably, though it does not help first‑time visitors and can delay updates to data that must be real‑time.
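
As a rough illustration of the cache headers mentioned above, the following Python sketch builds the two headers for a static response. The function name `static_cache_headers` and the `public` directive are assumptions made for this example, not something the article prescribes.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def static_cache_headers(max_age_seconds, now=None):
    """Build headers that let a browser reuse a static file (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        # Relative lifetime in seconds; takes precedence over Expires when both are sent.
        "Cache-Control": f"public, max-age={max_age_seconds}",
        # Older absolute-date form, formatted as an HTTP date in GMT.
        "Expires": format_datetime(expires, usegmt=True),
    }


if __name__ == "__main__":
    for name, value in static_cache_headers(86400).items():
        print(f"{name}: {value}")
```

While the lifetime has not expired, the browser can serve the file from its own cache without contacting the server at all.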

2.2 Lighten Web Requests – Use conditional requests based on Last‑Modified or ETag. If the content is unchanged, the server returns 304 Not Modified without a body, avoiding a full data transfer.
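
The ETag flow can be sketched in Python as follows. This is a minimal, framework‑free sketch: the helper names `make_etag` and `respond` are hypothetical, and deriving the tag from a SHA‑256 hash of the body is just one possible choice.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    # Content-derived tag; HTTP requires the value to be double-quoted.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        for candidate in if_none_match.split(","):
            candidate = candidate.strip()
            if candidate.startswith("W/"):
                candidate = candidate[2:]  # If-None-Match uses weak comparison
            if candidate == "*" or candidate == etag:
                return 304, headers, b""  # unchanged: no body is sent
    return 200, headers, body
```

The request still reaches the server, so this saves bandwidth and response time rather than connections.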

2.3 Merge Page Requests – Embed CSS/JS directly into HTML, batch multiple Ajax calls into a single request, and combine small images using CSS sprites. These techniques reduce the number of separate HTTP transactions and thus the number of connections the server must handle.
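
One way to batch several Ajax calls is a single endpoint that accepts a JSON array of calls and returns an array of results. The Python sketch below assumes a made‑up wire format (`id`, `method`, `params`) and two placeholder handlers; none of these names come from the article.

```python
import json

# Hypothetical handlers standing in for real application logic.
HANDLERS = {
    "profile": lambda params: {"user": params["id"], "name": "demo"},
    "unread": lambda params: {"count": 3},
}


def handle_batch(raw_body: str) -> str:
    """Run every call in one request body and return all results together."""
    results = []
    for call in json.loads(raw_body):
        call_id = call.get("id")
        handler = HANDLERS.get(call.get("method"))
        if handler is None:
            results.append({"id": call_id, "error": "unknown method"})
            continue
        try:
            results.append({"id": call_id, "result": handler(call.get("params", {}))})
        except Exception as exc:  # one failing call must not sink the whole batch
            results.append({"id": call_id, "error": str(exc)})
    return json.dumps(results)


if __name__ == "__main__":
    batch = [
        {"id": 1, "method": "profile", "params": {"id": 42}},
        {"id": 2, "method": "unread"},
    ]
    print(handle_batch(json.dumps(batch)))
```

The client matches each result to its call by `id`, so two or more round trips collapse into one HTTP transaction.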

3. Saving Server Memory

After front‑end tuning, focus on the server itself. Memory is a critical resource for handling concurrent connections.

Apache’s evolution illustrates memory‑saving strategies:

Prefork MPM – Stable but each child process duplicates the parent’s memory, limiting the number of processes under high concurrency.

Worker MPM – Mixes processes and threads; threads share memory, reducing overall usage but introducing thread‑safety concerns.

Event MPM – Adds an epoll‑based thread to manage idle keep‑alive connections, further cutting memory consumption.

Lightweight servers like Nginx run a small number of worker processes, each handling many connections in an event loop, so they naturally use less memory than Apache.

System calls such as sendfile and memory‑mapped I/O (mmap) avoid copying data between kernel and user‑space buffers, saving both memory and CPU cycles.
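
A minimal Python sketch of the sendfile idea follows. It relies on `socket.sendfile`, which uses `os.sendfile` where the platform supports it and otherwise falls back to ordinary `send` calls. The `demo` function, the temporary file, and the socket pair are only scaffolding for this example.

```python
import os
import socket
import tempfile


def send_file_zero_copy(sock: socket.socket, path: str) -> int:
    """Send a whole file over a blocking stream socket; returns bytes sent."""
    with open(path, "rb") as f:
        # With os.sendfile available, the kernel moves the data from the file
        # to the socket without copying it through a user-space buffer.
        return sock.sendfile(f)


def demo():
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 4096)
        path = tmp.name
    sender, receiver = socket.socketpair()
    try:
        sent = send_file_zero_copy(sender, path)
        sender.shutdown(socket.SHUT_WR)  # signal end of stream to the reader
        received = b""
        while True:
            chunk = receiver.recv(65536)
            if not chunk:
                break
            received += chunk
        return sent, received
    finally:
        sender.close()
        receiver.close()
        os.unlink(path)


if __name__ == "__main__":
    sent, received = demo()
    print(sent, len(received))
```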

4. Saving Server CPU

CPU usage is impacted by I/O multiplexing and thread/process management.

Select/Poll – Each call scans every monitored socket, as in early Apache versions, causing high CPU load when most sockets are idle.

Epoll – The application registers its sockets once and the kernel reports only those that become ready, eliminating unnecessary scanning and reducing CPU overhead.
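
The register‑once, report‑only‑ready pattern can be illustrated with Python's `selectors` module, whose `DefaultSelector` chooses epoll on Linux and another mechanism elsewhere. This is a sketch: the function name `ready_connections` and the use of local socket pairs in place of real client connections are assumptions for the example.

```python
import selectors
import socket


def ready_connections(n_idle: int = 20):
    """Register n_idle idle sockets plus one active one; return the ready ids."""
    sel = selectors.DefaultSelector()
    pairs = [socket.socketpair() for _ in range(n_idle + 1)]
    try:
        for conn_id, (reader, _writer) in enumerate(pairs):
            reader.setblocking(False)
            # Each socket is registered once, not on every wait.
            sel.register(reader, selectors.EVENT_READ, data=conn_id)
        # Only the last connection receives data; the rest stay idle.
        pairs[-1][1].send(b"ping")
        events = sel.select(timeout=1.0)
        return [key.data for key, _mask in events]
    finally:
        sel.close()
        for reader, writer in pairs:
            reader.close()
            writer.close()


if __name__ == "__main__":
    print(ready_connections())
```

Only the connection that actually has data comes back from `select`, so the application does no work for the idle ones.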

Thread and process creation, context switching, and lock contention also consume CPU. Apache’s worker and event modes use threads, which require careful lock management to avoid deadlocks and starvation. PHP‑FPM runs as multiple processes, sidestepping thread‑safety issues.

5. Conclusion

While Nginx + PHP‑FPM often appears the most resource‑efficient setup, the optimal architecture depends on specific business requirements. Continuous evolution of web servers aims to support more requests with fewer system resources, offering valuable techniques for developers to study and apply.
