Performance Optimization of a High‑Concurrency Web Service: From Bottleneck Identification to TCP Time‑Wait Tuning

This article documents a step-by-step performance optimization of a high-traffic web module. It covers requirement analysis, bottleneck detection in the database and in TCP connections, cache integration, load-testing results, and Linux kernel parameter tuning, ending at 50 k QPS with an average response time of about 60 ms.

Selected Java Interview Questions

The article records a performance‑optimization case study for a high‑concurrency web module that was split from a main site to avoid overloading the core services. The original requirements demanded at least 30 k QPS, database usage under 50 %, server load under 70 %, request latency under 70 ms, and error rate below 5 %.

Initial analysis highlighted that the system heavily relied on database reads and writes for popup configuration and user interaction tracking, leading to connection exhaustion and high error rates when QPS approached 6 k. The author emphasized that optimization must be goal‑driven and aligned with real business needs.
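
One way to keep the work goal-driven is to write the targets down as data and check every load-test run against them. The sketch below does that for the five requirements listed above; the class and function names are illustrative, and the sample run at the bottom uses invented numbers, not measurements from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Targets:
    """Acceptance targets taken from the stated requirements."""
    min_qps: float = 30_000
    max_db_usage_pct: float = 50.0
    max_server_load_pct: float = 70.0
    max_latency_ms: float = 70.0
    max_error_rate_pct: float = 5.0


def unmet_targets(t, qps, db_usage_pct, server_load_pct,
                  latency_ms, error_rate_pct):
    """Return the names of the targets that a load-test run failed to meet."""
    checks = {
        "qps": qps >= t.min_qps,
        "db_usage": db_usage_pct < t.max_db_usage_pct,
        "server_load": server_load_pct < t.max_server_load_pct,
        "latency": latency_ms < t.max_latency_ms,
        "error_rate": error_rate_pct < t.max_error_rate_pct,
    }
    return [name for name, ok in checks.items() if not ok]


# Hypothetical run (all five measurements are invented for illustration).
failed = unmet_targets(Targets(), qps=6_000, db_usage_pct=95.0,
                       server_load_pct=40.0, latency_ms=250.0,
                       error_rate_pct=12.0)
print(failed)
```

An empty list means the run met every target, which gives each optimization round a clear stopping condition.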

To relieve the database bottleneck, a Redis-based FIFO queue was introduced to take write operations off the request path. Later the entire configuration data set was cached, with the database consulted only on a cache miss. The original article illustrates the before-and-after designs with architecture diagrams.
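
The two ideas in this step, queued writes and cache-first reads, can be sketched roughly as follows. This is not the author's code: to stay runnable without a Redis server, a `deque` stands in for the Redis list (where `push` would correspond to `LPUSH` and `pop` to `RPOP`) and a plain dict stands in for the cache. All class names, keys, and the batch size are assumptions.

```python
from collections import deque


class WriteQueue:
    """In-process stand-in for a Redis list used as a FIFO queue."""

    def __init__(self):
        self._items = deque()

    def push(self, event):
        self._items.appendleft(event)  # comparable to LPUSH

    def pop(self):
        # Comparable to RPOP: returns the oldest event, or None when empty.
        return self._items.pop() if self._items else None


class ConfigStore:
    """Cache-first reads: the database is consulted only on a cache miss."""

    def __init__(self, load_from_db):
        self._cache = {}
        self._load_from_db = load_from_db
        self.db_reads = 0

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        self.db_reads += 1
        value = self._load_from_db(key)
        self._cache[key] = value
        return value


def drain(queue, write_batch, batch_size=100):
    """Worker step: move queued events to the database in batches."""
    written = 0
    while True:
        batch = []
        while len(batch) < batch_size:
            event = queue.pop()
            if event is None:
                break
            batch.append(event)
        if not batch:
            return written
        write_batch(batch)
        written += len(batch)


# Hypothetical data: one popup configuration and 250 tracking events.
fake_db = {"popup:home": {"title": "Welcome"}}
store = ConfigStore(fake_db.get)
store.get("popup:home")
store.get("popup:home")  # second read is served from the cache

queue = WriteQueue()
for i in range(250):
    queue.push({"user": i, "action": "click"})
batches = []
total_written = drain(queue, batches.append)
```

The sketch leaves out cache expiry and invalidation, which a real deployment would need whenever the configuration changes.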

Load testing with Locust showed that, after the first round of caching, QPS plateaued at around 20 k, CPU usage sat at 60-80 %, and database connections stabilized at roughly 300. TCP connection limits, however, still prevented further scaling.

Investigation revealed that many TCP sockets were lingering in the TIME-WAIT state, so their ports could not be reused for new connections. The author adjusted Linux kernel parameters to cap the number of TIME-WAIT sockets, widen the local port range, and enable fast recycling and reuse:

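# maximum number of TIME-WAIT sockets kept at once (default 180000)
net.ipv4.tcp_max_tw_buckets = 6000

# widen the local port range available for outgoing connections
net.ipv4.ip_local_port_range = 1024 65000

# enable fast recycling of TIME-WAIT sockets
# (known to break clients behind NAT; removed in Linux 4.12)
net.ipv4.tcp_tw_recycle = 1

# allow TIME-WAIT sockets to be reused for new outgoing connections
net.ipv4.tcp_tw_reuse = 1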
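
To see why these settings matter and whether they help, two small checks can be useful. The sketch below is not from the original article. The first function counts sockets per state from the text of `/proc/net/tcp`, whose `st` column holds a hexadecimal state code (`06` is TIME_WAIT). The second estimates how many new outgoing connections per second a port range can sustain toward a single destination, assuming the local side closes each connection, no reuse is enabled, and TIME-WAIT lasts 60 seconds as it does on Linux.

```python
from collections import Counter

# State codes used in the "st" column of /proc/net/tcp (hexadecimal).
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the contents of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1  # unknown codes kept raw
    return counts


def read_tcp_states(path="/proc/net/tcp"):
    """Read and count live socket states (Linux only, IPv4 sockets)."""
    with open(path) as f:
        return count_tcp_states(f.read())


def max_new_connections_per_second(port_low, port_high, time_wait_seconds=60):
    """Upper bound on new outgoing connections/s to ONE destination ip:port
    when every closed connection holds its local port in TIME_WAIT."""
    usable_ports = port_high - port_low + 1
    return usable_ports / time_wait_seconds


# With the port range configured above: 63977 ports / 60 s.
budget = max_new_connections_per_second(1024, 65000)
print(round(budget))
```

Running `read_tcp_states()` before and after a tuning change gives a quick view of how many sockets sit in TIME_WAIT. IPv6 sockets are listed separately in `/proc/net/tcp6`.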

After applying these settings, the service sustained 50 k QPS with CPU usage around 70 %, a normal number of database connections, no TCP connection bottleneck, an average response time of 60 ms, and a zero error rate.
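
As a rough sanity check on figures like these, Little's law relates throughput and latency to the number of requests in flight: concurrency equals arrival rate times average time in the system. The snippet below applies it to the reported 50 k QPS and 60 ms. It is a back-of-the-envelope sketch, and the function name is illustrative.

```python
def in_flight_requests(qps, avg_latency_ms):
    """Little's law: average concurrency = throughput x average latency."""
    return qps * avg_latency_ms / 1000.0


# Reported final figures: 50 k QPS at 60 ms average response time.
concurrency = in_flight_requests(50_000, 60)
print(concurrency)
```

Under these numbers the service holds about 3,000 requests in flight on average, which is also the order of concurrency a load generator has to maintain to reproduce the result.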

The conclusion stresses that web performance is a multidisciplinary engineering problem spanning application code, databases, caching, and operating-system networking, and that developers need solid fundamentals in all of these areas to diagnose and resolve issues effectively.
