
Boosting a Python Service to 50k QPS: My Step‑by‑Step Performance Tuning

Through a detailed case study, the author documents the process of optimizing a Python‑based web module—identifying bottlenecks, redesigning architecture with Redis queues, tuning MySQL, adjusting Linux TCP settings, and iteratively load‑testing until achieving 50,000 QPS with sub‑70 ms latency and zero errors.

Liangxu Linux

Introduction

This article records a Python program's performance optimization journey, sharing the problems encountered and the solutions applied. The author emphasizes that the presented method is not the only one and that many alternative solutions exist for similar performance challenges.

Requirements

The module, originally part of a main site, was split out because of high concurrency. The service must meet the following criteria during pressure testing:

QPS ≥ 30,000

Database load ≤ 50%

Server load ≤ 70%

Single request latency ≤ 70 ms

Error rate ≤ 5%

Initial Environment

Server: 4‑core CPU, 8 GB RAM, CentOS 7, SSD storage

Database: MySQL 5.7, max connections 800

Cache: Redis with 1 GB capacity

Load‑testing tool: Locust, using Tencent elastic scaling for distributed testing

Problem Identification

Initial load tests showed about 6,000 QPS with 30% of requests returning HTTP 502, CPU fluctuating between 60% and 70%, the database connection limit reached, and roughly 6,000 open TCP connections. The bottleneck was traced to frequent database reads of user-specific popup configurations, which exhausted the 800 available MySQL connections.
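
Why 800 connections can run out well below the target rate follows from Little's law: the average number of connections in use equals the request rate multiplied by how long each request holds a connection. The sketch below works through that arithmetic. The 6,000 QPS and 800-connection figures come from the test above; the 200 ms hold time is a hypothetical value chosen for illustration, not a measurement from this service.

```python
def connections_in_use(qps, hold_ms):
    """Little's law: average concurrent connections = arrival rate x hold time."""
    return qps * hold_ms / 1000


def max_sustainable_qps(pool_size, hold_ms):
    """Highest request rate a pool of `pool_size` connections can serve."""
    return pool_size * 1000 / hold_ms


# 6,000 QPS and the 800-connection limit come from the article;
# the 200 ms hold time per request is an assumed value.
needed = connections_in_use(qps=6000, hold_ms=200)
ceiling = max_sustainable_qps(pool_size=800, hold_ms=200)
print(f"connections needed: {needed:.0f}, QPS ceiling: {ceiling:.0f}")
```

Under that assumption the pool would need more connections than MySQL allows, which is consistent with requests failing once the limit is hit. Shortening the time each request spends on the database, or removing the database from the request path, raises the ceiling.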

First Optimization Attempt

To relieve the database, write operations were offloaded to a first-in, first-out (FIFO) message queue implemented as a Redis list, so that requests no longer wait on MySQL writes.

[Figure: revised architecture with the Redis write queue]

After this change, load testing still hit a ceiling: QPS plateaued around 20,000 with CPU at 60-80%, database connections near 300, and TCP connections reaching 15,000 per second.
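
The write queue described in this section could look roughly like the sketch below. It is not the author's code: the key name, the record fields, and the `drain` worker are hypothetical. `InMemoryList` is a stand-in that mimics the two Redis list commands used (LPUSH and RPOP) so the example runs without a server; with the redis-py client, a `redis.Redis()` instance would be passed in its place.

```python
import json
from collections import deque

QUEUE_KEY = "popup:write_queue"  # hypothetical key name


class InMemoryList:
    """Stand-in exposing the two Redis list commands used below."""

    def __init__(self):
        self._lists = {}

    def lpush(self, key, value):
        # LPUSH: add to the head of the list, return the new length.
        self._lists.setdefault(key, deque()).appendleft(value)
        return len(self._lists[key])

    def rpop(self, key):
        # RPOP: remove from the tail, or return None if the list is empty.
        items = self._lists.get(key)
        return items.pop() if items else None


def enqueue_write(client, record):
    """Request path: push the write onto the list and return immediately."""
    client.lpush(QUEUE_KEY, json.dumps(record))


def drain(client, save, batch_size=100):
    """Worker path: pop the oldest records first and persist each one."""
    handled = 0
    while handled < batch_size:
        raw = client.rpop(QUEUE_KEY)
        if raw is None:
            break
        save(json.loads(raw))
        handled += 1
    return handled


client = InMemoryList()
for user_id in (1, 2, 3):
    enqueue_write(client, {"user_id": user_id, "popup_id": 7})

saved = []  # stands in for the MySQL write
drained = drain(client, saved.append)
print(drained, [record["user_id"] for record in saved])
```

Pushing at the head and popping at the tail is what gives first-in, first-out order. A separate worker process can then write to MySQL at its own pace, with a small, fixed number of database connections.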

Second Optimization: Cache All Configurations

All popup configurations were pre-loaded into Redis, and the database is queried only when a cache miss occurs.

[Figure: updated architecture with configurations cached in Redis]

Subsequent tests again reached about 20,000 QPS and went no higher; the limit was now TCP connection handling.
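
The read path described in this section, where Redis is checked first and the database is queried only on a miss, is commonly called cache-aside. A minimal sketch follows, with several assumptions: the key format, the `query_db` callback, and the 300-second expiry are hypothetical and do not come from the original service. `InMemoryCache` mimics the Redis GET and SET calls so the example runs without a server.

```python
import json


class InMemoryCache:
    """Stand-in for the Redis GET/SET calls used below."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)  # None on a miss, like Redis GET

    def set(self, key, value, ex=None):
        self._data[key] = value  # expiry (ex) is ignored in this stand-in
        return True


def load_config(cache, user_id, query_db, ttl_seconds=300):
    """Return the popup config for a user, touching the DB only on a miss."""
    key = f"popup:config:{user_id}"  # hypothetical key format
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    config = query_db(user_id)
    cache.set(key, json.dumps(config), ex=ttl_seconds)
    return config


db_calls = []


def fake_query_db(user_id):
    """Pretend database lookup that records each call."""
    db_calls.append(user_id)
    return {"user_id": user_id, "popup": "welcome"}


cache = InMemoryCache()
first = load_config(cache, 42, fake_query_db)
second = load_config(cache, 42, fake_query_db)
print(first == second, db_calls)
```

Only the first lookup reaches the database; the second is served from the cache. Pre-loading, as the article describes, amounts to running the write half of this function for every configuration before traffic arrives.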

TCP Time‑Wait Bottleneck

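Investigation revealed that after the four-way handshake that closes a connection, the socket on the side that closed first remains in the TIME-WAIT state. Its local port cannot be reused immediately, so at this request rate the lingering sockets exhausted the available socket resources.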

TCP connections stay in TIME-WAIT after termination so that delayed packets from the old connection are not misinterpreted as belonging to a new connection on the same address and port pair.
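
One way to confirm a TIME-WAIT pile-up is to count sockets by state. On Linux, /proc/net/tcp lists IPv4 TCP sockets with the state as a hexadecimal code in the fourth column, where 06 means TIME_WAIT. The sketch below parses that format; the sample text is made up for illustration and is not output from the server in this article.

```python
from collections import Counter

# State codes used in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts


# Made-up sample in the /proc/net/tcp layout.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1001
   1: 0100007F:1F90 0100007F:C350 06 00000000:00000000 03:00000A5B 00000000     0        0 0
   2: 0100007F:1F90 0100007F:C351 06 00000000:00000000 03:00000A5C 00000000     0        0 0
   3: 0100007F:1F90 0100007F:C352 01 00000000:00000000 00:00000000 00000000     0        0 1002
"""

sample_counts = count_tcp_states(SAMPLE)
print(dict(sample_counts))
```

On a live host, the same function can be given the contents of /proc/net/tcp, for example `count_tcp_states(open("/proc/net/tcp").read())`. A TIME_WAIT count that grows with load and dwarfs ESTABLISHED matches the symptom described above.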

Kernel Parameter Tuning

Since Linux does not expose a direct parameter to shorten TIME‑WAIT, the author adjusted related sysctl settings:

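# Maximum number of TIME-WAIT sockets kept at once (default 180000);
# sockets beyond this limit are destroyed immediately
net.ipv4.tcp_max_tw_buckets = 6000

# Range of local ports available for outgoing connections
net.ipv4.ip_local_port_range = 1024 65000

# Enable fast recycling and reuse of TIME-WAIT sockets.
# Caution: tcp_tw_recycle can drop connections from clients behind NAT
# and was removed in Linux 4.12; tcp_tw_reuse affects outgoing connections only.
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1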
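
To see why the port range matters, consider the arithmetic for outgoing connections. If every closed connection holds its local port in TIME-WAIT for 60 seconds (the fixed duration in the Linux kernel), the port range caps how many new connections per second one host can open to a single destination address and port. The sketch below is a simplified model: it ignores the effect of tcp_tw_reuse and assumes the host doing the connecting is the side that closes first.

```python
TIME_WAIT_SECONDS = 60  # fixed TIME-WAIT duration in the Linux kernel


def usable_ports(low, high):
    """Number of local ports in an inclusive ip_local_port_range."""
    return high - low + 1


def max_new_connections_per_second(low, high, time_wait=TIME_WAIT_SECONDS):
    """Sustained rate of new outgoing connections to ONE destination address
    and port when each closed connection holds its local port for `time_wait`."""
    return usable_ports(low, high) / time_wait


ports = usable_ports(1024, 65000)
rate = max_new_connections_per_second(1024, 65000)
print(f"{ports} ports, about {rate:.0f} new connections per second")
```

Even with the widened range, the ceiling in this model is on the order of a thousand short-lived connections per second per destination, far below tens of thousands of QPS. That gap is why long-lived or pooled connections, or socket reuse, are needed at this load.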

Running ulimit -n showed a file descriptor limit of 65,535. Because every socket consumes a file descriptor, this limit also caps the number of open sockets, so it was raised to 100,001 before retesting.
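
The article does not say how the limit was raised. As one possible approach, a Python process on a Unix-like system can inspect and raise its own soft limit through the standard `resource` module, up to the hard limit set by the system. The helper name below is hypothetical, and this is a sketch of a per-process alternative rather than the author's method.

```python
import resource


def raise_nofile_limit(target):
    """Raise this process's soft open-file limit toward `target`.

    Never lowers an existing limit and never exceeds the hard limit.
    Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    unlimited = resource.RLIM_INFINITY
    wanted = target if hard == unlimited else min(target, hard)
    if soft != unlimited and wanted > soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
        except (ValueError, OSError):
            pass  # the platform refused; keep the current limit
    return resource.getrlimit(resource.RLIMIT_NOFILE)


before = resource.getrlimit(resource.RLIMIT_NOFILE)
after = raise_nofile_limit(100_001)
print("open-file limit before:", before, "after:", after)
```

If the hard limit is below 100,001, the soft limit stops there; going further requires raising the hard limit at the system level, which is outside what an unprivileged process can do.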

Final Results

After the kernel tweaks, the service achieved 50,000 QPS, CPU usage around 70%, normal database connections, stable TCP connections, an average response time of 60 ms, and a 0% error rate.

Conclusion

The optimization journey highlighted the importance of holistic understanding across web development, networking, databases, and operating systems. Effective performance tuning requires solid fundamentals in each layer, as issues often span multiple subsystems.
