Databases 16 min read

How We Cut 98% of Slow MySQL Queries by Tuning Thread‑Pool Settings

This article walks through a real‑world MySQL performance case at Meituan‑Dianping, showing how systematic monitoring, packet capture, and step‑by‑step analysis identified the Thread‑Pool as the bottleneck and how disabling and fine‑tuning its parameters reduced slow queries from thousands to just a few dozen.

dbaplus Community
dbaplus Community
dbaplus Community
How We Cut 98% of Slow MySQL Queries by Tuning Thread‑Pool Settings

Problem Description

Meituan‑Dianping, a large internet‑plus lifestyle platform, experienced intermittent timeouts in online services because certain SQL statements showed high latency (99th‑percentile response times often exceeding 100 ms, sometimes up to 1 s). The slow queries were simple primary‑key lookups on very small tables, yet they were not recorded in the MySQL slow‑log.

Overview Analysis

By examining CAT monitoring data, the author broke down the total response time into three components:

App Server – time to send the SQL and receive the result.

Network – transmission time of the request and result.

MySQL Server – time spent inside MySQL from receipt to result.

TcpDump was deployed at both the application and MySQL servers to capture timestamps at four monitoring points (APP → MySQL request, MySQL receipt, MySQL response, APP receipt). The measurements showed that the majority of the delay (≈606 ms) occurred inside MySQL.

Solution Iterations

Idea 1: Identify the Slowest Stage

Method: Use TcpDump to record timestamps at the four points listed above. Result: Most of the latency originated in MySQL (≈606 ms).

Idea 2: Decompose MySQL Processing Stages

Method: Refer to "High Performance MySQL" to map MySQL into three layers – connection/thread handling, the MySQL server layer (parser, optimizer, cache, etc.), and the storage‑engine layer. This clarified where the execution time could be spent.

Idea 3: Suspect Thread Management

Since the slow queries were not logged by the slow‑log, the problem must lie before the storage engine – likely in MySQL's thread handling.

Idea 4: Disable Thread‑Pool (MySQL 5.6)

Method: Change thread_handling from pool-of-threads to One-Connection-Per-Thread. Result: After disabling the Thread‑Pool, slow‑SQL count dropped from 3788 to 818 (≈78 % reduction).

Idea 5: Tune Thread‑Pool Parameters

Method: Investigate Thread‑Pool configuration parameters (e.g., thread_pool_stall_limit) and set thread_pool_stall_limit=10. Result: Slow‑SQL count fell from 3788 to 63, a 98.3 % reduction.

Problem Analysis

Before Thread‑Pool, MySQL used a one‑thread‑per‑connection model, which degrades performance under high connection counts. The Thread‑Pool introduced in MySQL 5.6 creates a pool of worker threads grouped into thread groups, each handling connections to keep a limited number of active threads and reduce context‑switch overhead.

Each thread group can be in one of four states: LISTENER, IDLE, ACTIVE, or WAITING. When a thread is waiting for I/O, it is marked WAITING; if a new request arrives while a thread is waiting, the pool may create or wake another thread depending on thread_pool_stall_limit (default 500 ms). Idle threads self‑terminate after thread_pool_idle_timeout (default 60 s).

How Thread‑Pool Works

A dedicated listener thread accepts new connections. In the one‑thread‑per‑connection model, the listener pulls a thread from the cache or creates a new one for each connection. In the Thread‑Pool model, the listener enqueues the connection into a thread‑group queue; a worker thread from that group then handles the connection, keeping the number of active threads per CPU core low.

Solution

Set thread_pool_stall_limit=10 so that threads blocked longer than 10 ms are marked stalled, prompting the pool to spawn additional workers and preventing other queries from waiting excessively.

Value Delivered

For the xxx‑service example, slow‑SQL count decreased from 3788 to 63 (98.3 % reduction).

The fix benefited hundreds of product lines, raising overall service availability above 99.99 %.

Summary

The case study demonstrates a systematic approach to diagnosing end‑to‑end database performance issues: collect monitoring data, isolate the latency‑heavy stage, use packet capture to pinpoint MySQL internals, and apply targeted configuration changes. By focusing on the Thread‑Pool introduced in MySQL 5.6, the team first disabled it to confirm the hypothesis, then fine‑tuned thread_pool_stall_limit, achieving dramatic reductions in slow queries and improving platform reliability.

References

https://my.oschina.net/andylucc/blog/820624

https://yq.aliyun.com/articles/41078

http://blog.chinaunix.net/uid-28364803-id-3427833.html

http://blog.chinaunix.net/uid-28364803-id-3431242.html

https://www.percona.com/doc/percona-server/5.6/performance/threadpool.html

https://mariadb.com/kb/en/mariadb/thread-pool-in-mariadb/

https://www.safaribooksonline.com/library/view/high-performance-mysql/9781449332471/ch01.html#mysqlas_logical_architecture

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

performance tuningmysqlDatabase OptimizationSlow QueriesThread-Pool
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.