Why Does MySQL’s KILL Command Sometimes Hang? A Deep Dive into Thread States and Kill Workflow
This article examines why MySQL’s KILL command often leaves a connection in a prolonged “Killed” state, analyzes the 5.7 source‑code flow, explains socket closure, thread flag handling, condition‑variable notifications, and reproduces real‑world cases that illustrate the underlying causes.
Background
When the MySQL KILL command is issued, the target thread often remains in a killed state for a noticeable period instead of terminating immediately. The following analysis examines the MySQL 5.7 code path for the kill command to explain this behaviour.
MySQL execution flow overview
The entry point is mysqld::main(), which starts a listener thread, accepts TCP connections and creates a THD (thread descriptor) for each client. A typical kill workflow is:
Run SHOW PROCESSLIST to obtain a connection id.
Execute KILL <id> to terminate that connection.
KILL workflow
The kill operation is performed inside THD::awake() and consists of three main steps:
Set the thread’s killed flag (or KILL_QUERY for query‑only termination).
Close the socket and interrupt any engine‑level waits.
Notify waiting threads via condition variables.
Key code excerpts illustrate each step.
* frame #0: 0x00000001068a8853 mysqld`THD::awake(this=0x00007fbede88b400, state_to_set=KILL_CONNECTION) at sql_class.cc:2029:27
frame #1: 0x000000010695961f mysqld`kill_one_thread(thd=0x00007fbed6bc9c00, id=2, only_kill_query=false) at sql_parse.cc:6479:14
frame #2: 0x0000000106946529 mysqld`sql_kill(thd=0x00007fbed6bc9c00, id=2, only_kill_query=false) at sql_parse.cc:6507:161. Setting the killed flag
if (this->m_server_idle && state_to_set == KILL_QUERY) {
/* nothing */
} else {
killed = state_to_set;
}If the thread is idle and the request is KILL_QUERY, the flag is not set to avoid affecting subsequent requests.
2. Closing the socket and interrupting engine waits
if (state_to_set != THD::KILL_QUERY && state_to_set != THD::KILL_TIMEOUT) {
if (this != current_thd) {
shutdown_active_vio();
}
MySQL_CALLBACK(Connection_handler_manager::event_functions, post_kill_notification, (this));
}
if (state_to_set != THD::NOT_KILLED) {
ha_kill_connection(this);
}For a full‑connection kill the socket is shut down, causing the client to receive a “Lost connection” error and start reconnect logic, which can amplify connection‑count spikes.
3. Notifying waiting threads
if (is_killable) {
mysql_mutex_lock(&LOCK_current_cond);
if (current_cond && current_mutex) {
mysql_mutex_lock(current_mutex);
mysql_cond_broadcast(current_cond);
mysql_mutex_unlock(current_mutex);
}
mysql_mutex_unlock(&LOCK_current_cond);
}The code acquires the condition‑variable lock, broadcasts to wake any thread waiting on current_cond, then releases the lock.
Why the “Killed” state can persist
The persistence is mainly caused by two situations:
Active‑check path: The target thread eventually reaches a point where it checks thd->killed (e.g., during row reads, index scans, or after finishing a command). If the thread is busy in CPU‑bound work, the check may be delayed.
Signal‑based wake‑up: If the thread is waiting inside InnoDB (e.g., on a row lock), the kill logic must broadcast a condition variable and invoke ha_kill_connection() to interrupt the wait. A missed broadcast leaves the thread in the wait state until a timeout or another wake‑up occurs.
Both paths may involve additional operations such as transaction rollback or temporary‑table cleanup, which further extend the time before the thread finally disappears from SHOW PROCESSLIST.
Root causes
After the kill command MySQL may need to:
Rollback large transactions.
Clean up temporary tables or slow DDL.
Perform error‑handling that triggers additional engine calls.
High server load can delay the active‑check of the killed flag, and the socket‑closure can trigger client reconnect loops that increase the apparent connection count.
Reproducing the issue locally
Four test cases were created to reproduce the problem:
Case 1: Connection 1 has finished its command; Connection 2 kills it – immediate termination.
Case 2: Connection 1 is still in the parse phase; Connection 2 kills it – still terminates quickly.
Case 3 (InnoDB wait): Connection 0 starts a transaction updating a row; Connection 1 updates the same row and waits on the row lock; Connection 2 kills Connection 1. The kill interrupts the lock wait via ha_kill_connection(), and the thread is awakened.
Case 4 (MySQL‑level wait): Connection 0 holds a metadata lock; Connection 1 attempts an UPDATE and blocks on the lock; Connection 2 kills Connection 1, which broadcasts the same condition variable that Connection 1 is waiting on, allowing it to proceed to rollback and finally exit.
Key stack traces for each case are captured in the original source code and illustrate the three‑step workflow.
Real‑world incident
In production a traffic surge caused many connections to accumulate. DBAs issued KILL commands, which temporarily reduced the connection count, but the count rose again because the killed threads remained in the killed state, the client SDK kept reconnecting, and InnoDB‑level concurrency limits prevented new threads from entering InnoDB to finish the kill.
Setting innodb_thread_concurrency to 1 reproduced the problem: one thread entered InnoDB and paused; a second thread blocked on the concurrency check; killing the second thread left it stuck in the killed state until the first thread exited InnoDB.
After the first thread finally left InnoDB, the second thread performed the active‑check, executed the rollback, and disappeared from SHOW PROCESSLIST.
Takeaways
The kill command first closes the socket, which can trigger client reconnect loops.
Threads inside InnoDB may not see the killed flag until they are awakened via condition variables or a timeout.
High server load and concurrency limits can delay both active checks and wake‑up notifications.
Understanding the three‑step workflow (flag set → socket shutdown → condition broadcast) helps diagnose why a connection appears stuck in the killed state.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
