
Uncovering Hidden PHP‑CGI Deadlocks: Why Disk Space Stalls and How to Fix Them

A deep dive into a long‑standing PHP‑CGI deadlock that left deleted log files occupying disk space, explaining how signal‑unsafe functions caused the lock, how the issue was diagnosed with lsof, strace and gdb, and the practical steps to eliminate the deadlock.

21CTO

Problem discovery

Production machines kept raising disk‑space alarms even after their log files had been cleared. Running ps aux | grep php-cgi showed many CGI processes that had been alive for days, weeks, or even months. Since these processes are normally restarted within a day, something was clearly wrong.

Running lsof -p [pid] against these long‑running CGI processes showed that they still held open file handles to log files that had already been deleted. The kernel cannot free a deleted file's blocks while any process keeps it open, so the disk space was never reclaimed. The immediate cause was therefore CGI processes that never closed their file handles.
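The effect is easy to reproduce outside PHP. The following is a minimal sketch, assuming a POSIX system with a writable /tmp; the helper name size_held_after_unlink and the temp‑file prefix are hypothetical and chosen only for illustration. It deletes a file while keeping its descriptor open and then asks the kernel how much data the descriptor still refers to.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: writes `nbytes` bytes to a temp file, deletes the file
 * while keeping its descriptor open, and returns the size the kernel still
 * reports for the deleted file (or -1 on error). If `links_out` is non-NULL
 * it receives the remaining hard-link count, which is typically 0. */
long size_held_after_unlink(size_t nbytes, long *links_out)
{
    char path[] = "/tmp/deleted-log-XXXXXX";
    char chunk[512];
    struct stat st;
    size_t written = 0;
    long held = -1;

    int fd = mkstemp(path);          /* creates and opens a unique file */
    if (fd < 0)
        return -1;

    memset(chunk, 'x', sizeof chunk);
    while (written < nbytes) {
        size_t want = nbytes - written;
        if (want > sizeof chunk)
            want = sizeof chunk;
        ssize_t n = write(fd, chunk, want);
        if (n <= 0) {                /* give up cleanly on a write error */
            unlink(path);
            close(fd);
            return -1;
        }
        written += (size_t)n;
    }

    /* Delete the name. The data stays allocated because fd is still open. */
    if (unlink(path) == 0 && fstat(fd, &st) == 0) {
        held = (long)st.st_size;
        if (links_out != NULL)
            *links_out = (long)st.st_nlink;
    }

    close(fd);                       /* only now can the space be reclaimed */
    return held;
}
```

Calling size_held_after_unlink(4096, NULL) should report 4096 bytes still held by a file that no longer has a name. That is the state lsof exposes for the stuck CGI processes.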

Further analysis with strace -p [pid] showed that every abnormal process was blocked in a futex wait that never returned, which confirmed a deadlock. A deadlocked process never reaches the code that closes its file handles, which explains the disk‑space anomaly.

Why did the CGI processes deadlock?

CGI processes are single‑threaded, so the deadlock could not have come from threads contending for a lock. Instead, it occurred during PHP's shutdown phase, when code running inside a signal handler invoked a function that is not signal‑safe.

Function reentrancy and signal safety

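Re‑entrant (async‑signal‑safe) functions can be called safely from signal handlers. Thread safety is a different guarantee: many thread‑safe functions protect shared state with a global lock, and that lock is not signal‑safe. If a signal arrives while the interrupted code holds the lock and the handler calls the same function, the handler waits for a lock that cannot be released until the handler itself returns, so the process deadlocks.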
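The hazard can be observed without actually hanging a process. The following is a minimal sketch, assuming a POSIX system; the flag lock_held is a hypothetical stand‑in for a library's internal lock, and the handler only records what it sees instead of trying to take the lock, because a real attempt would block forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Stand-in for a library's internal lock (hypothetical, for illustration). */
static volatile sig_atomic_t lock_held = 0;
/* Set by the handler if it ran while the "lock" was held. */
static volatile sig_atomic_t handler_would_block = 0;

static void on_signal(int signo)
{
    (void)signo;
    /* A real handler calling the locked function would wait here forever:
     * the lock owner is the very code this handler has interrupted. */
    if (lock_held)
        handler_would_block = 1;
}

/* Returns 1 if the handler observed the lock as held, 0 if not,
 * and -1 if the handler could not be installed. */
int signal_during_locked_section(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    handler_would_block = 0;
    lock_held = 1;        /* "enter" the locked library function */
    raise(SIGUSR1);       /* the signal arrives in the middle of it */
    lock_held = 0;        /* "leave"; never reached in the real deadlock */

    return handler_would_block;
}
```

In a single‑threaded program raise() runs the handler before it returns, so the handler always sees lock_held set. With a real non‑recursive lock the handler would never return, and the line that releases the lock would never execute.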

PHP‑CGI execution flow

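The glibc time functions use a global lock for thread safety, but they are not signal‑safe. When a PHP‑CGI process receives a signal (e.g., SIGPROF) while it is executing a time function, the lock is still held at the moment the handler starts. If the signal handler then calls a time function as well, it blocks on that lock. The interrupted code cannot release the lock until the handler returns, so neither side can make progress.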

All deadlocked CGI processes had logged the error message “Maximum execution time of 60 seconds exceeded”. When the 60‑second limit expires, the process receives SIGPROF, and in these cases the signal arrived while the process was inside a glibc time function. The timeout handling then enters the shutdown routine, which invokes user‑defined shutdown functions that also call time functions, completing the deadlock cycle.
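The timer mechanism behind that error can be tried in isolation. The following is a minimal sketch, assuming a POSIX system; the function name burn_cpu_until_sigprof and the iteration cap are hypothetical. It arms ITIMER_PROF, which counts CPU time consumed by the process, and spins until SIGPROF arrives. The handler only sets a flag, so it stays signal‑safe.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t timed_out = 0;

static void on_sigprof(int signo)
{
    (void)signo;
    timed_out = 1;   /* setting a sig_atomic_t flag is async-signal-safe */
}

/* Arms a one-shot ITIMER_PROF for `limit_usec` microseconds of CPU time and
 * busy-loops until SIGPROF arrives. Returns 1 if the signal interrupted the
 * loop, 0 if the iteration cap was hit first, -1 on a setup error. */
int burn_cpu_until_sigprof(long limit_usec)
{
    struct sigaction sa;
    struct itimerval timer;
    volatile unsigned long spins = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigprof;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGPROF, &sa, NULL) != 0)
        return -1;

    memset(&timer, 0, sizeof timer);          /* it_interval = 0: one shot */
    timer.it_value.tv_sec = limit_usec / 1000000;
    timer.it_value.tv_usec = limit_usec % 1000000;

    timed_out = 0;
    if (setitimer(ITIMER_PROF, &timer, NULL) != 0)
        return -1;

    /* Stand-in for request processing; the cap only prevents an endless loop. */
    while (!timed_out && spins < 4000000000UL)
        spins++;

    memset(&timer, 0, sizeof timer);          /* disarm the timer */
    setitimer(ITIMER_PROF, &timer, NULL);

    return timed_out ? 1 : 0;
}
```

The signal can land at any instruction of the loop. In PHP‑CGI the same is true of whatever the request happens to be executing when the time limit expires, including a glibc time function that holds its lock.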

Relevant code snippet

void zend_set_timeout(long seconds)
{
    TSRMLS_FETCH();
    EG(timeout_seconds) = seconds;
    if (!seconds) {
        return;
    }
    // ...
    setitimer(ITIMER_PROF, &t_r, NULL); // arm the profiling timer; SIGPROF is sent when it expires
    signal(SIGPROF, zend_timeout); // register zend_timeout as the SIGPROF handler
    sigemptyset(&sigset);
    sigaddset(&sigset, SIGPROF);
    // ...
}

Debugging with gdb showed that all PHP‑CGI processes were blocked in zend_request_shutdown, which calls user‑registered shutdown functions. Because this shutdown runs in the context of the timeout signal, any shutdown function that calls a non‑signal‑safe time function while the interrupted code still holds the global lock will deadlock.

In this case, a custom shutdown hook registered via register_shutdown_function('SimpleWebSvc::shutdown') used a qalarm system that, during shutdown, called a time function, creating the deadlock scenario.

Conclusion

The deadlock was caused by invoking non‑signal‑safe functions (glibc time functions) inside a signal handler during PHP‑CGI shutdown.

Solution

Remove or simplify the qalarm hook registered in the shutdown function so that the shutdown path no longer calls unsafe functions such as the glibc time functions.
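For code that owns its own signal handlers, a common general pattern achieves the same goal by deferring unsafe work. The following is a minimal sketch of that pattern in C, not the fix applied in this case. The names are hypothetical, and SIGUSR1 sent with raise() stands in for the timeout signal. The handler only sets a flag, and the time functions run later in normal context, where no lock can be held by interrupted code.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <time.h>

static volatile sig_atomic_t shutdown_requested = 0;

static void on_timeout(int signo)
{
    (void)signo;
    shutdown_requested = 1;   /* record the event; call nothing that locks */
}

/* Simulates a request that is interrupted by a timeout signal. The timestamp
 * is formatted only after the handler has returned. Returns the number of
 * characters written to `buf`, or 0 if nothing was written. */
size_t handle_request_with_deferred_shutdown(char *buf, size_t buflen)
{
    struct sigaction sa;
    struct tm tm_buf;
    time_t now;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_timeout;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return 0;

    shutdown_requested = 0;
    raise(SIGUSR1);           /* stand-in for the timeout signal */

    if (!shutdown_requested)
        return 0;

    /* Safe point: we are back in normal context, so calling time functions
     * cannot collide with a lock held by interrupted code. */
    now = time(NULL);
    if (localtime_r(&now, &tm_buf) == NULL)
        return 0;
    return strftime(buf, buflen, "%Y-%m-%d %H:%M:%S", &tm_buf);
}
```

PHP userland code cannot restructure the engine's handler this way, which is why the practical fix here was to keep the shutdown hook free of unsafe calls.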
