Zero‑Downtime Deployments with Gunicorn and uWSGI: Reload Strategies & Scripts
Learn how to achieve zero‑downtime application upgrades by using Gunicorn’s HUP reload, uWSGI’s chained reload, load‑balancer node draining, and custom health‑check scripts that coordinate process restarts, ensuring minimal service interruption even for apps with long startup times.
When you first deploy an application, the simplest method is to restart the entire service or the whole cluster as an administrator, which works initially but soon leads to many HTTP 503 errors for clients trying to connect during the restart.
Gunicorn and uWSGI can reload the application without closing the listening socket, so requests are only delayed briefly. This works well if the app starts quickly, but many applications take up to a minute to start, which is too long for connections waiting on the socket.
Gunicorn is reloaded by sending kill -HUP $PID to the master process, which stops all worker processes and starts new ones; when workers are slow to initialize, requests are delayed until the new workers are up. uWSGI offers a chained reload that restarts one worker at a time, but uWSGI lacks good support for Tornado.
Using a Load Balancer
A common technique is to remove a single server from the load balancer, upgrade or restart the application on that node, and then add it back. We use HAProxy to drain connections from the node. Instead of deploying to all nodes simultaneously, we now deploy node by node. While the node is being upgraded, it can answer the health check with a temporary 404 so that the load balancer keeps it out of the pool. Two failed health checks are about five seconds apart, which includes the time for the web process to recover after the upgrade.
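The node-by-node procedure can be sketched as a small loop. This is only an illustration under stated assumptions, not a description of any particular deploy tool: `drain`, `upgrade`, `is_healthy`, and `enable` are hypothetical callables standing in for whatever your environment provides (for example load-balancer commands and your own deploy step), and the timeout values are placeholders.

```python
import time


def rolling_deploy(nodes, drain, upgrade, is_healthy, enable,
                   timeout=120.0, poll_interval=5.0, sleep=time.sleep):
    """Upgrade nodes one at a time; stop at the first node that fails to recover.

    drain, upgrade, is_healthy and enable are hypothetical callables that each
    take a node name. Returns the list of nodes that were upgraded successfully.
    """
    done = []
    for node in nodes:
        drain(node)      # take the node out of the load balancer pool
        upgrade(node)    # deploy and restart the application on that node
        waited = 0.0
        while not is_healthy(node):
            if waited >= timeout:
                # Leave the broken node drained and abort the rollout.
                raise RuntimeError(
                    f"{node} did not become healthy; upgraded so far: {done}"
                )
            sleep(poll_interval)
            waited += poll_interval
        enable(node)     # put the node back into rotation
        done.append(node)
    return done
```

Aborting on the first unhealthy node keeps a bad release from spreading: the remaining nodes continue to serve the old version while the failed node stays out of the pool.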
Gunicorn Reload ++
Because Gunicorn automatically restarts failed web workers, you can kill each worker process in turn and sleep in between until all child processes have been replaced. This works, but if the time a restart takes varies significantly, a fixed sleep means you either wait too long or accept a higher risk of downtime.
Gunicorn exposes Python server hooks, so you can write a small piece of code that notifies a restart manager when a worker is ready. Gunicorn does not provide this notification out of the box, but adding it is straightforward.
During the restart, the remaining workers keep accepting connections on the shared socket, so capacity drops by only 1/N (one of N workers) at any moment, and traffic is handled without long client waits.
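To make the trade-off concrete, here is a small back-of-the-envelope calculation. The worker count and startup time used below are illustrative assumptions, not measurements.

```python
def rolling_restart_estimate(workers, startup_seconds):
    """Estimate capacity and duration when restarting one worker at a time."""
    if workers < 1:
        raise ValueError("need at least one worker")
    # While one worker restarts, the other workers keep serving.
    capacity_during_restart = (workers - 1) / workers
    # Workers are restarted sequentially, so the durations add up.
    total_seconds = workers * startup_seconds
    return capacity_during_restart, total_seconds


capacity, duration = rolling_restart_estimate(workers=4, startup_seconds=60)
print(f"capacity during restart: {capacity:.0%}, total time: {duration} s")
# prints "capacity during restart: 75%, total time: 240 s"
```

With a single worker the remaining capacity is zero, so this approach only helps when at least two workers share the socket.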
<code>for child_pid of gunicorn-master:
    kill child_pid
    wait for app startup
</code>
My first version used a shell script and nc to listen for a UDP packet indicating that the app had started. Integrating this process manager into the shell environment was a bit more complex than expected, but it works well.
The restart script should be invoked with the Gunicorn master PID, e.g., masterrestart.sh $PID.
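<code>#!/bin/sh
# Usage: masterrestart.sh MASTER_PID
# Restart Gunicorn workers one at a time, waiting for each replacement to report ready.
echo "Killing children of $1"
children=$(pgrep -P "$1")
for child in $children
do
    echo "Killing $child"
    kill "$child"
    # Wait (up to 60 s) for the new worker's UDP ready message on port 4012.
    response=$(timeout 60 nc -w 0 -ul 4012)
    if [ "$response" != '200 OK' ]; then
        echo 'BROKEN'
        exit 1
    fi
done
</code>
We hook into Gunicorn's post_worker_init so that each new worker notifies the restart script when it is ready.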
<code>import socket

from werkzeug.wrappers import Request, Response


def post_worker_init(worker):
    # Gunicorn server hook: called just after a worker has initialized the application.
    _send_udp('200 OK\n')


def _send_udp(message):
    # Notify the restart script listening on UDP port 4012.
    udp_ip = "127.0.0.1"
    udp_port = 4012
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # sendto() requires bytes on Python 3, so encode the text first.
    sock.sendto(message.encode('utf-8'), (udp_ip, udp_port))
    sock.close()


# Example WSGI application
@Request.application
def application(request):
    resp = Response('Hello World!')
    if request.path == '/_status':
        resp.status = '200 OK'
    else:
        resp.status = '404 Not Found'
    return resp
</code>
The hook can also perform a health check by requesting the /_status endpoint, which verifies that the application actually responds before it reports ready.
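<code>def post_worker_init(worker):
    # Minimal WSGI environ for an in-process request to /_status.
    env = {
        'REQUEST_METHOD': 'GET',
        'PATH_INFO': '/_status',
    }

    def start_response(*args, **kwargs):
        # args[0] is the status line, e.g. '200 OK'; forward it to the restart script.
        _send_udp(args[0])

    # Call the loaded WSGI application directly; no network round trip is involved.
    worker.wsgi(env, start_response)
</code>
Be careful not to put too many checks into this health check: if post_worker_init raises an error, the worker exits and the app cannot start, which can be problematic when, for example, database connections are flaky.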
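The shell restart script shown earlier can also be sketched in Python, which avoids depending on flags that differ between nc variants. This is a minimal sketch under the same assumptions as that script (a `200 OK` ready message on UDP port 4012 on localhost); `child_pids`, `terminate`, and `rolling_restart` are hypothetical helper names, not part of Gunicorn.

```python
import os
import signal
import socket
import subprocess


def child_pids(master_pid):
    """Return the direct children of master_pid, using pgrep like the shell script."""
    result = subprocess.run(
        ["pgrep", "-P", str(master_pid)], capture_output=True, text=True
    )
    return [int(pid) for pid in result.stdout.split()]


def terminate(pid):
    """Send SIGTERM, the same signal a plain `kill` sends."""
    os.kill(pid, signal.SIGTERM)


def rolling_restart(pids, kill=terminate, port=4012, timeout=60.0,
                    host="127.0.0.1"):
    """Kill workers one at a time; return True if every replacement reported ready."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Bind before killing anything so a fast worker's message is not missed.
        sock.bind((host, port))
        sock.settimeout(timeout)
        for pid in pids:
            kill(pid)
            try:
                data, _addr = sock.recvfrom(1024)
            except socket.timeout:
                return False  # replacement worker never reported in
            if data.decode("utf-8", errors="replace").strip() != "200 OK":
                return False  # worker came up but reported a bad status
    return True
```

It could be driven with something like `rolling_restart(child_pids(master_pid))`, where `master_pid` is the Gunicorn master PID. Keeping one socket bound for the whole loop closes the small window in which a ready message could arrive before the listener starts.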
With a one‑minute application startup time, we have achieved rolling restarts without stopping the service or dropping any connections.