What’s the Difference Between HTTP 502, 503, and 504? A Guide for Ops Engineers
This article explains the HTTP 5xx status codes 502, 503, and 504, detailing their definitions, typical trigger scenarios, step‑by‑step troubleshooting flows, practical Bash scripts, comparison tables, real‑world case studies, and monitoring/alerting configurations to help operations engineers quickly pinpoint and resolve these errors.
HTTP 5xx Status Code Overview
The HTTP status code space is divided into classes; 5xx indicates server‑side errors. The three codes most relevant to gateway problems are:
502 Bad Gateway
503 Service Unavailable
504 Gateway Timeout
Typical Nginx configuration for detailed logging and custom error pages:
log_format detailed '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" rt=$request_time uct="$upstream_connect_time" uht="$upstream_header_time" urt="$upstream_response_time"';
access_log /var/log/nginx/detailed.log detailed;
error_page 502 503 504 /50x.html;
location = /50x.html { root /usr/share/nginx/html; internal; }
1. 502 Bad Gateway
Definition
Returned when the gateway or proxy receives an invalid response from the upstream server.
Typical Trigger Scenarios
Backend service not started (check with systemctl status php-fpm).
Incorrect upstream port configuration.
Backend process crash or OOM kill.
Connection exhaustion (e.g., pm.max_children limit reached).
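The scenarios above usually leave different traces in the Nginx error log. As a rough sketch, the function below maps a log line to a likely cause. The message fragments are ones Nginx commonly writes for upstream failures; the function name and the cause labels are illustrative assumptions, and the mapping is a heuristic, not a diagnosis.

```shell
#!/bin/bash
# Heuristic mapping from one Nginx error-log line to a likely 502 cause.
# The cause labels are this sketch's own wording, not Nginx output.
classify_502_cause() {
    local line=$1
    case "$line" in
        *"Connection refused"*)
            echo "backend not listening (stopped service or wrong port)" ;;
        *"No such file or directory"*)
            echo "unix socket missing (backend stopped or wrong socket path)" ;;
        *"Resource temporarily unavailable"*)
            echo "backend backlog full (possible worker exhaustion)" ;;
        *"prematurely closed connection"*)
            echo "backend closed mid-request (possible crash or OOM kill)" ;;
        *"Connection reset by peer"*)
            echo "backend reset the connection (possible crash or restart)" ;;
        *)
            echo "unclassified" ;;
    esac
}

# Demo on a made-up sample line (not taken from a real system).
classify_502_cause '[error] 123#123: *1 connect() failed (111: Connection refused) while connecting to upstream'
```

To summarise recent entries, each line of tail -100 /var/log/nginx/error.log can be passed through the function and the results piped to sort | uniq -c.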
Diagnostic Flowchart
502 error occurs
├─ Step 1: Verify Nginx can connect to backend (telnet, nc, ss)
├─ Step 2: Check backend service status (systemctl, ps)
├─ Step 3: Inspect backend resources (logs, dmesg, free)
└─ Step 4: Review Nginx error logs
Practical Bash Script (check_502.sh)
#!/bin/bash
# 502 quick‑check script
echo "=========================================="
echo " 502 error check"
echo "=========================================="
# Nginx status
systemctl is-active nginx && echo "✓ Nginx running" || echo "✗ Nginx stopped"
ss -tlnp | grep :80
# PHP‑FPM status
systemctl is-active php-fpm && echo "✓ PHP‑FPM running" || echo "✗ PHP‑FPM stopped"
ps aux | grep -E "php-fpm|php-cgi" | grep -v grep
# Port listening
ss -tlnp | grep -E ":80|:9000|:9001"
# Recent Nginx error log
tail -20 /var/log/nginx/error.log
# PHP‑FPM config snippet
grep -E "^pm|^request_terminate" /etc/php-fpm.d/www.conf
# Restart services only if needed (uncomment after reviewing the output above)
# sudo systemctl restart php-fpm
# sudo systemctl restart nginx
echo "=========================================="
echo " Check completed"
echo "=========================================="
Real‑World Case – Backend Crash (OOM)
Symptom : Intermittent 502 errors.
Investigation :
# Check Nginx error log for upstream connection failures
tail -100 /var/log/nginx/error.log | grep -E "connect\(\) failed|prematurely closed|reset by peer"
# Verify PHP‑FPM status
systemctl status php-fpm
# Look for OOM kills
dmesg | grep -i "out of memory"
free -h
# Review PHP‑FPM pool settings
grep -E "^pm|^request_terminate" /etc/php-fpm.d/www.conf
Root Cause : PHP‑FPM workers exhausted memory and were killed by the OOM killer.
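A common way to sanity-check a pool size against available memory is to divide the memory left for PHP by the memory one worker uses. The sketch below shows only that arithmetic. The function name and the sample figures are assumptions for illustration; the per-worker value should be measured on the host (for example from the RSS column of ps), since memory_limit is a per-script ceiling and not the actual worker footprint.

```shell
#!/bin/bash
# Rough upper bound for pm.max_children:
#   (total RAM - RAM reserved for everything else) / memory per worker
# All three inputs are in MB and must be measured or chosen by the operator.
estimate_max_children() {
    local total_mb=$1 reserved_mb=$2 per_worker_mb=$3
    if [ "$per_worker_mb" -le 0 ] || [ "$reserved_mb" -ge "$total_mb" ]; then
        echo 0
        return 1
    fi
    echo $(( (total_mb - reserved_mb) / per_worker_mb ))
}

# Hypothetical host: 4096 MB RAM, 1536 MB kept for OS/Nginx/DB, 128 MB per worker
estimate_max_children 4096 1536 128   # prints 20
```

If the configured pm.max_children is well above this estimate, the pool can outgrow physical memory under load, which is the situation the OOM killer reacts to.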
Fix :
# Temporary restart
sudo systemctl start php-fpm
# Adjust pool configuration in /etc/php-fpm.d/www.conf
[www]
pm = dynamic
pm.max_children = 20
pm.start_servers = 3
pm.min_spare_servers = 2
pm.max_spare_servers = 5
pm.max_requests = 200
php_admin_value[memory_limit] = 128M
# Apply changes
sudo systemctl restart php-fpm
sudo systemctl restart nginx
2. 503 Service Unavailable
Definition
Returned when the server is temporarily unable to handle the request, usually due to overload or maintenance.
Typical Trigger Scenarios
Backend deliberately returns 503 (maintenance mode).
Request rate exceeds the limit configured via limit_req (Nginx rejects the excess with 503 by default).
Connection count limits via limit_conn are hit.
Backend overload – worker processes exhausted.
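Whether a 503 came from Nginx itself (limits, maintenance) or from the backend can often be read from the access log. The sketch below assumes the "detailed" log format, where each line carries urt="..." and the status is the ninth whitespace-separated field; Nginx logs "-" for the upstream time when no upstream was contacted. The function name is hypothetical.

```shell
#!/bin/bash
# Reads access-log lines on stdin and counts 503 responses by apparent origin.
# Assumption: lines follow the "detailed" format, so field 9 is the status
# and urt="-" means Nginx answered without contacting an upstream.
count_503_origin() {
    awk '
        $9 == "503" {
            if ($0 ~ /urt="-"/) nginx++
            else backend++
        }
        END { printf "nginx=%d backend=%d\n", nginx, backend }
    '
}
```

A typical call would be count_503_origin < /var/log/nginx/detailed.log. A high nginx count points at limit_req, limit_conn, or a maintenance rule; a high backend count points at the application or its worker pool.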
Rate‑Limiting Example
# Nginx rate‑limit configuration
limit_req_zone $binary_remote_addr zone=one:10m rate=10r/s;
server {
    listen 80;
    location / {
        limit_req zone=one burst=20 nodelay;
        proxy_pass http://backend;
    }
    error_page 503 /503.html;
    location = /503.html { root /usr/share/nginx/html; internal; }
}
Diagnostic Flowchart
503 error occurs
├─ Determine source (Nginx vs backend)
│ ├─ Check response headers (curl -I)
│ └─ Identify origin
├─ If Nginx returned 503
│ ├─ Review limit_req and limit_conn settings
│ └─ Verify maintenance flag
└─ If backend returned 503
  ├─ Inspect backend load and logs
  └─ Check worker pool status
Real‑World Case – Rate‑Limiting Too Strict
Symptom : During a promotion many users receive 503.
Investigation :
# Find limit_req configuration
grep -r "limit_req" /etc/nginx/
# Look for limiting log entries
tail -100 /var/log/nginx/error.log | grep "limiting"
# Check PHP‑FPM status page
curl http://127.0.0.1/status
Root Cause : limit_req_zone rate was set to 10r/s per client IP, which was too low for the traffic spike.
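To see why a 10r/s limit with burst=20 rejects traffic so quickly during a spike, it helps to model the accounting. The sketch below is a simplified leaky-bucket simulation loosely modeled on how limit_req tracks excess requests. It is an approximation for reasoning about rate and burst, not Nginx's implementation, and the function name is hypothetical. Input is one arrival time in milliseconds per line for a single client.

```shell
#!/bin/bash
# Simplified leaky-bucket model for one client key.
#   $1 = rate in requests/second, $2 = burst in requests
# stdin: arrival timestamps in milliseconds, ascending, one per line.
# "excess" is tracked in thousandths of a request to stay in integer math.
simulate_limit_req() {
    local rate=$1 burst=$2
    local excess=0 last=-1 accepted=0 rejected=0 t candidate
    while read -r t; do
        [ -z "$t" ] && continue
        if [ "$last" -lt 0 ]; then
            candidate=0                                   # first request seen
        else
            # drain rate*elapsed_ms, then add one request (1000 units)
            candidate=$(( excess - rate * (t - last) + 1000 ))
            if [ "$candidate" -lt 0 ]; then candidate=0; fi
        fi
        if [ "$candidate" -gt $(( burst * 1000 )) ]; then
            rejected=$(( rejected + 1 ))                  # would be answered with 503
        else
            accepted=$(( accepted + 1 ))
            excess=$candidate
            last=$t
        fi
    done
    echo "accepted=$accepted rejected=$rejected"
}

# 30 requests arriving in the same millisecond, rate=10r/s, burst=20
for i in $(seq 1 30); do echo 0; done | simulate_limit_req 10 20   # prints accepted=21 rejected=9
```

In this model a client can send burst + 1 requests at once; anything beyond that is rejected until the bucket drains at the configured rate. Many users behind one NAT address share a single bucket, which is one way a promotion can exhaust a per-IP limit.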
Fix – raise the rate and burst (a larger zone only adds room for more client states):
# Updated Nginx config
limit_req_zone $binary_remote_addr zone=one:100m rate=100r/s;
limit_req_zone $binary_remote_addr zone=api:50m rate=50r/s;
server {
    listen 80;
    location / { limit_req zone=one burst=200 nodelay; proxy_pass http://backend; }
    location /api/ { limit_req zone=api burst=50 nodelay; proxy_pass http://api_backend; }
    location /static/ { limit_req zone=one burst=500; proxy_pass http://static_backend; expires 7d; }
    error_page 503 @maintenance;
    location @maintenance { root /var/www; rewrite ^(.*)$ /maintenance.html break; }
}
3. 504 Gateway Timeout
Definition
Returned when the gateway does not receive a timely response from the upstream server.
Typical Trigger Scenarios
Backend processing takes too long (slow business logic).
Slow database queries (MySQL slow‑query log).
Proxy or FastCGI timeout settings are too low.
Large file uploads exceeding timeout limits.
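Requests that run close to the configured timeout can be spotted in the access log before they turn into 504s. The sketch below assumes the "detailed" log format, where each line carries urt="seconds" and the request URI is the seventh whitespace-separated field. The function name and the threshold are illustrative assumptions.

```shell
#!/bin/bash
# Prints "<upstream seconds> <request URI>" for every logged request whose
# upstream response time is at or above the threshold (in seconds).
# Assumption: lines carry urt="<seconds>" as in the "detailed" log format.
slow_upstream_requests() {
    local threshold=$1
    awk -v limit="$threshold" '
        match($0, /urt="[0-9.]+"/) {
            # strip the leading urt=" (5 chars) and the closing quote
            urt = substr($0, RSTART + 5, RLENGTH - 6)
            if (urt + 0 >= limit + 0) print urt, $7
        }
    '
}
```

For example, slow_upstream_requests 30 < /var/log/nginx/detailed.log lists requests that spent 30 seconds or more in the backend. Lines logged with urt="-" are skipped because no upstream was involved.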
Timeout Configuration Example
# /etc/nginx/nginx.conf – global timeouts (http context)
proxy_connect_timeout 60s;
proxy_send_timeout 60s;
proxy_read_timeout 60s;
fastcgi_connect_timeout 60s;
fastcgi_send_timeout 60s;
fastcgi_read_timeout 60s;
server {
    listen 80;
    server_name example.com;
    location / {
        proxy_pass http://backend;
        # Note: connect timeouts above roughly 75s usually have no effect
        proxy_connect_timeout 300s;
        proxy_read_timeout 300s;
    }
    location /upload/ {
        client_max_body_size 100m;
        # proxy_* timeouts only apply where a proxy_pass is present
        proxy_pass http://backend;
        proxy_send_timeout 600s;
        proxy_read_timeout 600s;
    }
    error_page 502 503 504 /50x.html;
    location = /50x.html { root /usr/share/nginx/html; internal; }
}
Diagnostic Flowchart
504 error occurs
├─ Identify which timeout fired (proxy_read_timeout, fastcgi_read_timeout, PHP max_execution_time)
├─ Check backend logs (PHP‑FPM slow log, application logs, DB slow query log)
├─ Examine backend performance (CPU, memory, connection pool)
└─ Optimize: increase timeout, improve code, use async processing
Real‑World Case – Slow MySQL Query
Symptom : API calls frequently time out with 504.
Investigation :
# Nginx error log for upstream timeouts
grep "upstream timed out" /var/log/nginx/error.log | tail -20
# PHP‑FPM slow log
tail -50 /var/log/php-fpm/www-slow.log
# MySQL: running queries and slow query log settings
mysql -u root -p -e "SHOW PROCESSLIST;"
mysql -u root -p -e "SHOW VARIABLES LIKE 'slow_query%';"
tail -20 /var/log/mysql/slow.log
Root Cause : Full‑table scan on huge_table made queries run longer than 30 s.
Fix – add index and paginate results:
// Optimised PHP query with pagination
$page = isset($_GET['page']) ? (int)$_GET['page'] : 1;
$perPage = 100;
$offset = ($page - 1) * $perPage;
$stmt = $conn->prepare("SELECT * FROM huge_table WHERE created_at < ? ORDER BY id LIMIT ? OFFSET ?");
$stmt->bind_param("sii", $date, $perPage, $offset);
$stmt->execute();
-- Add index on created_at
ALTER TABLE huge_table ADD INDEX idx_created_at (created_at);
-- Verify with EXPLAIN
EXPLAIN SELECT * FROM huge_table WHERE created_at < '2026-01-01';
4. Comparative Summary of 502, 503, 504
Meaning
502 – gateway received an invalid response.
503 – service temporarily unavailable.
504 – gateway timed out waiting for a response.
Problem Location
502 – backend connection or crash.
503 – rate‑limit, overload, or maintenance.
504 – backend processing too slow.
Nginx View
502 – connection failed or the backend returned an invalid response.
503 – request rejected, either by Nginx limits or by the backend itself.
504 – connection succeeded but response timed out.
Common Causes
502 – backend not started, wrong port, OOM kill.
503 – strict limit_req, worker exhaustion, maintenance flag.
504 – slow query, long‑running request, insufficient timeout.
Resolution Direction
502 – check and restart backend services.
503 – scale, adjust limits, disable maintenance.
504 – optimise backend code, increase timeouts, use async processing.
5. Monitoring & Alerting
5.1 Bash Script for 5xx Error‑Rate Monitoring
#!/bin/bash
LOG_FILE="/var/log/nginx/access.log"
ALERT_THRESHOLD=5
# Use the last complete minute; the current minute is still being written (GNU date)
last_minute=$(LC_ALL=C date -d '1 minute ago' +"%d/%b/%Y:%H:%M")
total_requests=$(grep -c "$last_minute" "$LOG_FILE")
# Field 9 is the status code in the default combined log format
error_5xx=$(grep "$last_minute" "$LOG_FILE" | awk '$9 ~ /^5[0-9][0-9]$/' | wc -l)
if [ "$total_requests" -gt 0 ]; then
    error_rate=$(echo "scale=2; $error_5xx * 100 / $total_requests" | bc)
    if (( $(echo "$error_rate > $ALERT_THRESHOLD" | bc -l) )); then
        echo "⚠️ Alert: 5xx error rate $error_rate% exceeds $ALERT_THRESHOLD%"
        # Hook to Prometheus/Zabbix alerting here
    fi
fi
5.2 Prometheus Alert Rules (excerpt)
groups:
  - name: nginx_5xx_alerts
    rules:
      - alert: NginxHigh502ErrorRate
        expr: |
          sum(rate(nginx_http_requests_total{status=~"502"}[5m]))
          / sum(rate(nginx_http_requests_total[5m])) * 100 > 5
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Nginx 502 error rate too high"
          description: "502 error rate > 5% (current: {{ $value }}%)"
      - alert: NginxHigh503ErrorRate
        expr: |
          sum(rate(nginx_http_requests_total{status=~"503"}[5m]))
          / sum(rate(nginx_http_requests_total[5m])) * 100 > 5
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Nginx 503 error rate too high"
          description: "503 error rate > 5% (current: {{ $value }}%)"
      - alert: NginxHigh504ErrorRate
        expr: |
          sum(rate(nginx_http_requests_total{status=~"504"}[5m]))
          / sum(rate(nginx_http_requests_total[5m])) * 100 > 5
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Nginx 504 error rate too high"
          description: "504 error rate > 5% (current: {{ $value }}%)"
6. Quick Checklist for 5xx Incidents
# 1. Verify Nginx is running
systemctl is-active nginx && echo "✓ Nginx running" || echo "✗ Nginx stopped"
# 2. Verify backend services (php-fpm, node, java, etc.)
for svc in php-fpm php80-php-fpm php74-php-fpm node java python; do
    systemctl is-active "$svc" && echo "✓ $svc running" || echo "✗ $svc stopped"
done
# 3. Check key ports
ss -tlnp | grep -E ":80|:443|:8080|:9000|:9001"
# 4. 5xx counts per status code (current minute)
current_time=$(date +"%d/%b/%Y:%H:%M")
grep "$current_time" /var/log/nginx/access.log | awk '$9 ~ /^5[0-9][0-9]$/ {print $9}' | sort | uniq -c
# 5. Review Nginx error log (last 20 lines)
tail -20 /var/log/nginx/error.log
# 6. Check PHP‑FPM status page (if configured)
curl -s http://127.0.0.1/status || echo "PHP‑FPM status page not configured"
# 7. Current connection count
ss -ant | wc -l
Following this systematic approach lets you quickly pinpoint whether the failure lies in Nginx configuration, backend availability, resource exhaustion, or application performance, and apply the appropriate remediation.
