How to Diagnose and Fix 502 Bad Gateway Errors in Nginx
This article explains what a 502 Bad Gateway response means, how Nginx as a reverse‑proxy generates it, common root causes such as upstream timeouts or crashes, and step‑by‑step methods to locate the problem using logs, monitoring data and configuration checks.
HTTP status codes
200 indicates success, 4xx indicates a client‑side error, and 5xx indicates a server‑side problem. When a 5xx is returned, the component that actually generates it is often a gateway or reverse proxy such as nginx, not the application itself.
nginx as reverse proxy and load balancer
nginx sits between the client and a pool of backend servers. For each client request it opens a separate TCP connection to an upstream server, turning one client‑nginx connection into two TCP connections.
If the upstream does not produce a valid response, nginx translates the TCP‑level failure (an RST, or a FIN before a complete response) into a 5xx response, most commonly 502 Bad Gateway. This is why the backend's own logs may contain no 502 entries while the nginx logs do: the 502 is generated by nginx, not by the application.
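A minimal proxy configuration of this shape looks like the following (the pool name and addresses here are illustrative, not taken from the original setup):

```nginx
http {
    upstream backend {
        # the pool nginx opens the second TCP connection to
        server 10.14.12.19:9235;
    }
    server {
        listen 80;
        location / {
            proxy_pass http://backend;
        }
    }
}
```

Every client request that matches `location /` is forwarded over a separate connection to one of the `upstream` servers, which is where the second TCP connection comes from.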
RFC 7231 definition of 502
502 Bad Gateway – the server, while acting as a gateway or proxy, received an invalid response from an inbound server.
Typical causes of 502
Upstream timeout : the backend closes the connection before sending a response because a server‑side deadline expires. Example in Go: WriteTimeout = 2s while the handler needs 5s; nginx sees a FIN/RST and returns 502.
Upstream crash or process exit : the backend process exits (OOM, log.Fatalln(), os.Exit(), etc.). The kernel replies with RST, which nginx maps to 502.
Stale upstream registration : service‑registration mechanism fails to update nginx configuration, so nginx forwards to an IP where no process is listening. The kernel sends RST and nginx returns 502.
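The timeout case above can be reproduced in a few lines of Go. This is a sketch with scaled‑down durations (100 ms deadline vs. a 500 ms handler, rather than 2 s vs. 5 s); the `/slow` path and the loopback port are made up for the demonstration:

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// demo starts a backend whose WriteTimeout (100ms) is shorter than its
// handler needs (500ms), then issues one request. It returns true when
// the client sees the connection die before a valid response arrives,
// which is exactly the failure nginx would translate into a 502.
func demo() bool {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // any free port
	if err != nil {
		return false
	}
	defer ln.Close()

	mux := http.NewServeMux()
	mux.HandleFunc("/slow", func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(500 * time.Millisecond) // handler outlives the write deadline
		fmt.Fprintln(w, "done")
	})
	srv := &http.Server{Handler: mux, WriteTimeout: 100 * time.Millisecond}
	go srv.Serve(ln)

	_, err = http.Get("http://" + ln.Addr().String() + "/slow")
	return err != nil // EOF / reset: no valid HTTP response ever arrived
}

func main() {
	fmt.Println("connection aborted before a valid response:", demo())
}
```

The client's `http.Get` fails with an EOF because the server kills the connection when the handler tries to write past the deadline; a reverse proxy in the client's position would report 502.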
TCP reset (RST) and FIN
RST is a TCP flag that aborts a connection abruptly; the receiving side sees “connection reset”, or “connection refused” when nothing is listening on the port. FIN is the normal termination exchanged during the four‑way teardown. Either one, arriving before a complete HTTP response, is treated by nginx as an invalid upstream response.
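The “connection refused” case, an RST answered from a port with no listener, as happens with a stale upstream entry, can also be shown in a short Go sketch (the port is whatever free one the OS hands out):

```go
package main

import (
	"fmt"
	"net"
)

// refusedDial reserves a free port, closes the listener so that nothing
// is listening there any more (like a stale upstream registration), then
// dials it. The kernel answers the SYN with an RST.
func refusedDial() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	addr := ln.Addr().String()
	ln.Close() // no process listens on this address any more

	_, err = net.Dial("tcp", addr)
	return err // "connection refused"
}

func main() {
	fmt.Println("dial error:", refusedDial())
}
```

When nginx hits this on a forwarded request, it maps the refused connection to a 502 for the client.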
Investigation workflow
Check server‑side monitoring (CPU, memory) for sudden drops that indicate a crash or OOM kill.
Verify the backend process start time with ps -o lstart <pid> and compare the timestamp with the incident time; a mismatch suggests a restart.
Search application logs for stack traces. Absence of a stack trace may mean the process was killed by the kernel (OOM) or exited via os.Exit().
Inspect nginx access and error logs for the upstream IP and port. Look for mismatched or stale IPs that do not correspond to running instances.
If the backend timed out, examine the HTTP server timeout settings (e.g., Go WriteTimeout, ReadTimeout) and increase them if the handler legitimately needs more time.
When a crash is suspected, locate the crash stack trace (e.g., Go panic output) to identify programming errors such as nil‑pointer dereference or out‑of‑bounds access.
For OOM‑related exits, check the kernel log (dmesg, /var/log/messages) for “Out of memory: Killed process …” entries.
Example commands and configuration
Check process start time:
# ps -o lstart 13515
STARTED
Wed Aug 31 14:28:53 2022

Typical nginx upstream block (file /etc/nginx/nginx.conf) with weighted servers:
upstream xiaobaidebug.top {
server 10.14.12.19:9235 weight=2;
server 10.14.16.13:8145 weight=5;
server 10.14.12.133:9702 weight=8;
server 10.14.11.15:7035 weight=10;
}

When service registration is dynamic, the upstream block must be regenerated and nginx reloaded (nginx -s reload); otherwise stale entries cause 502 responses.
Summary of diagnostic steps
A 502 usually originates from nginx receiving an invalid response from the upstream, often nothing more than an RST or a premature FIN at the TCP level.
First verify whether the backend process crashed or timed out; adjust timeout parameters (e.g., increase WriteTimeout) if needed.
If the backend is healthy, examine nginx configuration and logs for stale or missing upstream entries.
Use monitoring data and ps -o lstart to detect unexpected restarts; check for OOM kills or explicit exits.
Linux Tech Enthusiast
Focused on sharing practical Linux technology content, covering Linux fundamentals, applications, tools, as well as databases, operating systems, network security, and other technical knowledge.