Avoidable P1 Outage: How Nginx Changes Caused All Gateway Requests to Return 400
A production change replaced two Nginx reverse‑proxy servers, introduced an upstream name containing an underscore, broke the Host header required by HTTP/1.1, and caused Spring Cloud Gateway to return 400 Bad Request for every request until the configuration was corrected.
Problem Background
The project uses Spring Cloud Gateway to receive external traffic, with two Nginx instances acting as reverse proxies in front of the gateway. The request flow is:
gateway domain → Nginx → F5 (load balancer) → gateway → backend services.
SRE P1 Incident
During a production change, the operations team swapped the original two Nginx machines for two new ones. Immediately after traffic was switched to the new Nginx, all gateway API calls returned HTTP 400. Rolling back to the old Nginx restored normal responses, and the incident was classified as a P1.
Investigation
Initial logs from the gateway showed no errors. Nginx logs revealed that OPTIONS requests returned 204, while GET/POST requests consistently returned 400. The team suspected malformed request bodies but could not obtain raw traffic captures from the network team.
Comparing the old and new Nginx configurations uncovered a key difference: the new configuration introduced an upstream http_gateways block and enabled HTTP/1.1 with proxy_http_version 1.1 and proxy_set_header Connection "".
Reproduction
In a test environment, the new Nginx was set to listen on port 9104 and proxy to the gateway on port 8081. A simple request returned 400:
curl -v -X GET http://10.100.8.11:9104/wechat-web/actuator/info
Removing proxy_http_version 1.1 and proxy_set_header Connection "" made the request succeed with 200.
Deep Investigation
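Using tcpdump, the captured traffic showed the Host header value was http_gateways, which contains an underscore. RFC 2616 (since superseded by RFC 7230 and RFC 9112) requires every HTTP/1.1 request to carry a Host header, and RFC 7230 requires a server to respond with 400 when the header is missing or its value is invalid. Hostname syntax (RFC 952 and RFC 1123) allows only letters, digits, and hyphens, so a strict parser treats http_gateways as an invalid host.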
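The hostname rule can be checked mechanically. The following is a minimal sketch, assuming the letters-digits-hyphen label syntax of RFC 952 and RFC 1123; the function name is hypothetical, and it checks only the host part, so any ":port" suffix would have to be stripped first.

```python
import re

# One DNS label: letters, digits, and hyphens, not starting or ending
# with a hyphen, at most 63 characters.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_hostname(name: str) -> bool:
    """Return True if name is a syntactically valid hostname (no port)."""
    if not name or len(name) > 253:
        return False
    # A single trailing dot (fully qualified form) is allowed.
    if name.endswith("."):
        name = name[:-1]
    return all(_LABEL.fullmatch(label) for label in name.split("."))
```

With this check, http_gateways is rejected because of the underscore, while http-gateways and wmg.test.com pass.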
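The Nginx documentation for proxy_set_header states that, unless overridden, Nginx sends Host $proxy_host to the upstream. $proxy_host is the name and port of the proxied server as written in the proxy_pass directive, so with proxy_pass http://http_gateways; the Host header becomes the upstream name http_gateways, underscore included, which produces an illegal Host header.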
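The default can be modelled in a few lines to make the behaviour concrete. This is a rough sketch, not a real Nginx parser: the function name is hypothetical, it looks at the text of a single location block only, and it ignores inheritance from enclosing levels.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

_PROXY_PASS = re.compile(r"^\s*proxy_pass\s+(\S+?)\s*;", re.MULTILINE)
_HOST_OVERRIDE = re.compile(
    r"^\s*proxy_set_header\s+Host\s+(\S+?)\s*;",
    re.MULTILINE | re.IGNORECASE,
)


def host_sent_upstream(location_body: str) -> Optional[str]:
    """Model which Host value a location block would send upstream.

    Returns the explicit proxy_set_header Host value if one is present,
    otherwise the authority part of the proxy_pass target (what
    $proxy_host would expand to), or None if nothing is proxied.
    """
    override = _HOST_OVERRIDE.search(location_body)
    if override:
        return override.group(1)
    target = _PROXY_PASS.search(location_body)
    if target is None:
        return None
    return urlsplit(target.group(1)).netloc
```

Run against a location body that contains only proxy_pass http://http_gateways;, the model yields http_gateways; once proxy_set_header Host $host; is added, it yields $host instead.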
Spring Cloud Gateway’s request handling code catches URISyntaxException and responds with HttpResponseStatus.BAD_REQUEST when the Host cannot be parsed, which explains the observed 400 responses.
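One way to confirm that the Host value alone triggers the 400 is to bypass Nginx and send the gateway a request with a hand-set Host header. The sketch below does this over a plain socket; both helper names are hypothetical, and it assumes the gateway is reachable over unencrypted HTTP.

```python
import socket


def build_request(path: str, host: str) -> bytes:
    """Build a minimal HTTP/1.1 GET request with an explicit Host header."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def status_line(addr: tuple, path: str, host: str, timeout: float = 5.0) -> str:
    """Send the request to addr (host, port) and return the response status line."""
    with socket.create_connection(addr, timeout=timeout) as sock:
        sock.sendall(build_request(path, host))
        data = b""
        # Read until the first line of the response is complete.
        while b"\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
```

For example, calling status_line(("10.100.22.48", 8081), "/wechat-web/actuator/info", "http_gateways") and then repeating the call with "wmg.test.com" as the host would compare the two cases directly, assuming that gateway address is reachable from where the script runs. If the analysis above holds, only the first call should come back as 400.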
Solution
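Because HTTP/1.1 requires a valid Host header, the fix is to override the default by adding proxy_set_header Host $host; to the location block that proxies to the gateway, next to proxy_set_header Connection "". The placement matters: Nginx inherits proxy_set_header directives from the enclosing level only when the current level defines none of its own, so a Host override at server level would be ignored by a location that already sets Connection. After updating the test configuration and re‑running the request, the response returned 200.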
Alternative fixes include renaming the upstream to remove the underscore (e.g., http-gateways) or explicitly setting proxy_set_header Host "domain.com".
Resolution
The final Nginx configuration is:
upstream http_gateways {
    server 10.100.22.48:8081;
    keepalive 30;
}
server {
    listen 9104 backlog=512;
    server_name wmg.test.com;
    add_header X-Frame-Options "SAMEORIGIN";
    add_header X-Content-Type-Options "nosniff";
    add_header Content-Security-Policy "frame-ancestors 'self'";
    location / {
        proxy_set_header Host $host;
        proxy_hide_header host;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        client_max_body_size 100m;
        add_header 'Access-Control-Allow-Origin' "$http_origin" always;
        add_header 'Access-Control-Allow-Credentials' 'true' always;
        add_header 'Access-Control-Allow-Methods' 'GET, POST, OPTIONS, DELETE, PUT';
        add_header 'Access-Control-Allow-Headers' '...';
        if ($request_method = 'OPTIONS') { return 204; }
        proxy_pass http://http_gateways;
    }
}
Both the upstream‑rename and explicit Host header approaches are valid alternatives.
Conclusion
The incident demonstrates the importance of validating configuration changes in a test environment, especially for protocol‑level requirements like the HTTP Host header. Proper testing and awareness of Nginx defaults can prevent similar P1 outages.
ITPUB
