Investigation of Massive HTTP 406 Errors Caused by Incorrect Accept Header from a Search Engine Crawler
The article details a real‑world investigation of a surge of HTTP 406 errors on a Chinese e‑learning platform, tracing the issue to malformed Accept headers sent by a search‑engine crawler and illustrating how CDN routing and header negotiation can cause widespread client‑side failures.
Background: A teacher on the Hujiang platform could not log in from abroad, prompting the operations team to investigate an unusually high volume of HTTP 406 (Not Acceptable) responses logged over a 24‑hour period.
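Initial analysis pointed to a front‑end problem, because the errors correlated with the client’s Accept header. The logs showed the failing requests carrying Accept: text/html,application/xhtml+xml,application/xml; while normal API requests from browsers carried Accept: */* (the usual default for script‑initiated requests; an HTML‑oriented list is what a browser sends when navigating to a page).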
By deploying an online patch that recorded full request details, the team captured both faulty and normal requests. Comparing them confirmed that the Accept header was what triggered the 406 responses, and the team reproduced the failure in Postman by sending the same header.
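The Postman reproduction can also be scripted. The following is a minimal sketch using only the Python standard library; the endpoint URL in the usage comment is a hypothetical placeholder, not the platform's real API, and the helper names build_probe and probe_status are made up for illustration.

```python
import urllib.error
import urllib.request

# Accept value as it appeared in the logs (trailing semicolon included).
CRAWLER_ACCEPT = "text/html,application/xhtml+xml,application/xml;"
# Accept value seen on the normal requests.
NORMAL_ACCEPT = "*/*"


def build_probe(url: str, accept: str) -> urllib.request.Request:
    """Build a GET request that carries the given Accept header."""
    return urllib.request.Request(url, headers={"Accept": accept}, method="GET")


def probe_status(url: str, accept: str, timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(build_probe(url, accept), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is still what we want.
        return err.code


# Usage (hypothetical endpoint, replace with the API under test):
#   for accept in (NORMAL_ACCEPT, CRAWLER_ACCEPT):
#       print(accept, "->", probe_status("https://api.example.com/login", accept))
```

If the server enforces the Accept header the way the article describes, the second probe would be expected to come back as 406 while the first succeeds.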
The team consulted the HTTP RFC and learned that a server returns 406 when it cannot produce a response in any media type listed in the request's Accept header. The API returned JSON (application/json), but the request declared that it accepted only HTML and XML media types, so the server had nothing acceptable to send and answered with the error.
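The check behind that 406 can be sketched as follows. This is a simplified illustration, not the platform's actual server code: it ignores q values and other parameters, and the function names are hypothetical.

```python
from typing import List, Optional


def parse_media_ranges(accept: str) -> List[str]:
    """Return the media ranges in an Accept header, dropping parameters such as q."""
    ranges = []
    for part in accept.split(","):
        media_range = part.split(";", 1)[0].strip().lower()
        if media_range:
            ranges.append(media_range)
    return ranges


def matches(media_range: str, content_type: str) -> bool:
    """True if a media range such as 'text/*' or '*/*' covers content_type."""
    if media_range == "*/*":
        return True
    range_type, _, range_subtype = media_range.partition("/")
    ctype, _, csubtype = content_type.lower().partition("/")
    return range_type == ctype and range_subtype in ("*", csubtype)


def status_for(accept: Optional[str], produced: str = "application/json") -> int:
    """Return 200 if the produced type is acceptable to the client, otherwise 406."""
    if accept is None:
        return 200  # no Accept header: the client accepts any media type
    ok = any(matches(r, produced) for r in parse_media_ranges(accept))
    return 200 if ok else 406
```

With the header from the logs, none of text/html, application/xhtml+xml, or application/xml covers application/json, so the sketch yields 406; with */* it yields 200.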
Investigation of the source IPs revealed that the offending requests originated from a few Beijing Unicom nodes, suggesting CDN involvement. Further checks showed that the IPs belonged to a search‑engine crawler whose Accept header differed from the one sent by standard browsers.
After the team temporarily bypassed the CDN for those nodes, the errors disappeared, confirming that the crawler’s requests were the root cause. The crawler later changed its strategy, adding a custom User‑Agent identifier and sending an Accept header that the API handled normally.
Conclusion: A large volume of 406 errors is not necessarily caused by user traffic; it can come from automated agents that send non‑standard Accept headers. When debugging similar issues, developers should monitor which Accept header values reach the service and consider whether a CDN sits in the request path and may be relaying or altering headers.
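One way to monitor Accept header variations is to tally 406 responses by Accept value and client IP. The sketch below assumes the access log has already been parsed into dictionaries with the hypothetical keys status, accept, and client_ip; the real log format is not given in the article.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def top_406_sources(
    records: Iterable[Dict[str, object]], limit: int = 5
) -> List[Tuple[Tuple[str, str], int]]:
    """Count 406 responses per (Accept header, client IP) pair, most frequent first."""
    counts: Counter = Counter()
    for rec in records:
        if rec.get("status") == 406:
            # Key names are assumptions about the parsed log record.
            key = (str(rec.get("accept", "")), str(rec.get("client_ip", "")))
            counts[key] += 1
    return counts.most_common(limit)
```

A handful of (Accept, IP) pairs dominating the result would point to an automated client rather than ordinary users, which matches the pattern described in this investigation.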
Additional technical notes on the Accept header:
• MIME types such as text/html, application/xhtml+xml, and application/xml specify the content formats a client can handle.
• The wildcard */* indicates the client accepts any media type.
• Multiple types can be listed, e.g., Accept: text/html,application/xhtml+xml,application/xml, allowing content negotiation.
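• Quality factors (q) weight preferences, e.g., Accept: text/html;q=0.9,application/xhtml+xml;q=0.7,application/xml,*/*;q=0.5, where higher q values indicate higher preference. A type listed without a q value defaults to q=1, so application/xml is the most preferred type in this example.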
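The notes above can be illustrated with a small parser that turns an Accept header into media ranges ordered by q value. This is a simplified sketch: it reads only the q parameter, does not rank more specific ranges above wildcards, and the function name parse_accept is made up for illustration.

```python
from typing import List, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media_range, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_range = pieces[0].lower()
        if not media_range:
            continue
        q = 1.0  # a range with no q parameter defaults to 1
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 1.0  # assumption: treat an unparsable q as the default
        entries.append((media_range, q))
    # sorted() is stable, so ranges with equal q keep their header order.
    return sorted(entries, key=lambda e: e[1], reverse=True)
```

For the example header in the last bullet, the sketch puts application/xml first (q defaults to 1), followed by text/html (0.9), application/xhtml+xml (0.7), and */* (0.5).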
Hujiang Technology
We focus on the real-world challenges developers face, delivering authentic, practical content and a direct platform for technical networking among developers.