Why Tomcat Thread‑Pool Saturation Crashed Our Service and How to Avoid It
A detailed post‑mortem explains how a sudden traffic surge, insufficient pod count, and a custom thread‑pool bottleneck caused Tomcat thread‑pool saturation, health‑check failures, and a zone‑wide outage, and offers concrete lessons on capacity planning, monitoring, and safe coding practices.
