HSF Thread Pool Exhaustion: Uncovering Init Errors and Reactive Stream Crashes
The article analyzes a critical service outage caused by missing initialization checks for a core dependency, leading to uncaught ExceptionInInitializerError, thread pool exhaustion, and unresponsive HSF services, and outlines the investigation steps, root cause, and three concrete remediation actions.
Conclusion
The root cause of the incident was the lack of effective validation and fallback for core dependency initialization errors during application startup. This caused an uncaught ExceptionInInitializerError, which broke the reactive stream, blocked threads, exhausted the HSF thread pool, and rendered the service unavailable.
Specific Failure Chain
Rich package startup timed out due to network issues → No strong check on core module startup success → Requests reached the machine → Core class threw ExceptionInInitializerError → Upper layer caught only Exception, not the error → Mono stream terminated before onErrorResume could handle it → Subscribers could not consume the stream → blockGet without timeout caused infinite wait, blocking HSF biz threads → HSF thread pool filled, service stopped responding.
Background
One evening, a teammate reported that calls to the xxx service were failing with HSF thread pool full errors for over ten minutes. The service had historically handled hundreds of QPS with low latency, so the prolonged thread pool saturation was puzzling.
Monitoring showed the thread count remained at maximum for more than an hour.
Investigation Process
Thread Snapshot
The thread dump revealed that most HSF biz threads were parked without a timeout.
All threads showed the same stack, pointing to the initialization error.
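As a rough illustration of what "parked without a timeout" looks like from the JVM's side, the sketch below blocks a thread on a result that is never signalled and reads its state. It is not the incident's code: CompletableFuture stands in for the reactive subscriber, and ParkedThreadDemo and the thread name are hypothetical.

```java
import java.util.concurrent.CompletableFuture;

// Illustration only: CompletableFuture stands in for the blocked subscriber.
class ParkedThreadDemo {
    static Thread.State stateWhileBlocked() {
        CompletableFuture<String> neverSignalled = new CompletableFuture<>();
        Thread worker = new Thread(() -> {
            try {
                neverSignalled.get(); // untimed wait: nothing bounds how long this parks
            } catch (Exception ignored) {
                // not expected in this demo
            }
        }, "biz-thread-demo");
        worker.setDaemon(true);
        worker.start();

        Thread.State observed = worker.getState();
        try {
            // Poll briefly until the worker has actually parked.
            long deadline = System.currentTimeMillis() + 2000;
            while (observed != Thread.State.WAITING
                    && System.currentTimeMillis() < deadline) {
                Thread.sleep(10);
                observed = worker.getState();
            }
            neverSignalled.complete("released"); // let the demo thread finish
            worker.join(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println(stateWhileBlocked());
    }
}
```

A thread blocked this way reports WAITING, whereas a timed wait would report TIMED_WAITING, which is one way to tell bounded from unbounded waits in a dump.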
Initial Diagnosis
The stack showed that the threads were parked as a result of an uncaught ExceptionInInitializerError. The problematic code was traced to the static initializer of the core class.
Further Findings
Only some of the requests that threw the error returned a response; the others hung indefinitely and logged an error.
Analysis of the LLM‑generated answer highlighted two main issues:
A catch (Exception e) block cannot catch ExceptionInInitializerError, because it is an Error rather than an Exception.
Some reactive pipelines lacked onErrorResume, causing the stream to terminate.
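The first point can be reproduced in a few lines of plain Java. This is a minimal sketch with hypothetical names (BrokenCore, CatchScopeDemo); the failing static field only simulates the core class from the incident.

```java
// Illustration only: BrokenCore and CatchScopeDemo are hypothetical names,
// not classes from the incident.
class BrokenCore {
    // Not a compile-time constant, so reading it triggers class initialization.
    static final String CONFIG = load();

    private static String load() {
        // Stands in for the startup step that failed in the incident.
        throw new IllegalStateException("simulated startup failure");
    }
}

class CatchScopeDemo {
    static String probe() {
        try {
            try {
                return BrokenCore.CONFIG;
            } catch (Exception e) {
                // Not reached here: the failure surfaces as an Error.
                return "caught by catch (Exception)";
            }
        } catch (Throwable t) {
            // The outer handler sees what the inner one let through.
            return "escaped catch (Exception): " + t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(probe());
    }
}
```

The JVM wraps the exception thrown during static initialization in an ExceptionInInitializerError, so the inner handler is bypassed even though the underlying cause is an ordinary RuntimeException.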
The core class failed to load because Spring could not fetch the required XSD over the network, resulting in a java.net.SocketException: Network unreachable.
content: 2025-06-16 19:49:01.867 [2101c6a117501285410754275d2333] [9.1.2.2.1.1.2.1.2.1.3.18] WARN 4754 --- [wrappedProductBoundElasticScheduler-7] o.s.b.f.xml.XmlBeanDefinitionReader : Ignored XML validation warning
org.xml.sax.SAXParseException: schema_reference.4: Unable to read schema document 'http://www.springframework.org/schema/beans/spring-beans-2.5.xsd', because 1) the document could not be found; 2) the document could not be read; 3) the document's root element is not <xsd:schema>.
at ...
Caused by: java.net.SocketException: Network unreachable
at ...
... 78 common frames omitted
Root Cause
The core library uses lazy loading; even if the class fails to initialize, the application still registers itself and appears healthy, but any request that touches the faulty class triggers the error, leading to thread blockage.
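The lazy-loading behaviour can be sketched in plain Java. The names below (LazyCore, LazyInitDemo) are hypothetical and the failing initializer is simulated; the point is only that startup completes without touching the class, and every later request that touches it fails.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: LazyCore simulates a core class whose static
// initialization fails; it is not the library from the incident.
class LazyCore {
    static final long LOADED_AT = init();

    private static long init() {
        throw new IllegalStateException("simulated: dependency unavailable");
    }

    static String handle(String request) {
        return "ok:" + request;
    }
}

class LazyInitDemo {
    static List<String> run() {
        List<String> events = new ArrayList<>();
        // Nothing has touched LazyCore yet, so "startup" looks healthy.
        events.add("startup finished");
        for (int i = 0; i < 2; i++) {
            try {
                // The first use triggers static initialization.
                events.add(LazyCore.handle("r" + i));
            } catch (Throwable t) {
                events.add(t.getClass().getSimpleName());
            }
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On a fresh JVM the first request fails with ExceptionInInitializerError and later ones with NoClassDefFoundError, since the JVM does not retry a failed class initialization. Both are Errors, so a handler that only catches Exception misses every one of them.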
Remediation Steps
Fail fast: move the core package startup into the main method and abort the application if initialization fails.
Catch Throwable instead of only Exception at the outer layer.
Set reasonable timeouts for subscriber blockGet calls.
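The three steps might look roughly like the sketch below. It makes several assumptions: FragileCore simulates the failing core class, CompletableFuture stands in for the reactive stream and its blockGet call (whose real API is not shown in this article), and all method names are hypothetical. In Reactor itself, the bounded equivalent of an untimed block would be Mono.block(Duration).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Illustration only: simulates a core class that fails to initialize.
class FragileCore {
    static final String STATE = init();

    private static String init() {
        throw new IllegalStateException("simulated init failure");
    }
}

class RemediationSketch {
    // Step 1: force initialization at startup instead of on the first request.
    static void requireInitialized(String className) {
        try {
            Class.forName(className, true, RemediationSketch.class.getClassLoader());
        } catch (Throwable t) {
            throw new IllegalStateException("core startup failed: " + className, t);
        }
    }

    // Step 2: catch Throwable so an Error still signals the waiting caller.
    static CompletableFuture<String> handle(Supplier<String> work) {
        CompletableFuture<String> result = new CompletableFuture<>();
        try {
            result.complete(work.get());
        } catch (Throwable t) {
            result.completeExceptionally(t);
        }
        return result;
    }

    // Step 3: never wait without a bound.
    static String awaitWithTimeout(CompletableFuture<String> result, long millis) {
        try {
            return result.get(millis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "fallback: timed out";
        } catch (Throwable t) {
            if (t instanceof InterruptedException) {
                Thread.currentThread().interrupt();
            }
            return "fallback: " + t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        try {
            requireInitialized(FragileCore.class.getName());
        } catch (IllegalStateException e) {
            // A real service would log this and exit non-zero before registering.
            System.out.println("abort startup: " + e.getMessage());
        }
        // A result that is never signalled now costs 100 ms, not a thread.
        System.out.println(awaitWithTimeout(new CompletableFuture<>(), 100));
    }
}
```

Any one of the three would have broken the failure chain on its own. Applying all of them is defence in depth: the timeout still protects the thread pool when a failure mode appears that the startup check did not anticipate.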
Takeaway
Small oversights—missing initialization checks, inadequate exception handling, and unchecked time‑outs—can cascade into large‑scale outages. Proactively validating critical components and handling all throwable errors can prevent similar “air‑disaster” scenarios in software systems.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
