Why Did My Java Service Crash? Uncovering a BouncyCastle Memory Leak and Fixing It

The article walks through a real‑world Java service outage caused by CPU saturation, details a systematic five‑step investigation, reveals a memory leak in BouncyCastleProvider objects within JceSecurity, and explains how converting the provider to a static singleton resolved the issue.

Programmer DD

1. Problem Discovery

CPU usage on the production machines began climbing on April 8 and eventually reached 100%, making the service unavailable. Restarting the service restored it.

2. Investigation Approach

Possible causes were divided into five categories: system code issues, downstream cascade effects, upstream traffic spikes, third‑party HTTP problems, and host problems.

3. Investigation Steps

Checked logs – no concentrated errors, so code logic was ruled out.

Contacted downstream systems – they were normal.

Compared call volume on the service's provider interfaces – no spike, ruling out an upstream traffic surge.

Checked TCP status – normal, ruling out third‑party timeouts.

Monitored six machines – all showed rising CPU, eliminating host failure.

None of these pinpointed the root cause.

4. Solution

Restarted five of the six affected machines to restore service, keeping one for analysis.

Identified the Tomcat process PID (e.g., 384) and inspected its threads.

Found several threads (thread IDs 4430‑4433), each consuming about 40% CPU.

Converted those thread IDs to hexadecimal (114e‑1151), because jstack reports native thread IDs in hex, and dumped the Java stack:

sudo -u tomcat jstack -l 384 > /1.txt

The stack dump showed that the busy threads were GC threads.
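The decimal-to-hex conversion in this step can be sketched in Java as follows. This is only an illustration: the class and method names (ThreadIdHex, toNid) are made up for this example, and the IDs are the ones reported above. jstack prints each thread's native ID as nid=0x..., so the hex form is what you search for in the dump.

```java
// Minimal sketch: convert decimal OS thread IDs into the hex "nid" form
// that appears in jstack output. Class and method names are hypothetical.
final class ThreadIdHex {

    // jstack prints native thread IDs as lowercase hex with a 0x prefix.
    static String toNid(int threadId) {
        return "0x" + Integer.toHexString(threadId);
    }

    public static void main(String[] args) {
        // Thread IDs taken from the article: 4430 through 4433.
        for (int tid = 4430; tid <= 4433; tid++) {
            System.out.println(tid + " -> nid=" + toNid(tid));
        }
    }
}
```

Searching /1.txt for each resulting nid value locates the stack entry of the corresponding busy thread.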

Dumped the heap:

sudo -u tomcat jmap -dump:live,format=b,file=/dump201612271310.dat 384

Analyzed the heap with Eclipse MAT and saw javax.crypto.JceSecurity objects occupying 95% of memory.

Examined the reference tree and found an excessive number of BouncyCastleProvider instances.

5. Code Analysis

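The application creates a new BouncyCastleProvider for every encryption/decryption call, and each instance ends up referenced from a static map inside JceSecurity. Because the map is static, it is always reachable, so the providers it holds can never be garbage‑collected. The heap gradually fills, the GC threads run almost continuously trying to reclaim memory, and that is what drove CPU usage to 100%.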
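The leak pattern can be modelled without BouncyCastle at all. The sketch below is a simplified stand-in, not the actual JceSecurity code: it assumes a static, identity-keyed map, and every name in it (StaticCacheLeakDemo, encryptWith, cachedProviders, reset) is hypothetical. A plain Object plays the role of the provider.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Simplified model of a static cache keyed by object identity.
// This is NOT the JDK's JceSecurity code, only an illustration of the pattern.
final class StaticCacheLeakDemo {

    // Static map: reachable for the lifetime of the class, so its keys are never collected.
    private static final Map<Object, Boolean> VERIFIED = new IdentityHashMap<>();

    // Stand-in for an encryption call that registers the provider it was given.
    static synchronized void encryptWith(Object provider) {
        VERIFIED.put(provider, Boolean.TRUE);
    }

    static synchronized int cachedProviders() {
        return VERIFIED.size();
    }

    static synchronized void reset() {
        VERIFIED.clear();
    }

    public static void main(String[] args) {
        // Leaky usage: a fresh "provider" per call adds one map entry per call.
        for (int i = 0; i < 1000; i++) {
            encryptWith(new Object());
        }
        System.out.println("new instance per call: " + cachedProviders());

        reset();

        // Shared usage: the same instance is reused, so the map holds one entry.
        Object shared = new Object();
        for (int i = 0; i < 1000; i++) {
            encryptWith(shared);
        }
        System.out.println("shared instance: " + cachedProviders());
    }
}
```

With a new instance per call the map grows by one entry for every call, while a shared instance keeps it at a single entry no matter how many calls are made.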

6. Code Fix

Make the provider a static singleton: each class that needs it holds one shared instance and reuses it for every call, so no new provider objects accumulate in JceSecurity.
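One way the static-singleton fix might look is sketched below. The article does not show the actual code, so this is an assumption about its shape: CryptoProviderHolder and StandInProvider are hypothetical names, and the stand-in class exists only so the example compiles without the BouncyCastle jar. In the real service the field would be initialized with new BouncyCastleProvider() instead.

```java
import java.security.Provider;

// Sketch of a static singleton provider holder. Names are hypothetical.
final class CryptoProviderHolder {

    // Stand-in for BouncyCastleProvider so the sketch has no external dependency.
    // The (String, String, String) constructor requires Java 9 or later.
    private static final class StandInProvider extends Provider {
        private static final long serialVersionUID = 1L;

        StandInProvider() {
            super("StandIn", "1.0", "placeholder provider for illustration");
        }
    }

    // Created once during class initialization and shared by every caller.
    private static final Provider PROVIDER = new StandInProvider();

    private CryptoProviderHolder() {
        // No instances: the class only exposes the shared provider.
    }

    static Provider provider() {
        return PROVIDER;
    }

    public static void main(String[] args) {
        Provider first = provider();
        Provider second = provider();
        System.out.println(first == second); // prints true
    }
}
```

Callers would then pass the shared instance to the JCE factory methods, for example Cipher.getInstance(transformation, CryptoProviderHolder.provider()), with whatever transformation the application already uses, instead of constructing a provider on each call.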

7. Takeaways

When facing online incidents, follow a systematic checklist: check logs, CPU, TCP, Java threads (jstack), Java heap (jmap), and use a heap analyzer to locate non‑collectable objects.
