How a Faulty Lazy-Loading Design Caused Thread‑Pool Exhaustion and How to Fix It

This post walks through a production incident in which a flawed lazy-loading mechanism for KMSClient caused repeated initialization. The repeated initialization blocked threads, exhausted the shared thread pool, and triggered RejectedExecutionException alerts. The step-by-step investigation led to a concrete code fix, improved monitoring, and better thread-pool isolation.

Huolala Tech

Scenario Description

Time: 08:24 on a certain morning

Symptom: Massive RejectedExecutionException alerts on node 172.xx.84.113

Recent Operations: No code changes, but service pods were restarted due to resource eviction

Root Cause Summary

Investigation revealed that the problem originated from the lazy-loading implementation of KMSClient in EncryptUtil. The code used a StringBuffer flag (is_init) together with synchronized to ensure a single initialization, but the design breaks under concurrency.

Problem Code

@Slf4j
public class EncryptUtil {
    // ① lazy‑load flag
    private static final StringBuffer is_init = new StringBuffer("");

    /** data encryption */
    public static String encrypt(String plaintext) {
        try {
            if (StringUtils.isEmpty(plaintext)) {
                return null;
            }
            // ② check lazy‑load flag
            if (!"1".equals(is_init.toString())) { // 1 initialization entry
                init();
            }
            return Util.encryptDataForVersion(plaintext, "logic_sharding", "v1");
        } catch (Exception e) {
            log.error("data encryption failed", e);
            return null;
        }
    }

    /** data decryption */
    public static String decrypt(String cipherText) { ... }

    /** client initialization */
    private static void init() {
        // ③ synchronized block
        synchronized (EncryptUtil.class) {
            KMSClient.initSecurity(Arrays.asList("logic_sharding"));
            is_init.append("1");
        }
    }
}

The is_init flag starts as an empty string. When multiple threads call EncryptUtil.encrypt concurrently right after startup, each sees the empty flag, passes the if check, and calls init(). Only one thread acquires the monitor lock; the others block. Because init() does not re-check the flag inside the synchronized block, every blocked thread also runs the initialization once it gets the lock and appends another "1", so the flag becomes "11", "111", and so on. From then on "1".equals(is_init.toString()) is never true again, so every request re-enters the synchronized block and re-initializes the heavy KMSClient (a 2-3 s network call). This creates a cascade of blocked threads that quickly exhausts the shared thread pool.
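
The flag corruption described above can be reproduced in isolation. The following is a minimal sketch, not the production code: BrokenFlagDemo, needsInit, raceAtStartup, and initCalls are hypothetical names, and a counter stands in for the real KMSClient.initSecurity call. A latch forces every caller to pass the check before any of them initializes, which is the startup race in its worst form.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for EncryptUtil; only the flag logic is reproduced. */
class BrokenFlagDemo {
    private final StringBuffer isInit = new StringBuffer("");
    // Counts simulated initializations (stand-in for KMSClient.initSecurity).
    final AtomicInteger initCalls = new AtomicInteger();

    /** Same check as the original encrypt(): true means "go and initialize". */
    boolean needsInit() {
        return !"1".equals(isInit.toString());
    }

    /** Same shape as the original init(): lock, initialize, append, no re-check. */
    void init() {
        synchronized (this) {
            initCalls.incrementAndGet();
            isInit.append("1");
        }
    }

    /** Lets every caller pass the check before any of them initializes. */
    String raceAtStartup(int threads) {
        CountDownLatch allChecked = new CountDownLatch(threads);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                if (!needsInit()) {
                    return;
                }
                allChecked.countDown();
                try {
                    allChecked.await(); // wait until every caller has seen the empty flag
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                init();
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return isInit.toString();
    }

    public static void main(String[] args) {
        BrokenFlagDemo demo = new BrokenFlagDemo();
        System.out.println("flag after startup race: " + demo.raceAtStartup(3));
        System.out.println("later calls still re-initialize: " + demo.needsInit());
    }
}
```

With three callers the flag ends up as "111", and needsInit() keeps returning true afterwards, so the check can no longer short-circuit.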

Evidence

Spike in RejectedExecutionException (AbortPolicy) on a single pod.

Corresponding rise in threads in BLOCKED state, indicating monitor lock contention.

Code sections using synchronized were inspected; the culprit was the EncryptUtil lazy‑load block.

Further analysis showed that every business flow involving encryption (rule fetching, order receipt, order update, etc.) suffered the 2‑3 s delay caused by repeated KMSClient initialization.

Timeline showed the issue started when a new pod was launched after host eviction; early requests triggered the concurrent initialization.
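
The BLOCKED-thread evidence above came from the team's monitoring. As a rough illustration of the same signal, the sketch below counts live JVM threads by state using only the standard Thread API. ThreadStateSnapshot and its methods are hypothetical names, and this is not the tooling used in the incident.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical helper that summarizes thread states, similar to reading a thread dump. */
class ThreadStateSnapshot {
    /** Counts live JVM threads by their current state. */
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            counts.merge(thread.getState(), 1, Integer::sum);
        }
        return counts;
    }

    /** Threads waiting to enter a synchronized block or method. */
    static int blockedCount() {
        return countByState().getOrDefault(Thread.State.BLOCKED, 0);
    }

    public static void main(String[] args) {
        System.out.println(countByState());
    }
}
```

A sustained rise in blockedCount() alongside rejections points to monitor contention rather than CPU saturation.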

Proof of Concept

High-concurrency tests confirmed that repeatedly initializing KMSClient quickly fills the thread pool's task queue, reproducing the production outage.
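
The original test harness is not shown, so the following is only a sketch of how such a reproduction might look. PoolExhaustionDemo and submitSlowTasks are hypothetical names, the pool sizes are arbitrary, and a latch held inside a synchronized block stands in for the slow KMSClient initialization.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical reproduction: slow work under one lock saturates a bounded pool. */
class PoolExhaustionDemo {
    private static final Object LOCK = new Object();

    /** Submits tasks that each "re-initialize" under one lock; returns the rejection count. */
    static int submitSlowTasks(int poolSize, int queueSize, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize),
                new ThreadPoolExecutor.AbortPolicy());
        // Keeps the simulated initialization "in progress" until all tasks are submitted.
        CountDownLatch initFinished = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    pool.execute(() -> {
                        synchronized (LOCK) { // one worker holds the lock, the others are BLOCKED
                            awaitQuietly(initFinished);
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // workers and queue are both full
                }
            }
        } finally {
            initFinished.countDown();
            pool.shutdown();
        }
        return rejected;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + submitSlowTasks(2, 2, 10));
    }
}
```

With 2 workers and a queue of 2, only 4 of the 10 tasks are accepted while the simulated initialization is in progress; the other 6 are rejected by AbortPolicy.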

Improvements

Code Fix

Replace the StringBuffer flag with an AtomicBoolean and add double-checked locking.

@Slf4j
public class EncryptUtil {
    /** initialization flag */
    private static final AtomicBoolean is_init = new AtomicBoolean(false);

    /** data encryption */
    public static String encrypt(String plaintext) {
        try {
            if (StringUtils.isEmpty(plaintext)) {
                return null;
            }
            // initialize client if not done yet
            init();
            return AESUtil.encryptDataForVersion(plaintext, "logic_sharding", "v1");
        } catch (Exception e) {
            log.error("data encryption failed", e);
            return null;
        }
    }

    /** data decryption */
    public static String decrypt(String cipherText) { ... }

    /** client initialization */
    private static void init() {
        if (is_init.get()) { return; }
        synchronized (EncryptUtil.class) {
            if (is_init.get()) { return; }
            log.info("encryption/decryption client initialization begin");
            KMSClient.initSecurity(Arrays.asList("logic_sharding"));
            log.info("encryption/decryption client initialization end");
            is_init.set(true);
        }
    }
}
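
To check that the double-checked pattern above initializes exactly once, a small concurrency test can release many callers at the same moment and count initializations. The sketch below assumes the same flag logic as the fix. OnceInitDemo, callConcurrently, and initCalls are hypothetical names, and a counter replaces the real KMSClient.initSecurity call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the fixed EncryptUtil; only the flag logic is reproduced. */
class OnceInitDemo {
    private final AtomicBoolean isInit = new AtomicBoolean(false);
    // Counts simulated initializations (stand-in for KMSClient.initSecurity).
    final AtomicInteger initCalls = new AtomicInteger();

    /** Double-checked locking: cheap check first, then re-check under the lock. */
    void init() {
        if (isInit.get()) {
            return;
        }
        synchronized (this) {
            if (isInit.get()) {
                return; // another thread finished while this one waited for the lock
            }
            initCalls.incrementAndGet();
            isInit.set(true);
        }
    }

    /** Releases all callers at once and returns the total number of initializations so far. */
    int callConcurrently(int threads) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                init();
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return initCalls.get();
    }

    public static void main(String[] args) {
        System.out.println("init calls: " + new OnceInitDemo().callConcurrently(16));
    }
}
```

However many threads race, the second check inside the lock keeps the count at 1, and later calls return at the first check without touching the lock.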

Monitoring & Alerts

Enable dynamic thread‑pool alerts and configure alert recipients to catch queue‑full situations early.
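
The article does not say which dynamic thread-pool framework provides these alerts. As a framework-independent sketch, queue pressure can be read directly from a ThreadPoolExecutor. PoolQueueMonitor, queueUsage, and shouldAlert are hypothetical names, and the threshold is an assumption to be tuned per service.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;

/** Hypothetical helper that reports how full a pool's task queue is. */
class PoolQueueMonitor {
    /** Approximate fraction of the queue in use, between 0.0 and 1.0. */
    static double queueUsage(ThreadPoolExecutor pool) {
        BlockingQueue<Runnable> queue = pool.getQueue();
        int used = queue.size();
        // size() and remainingCapacity() are read separately, so this is a snapshot estimate.
        long capacity = (long) used + queue.remainingCapacity();
        if (capacity == 0) {
            return 0.0; // zero-capacity queue such as SynchronousQueue: nothing to fill
        }
        return (double) used / capacity;
    }

    /** True when queue usage has reached the given threshold, for example 0.7. */
    static boolean shouldAlert(ThreadPoolExecutor pool, double threshold) {
        return queueUsage(pool) >= threshold;
    }
}
```

A scheduled job could call shouldAlert for each pool and notify the configured recipients before the queue fills and rejections start.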

Incident Response Process

Three-stage approach: fast detection, precise root-cause localization, and stable recovery. Use fine-grained monitoring, phone/instant-messaging alerts, and a dedicated on-call rotation.

Thread‑Pool Isolation

Separate thread pools per business scenario (add, modify, cancel, fulfil) to prevent a single scenario from exhausting the global pool.
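
One possible shape for this isolation is a separate bounded executor per scenario, sketched below. ScenarioPools, Scenario, and trySubmit are hypothetical names, and the pool and queue sizes are placeholders rather than the values used in production.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical registry holding one bounded pool per business scenario. */
class ScenarioPools {
    enum Scenario { ADD, MODIFY, CANCEL, FULFIL }

    private final Map<Scenario, ThreadPoolExecutor> pools = new EnumMap<>(Scenario.class);

    ScenarioPools(int threadsPerPool, int queuePerPool) {
        for (Scenario scenario : Scenario.values()) {
            pools.put(scenario, new ThreadPoolExecutor(
                    threadsPerPool, threadsPerPool, 0L, TimeUnit.MILLISECONDS,
                    new ArrayBlockingQueue<>(queuePerPool),
                    new ThreadPoolExecutor.AbortPolicy()));
        }
    }

    /** Returns false instead of throwing when this scenario's own pool is saturated. */
    boolean trySubmit(Scenario scenario, Runnable task) {
        try {
            pools.get(scenario).execute(task);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    /** Stops accepting new tasks; already accepted tasks still run. */
    void shutdown() {
        pools.values().forEach(ThreadPoolExecutor::shutdown);
    }
}
```

If the ADD pool is stuck behind a slow dependency, trySubmit(Scenario.ADD, ...) starts returning false while CANCEL requests are still accepted by their own pool.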

Figure: RejectedExecutionException spike
Figure: Blocked threads
Figure: Thread pool usage