Why Your Hystrix Semaphore Limits Fail: Hidden JDK GC Bug and Correct Rate Calculations

This article explains how miscalculating Hystrix semaphore quotas, combined with a JDK GCLocker‑initiated GC bug, can cause unexpected request rejections, and provides a proper method for computing concurrency limits and alternative buffering strategies such as Java semaphores and thread pools.

Java Interview Crash Guide

Problem

We previously used Hystrix semaphore-based rate limiting and sized the permit count from peak QPS (1000) and average response time (15 ms). The reasoning was that a single permit can serve 1000 ms / 15 ms ≈ 66 requests per second, so 1000 / 66 ≈ 15 permits should be enough for 1000 QPS.

However, error logs showed HystrixRuntimeException ... REJECTED_SEMAPHORE_EXECUTION even after we raised the permit count to 50, so an undersized quota alone could not explain the rejections.

Investigation Steps

1. Checked the average request latency (≈ 17 ms) to rule out slow or blocked calls.

2. Inspected the Hystrix code to confirm that permits are always released; a leak would gradually exhaust them.

3. Wrote a small demo to reproduce the issue; the problem appeared only during early service startup, before initialization completed.

JDK Bug Discovery

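GC logs showed two young GC (YGC) events only 0.000187 s apart, together with a long pause of about 160 ms. The log tagged the collection as GCLocker Initiated GC. This cause appears when a collection is requested while a thread is inside a JNI critical section: the JVM defers the GC so that objects are not moved underneath the native code, then triggers it as soon as the last thread leaves the critical section.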

The pause was caused by JDK bug JDK-8048556 ("Unnecessary GCLocker-initiated young GCs"), a known issue in the JDK version we were using.

Correct Rate‑Limiting Calculation

The proper formula is:

Concurrency (permits) / average request time (s) > QPS (requests/s)

With a 160 ms GC pause, only 840 ms of each second is available for serving requests. By the formula, the configured permit count was still sufficient on average. The problem is that requests which pile up during the pause arrive as a burst once it ends, and that burst exceeded the available permits.
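The sizing rule can be checked with a small calculation. What follows is a minimal sketch, not the tooling used in the investigation: the class name PermitSizing and the method minPermits are hypothetical, latency is taken in whole milliseconds to avoid floating-point rounding, and the boundary case (capacity exactly equal to QPS) is treated as sufficient, in line with the 15-permit estimate from the Problem section. It models only the average-rate requirement, not the burst that follows a pause.

```java
class PermitSizing {

    /**
     * Smallest permit count whose average capacity covers the given QPS.
     *
     * capacity = permits / avgLatency * (availableMillisPerSecond / 1000)
     * Solving capacity >= qps gives permits >= qps * avgLatencyMillis / availableMillisPerSecond.
     *
     * availableMillisPerSecond is an assumption of this sketch: 1000 with no pause,
     * 840 when a 160 ms GC pause is charged against one second.
     */
    static long minPermits(long qps, long avgLatencyMillis, long availableMillisPerSecond) {
        long numerator = qps * avgLatencyMillis;
        // Integer ceiling division.
        return (numerator + availableMillisPerSecond - 1) / availableMillisPerSecond;
    }

    public static void main(String[] args) {
        // 1000 QPS at 15 ms with no pause.
        System.out.println(minPermits(1000, 15, 1000)); // prints 15
        // Same load when only 840 ms of the second is usable.
        System.out.println(minPermits(1000, 15, 840));  // prints 18
    }
}
```

Both results sit well below 50, which is consistent with the observation that the average-rate formula alone does not explain the rejections.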

Buffering Strategies

Two main approaches can handle bursts during GC:

Use java.util.concurrent.Semaphore with tryAcquire() (non‑blocking) or acquire() (blocking) to control concurrency more precisely.

Use a thread pool with a larger maximumPoolSize or a BlockingQueue, optionally adding a RejectedExecutionHandler for overflow.
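One way to use java.util.concurrent.Semaphore as a buffer is to wait briefly for a permit instead of rejecting at once. The sketch below assumes a hypothetical BufferedLimiter wrapper and an illustrative wait budget; the only library call it relies on beyond acquire/release is the standard timed variant tryAcquire(long, TimeUnit).

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical wrapper, for illustration only.
class BufferedLimiter {
    private final Semaphore permits;
    private final long maxWaitMillis;

    BufferedLimiter(int maxConcurrent, long maxWaitMillis) {
        this.permits = new Semaphore(maxConcurrent);
        this.maxWaitMillis = maxWaitMillis;
    }

    /** Runs action if a permit frees up within maxWaitMillis, otherwise runs fallback. */
    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        boolean acquired;
        try {
            // Waits up to maxWaitMillis, so a short burst can drain instead of being rejected.
            acquired = permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            acquired = false;
        }
        if (!acquired) {
            return fallback.get();
        }
        try {
            return action.get();
        } finally {
            permits.release(); // always return the permit, even if action throws
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        BufferedLimiter limiter = new BufferedLimiter(50, 20);
        System.out.println(limiter.call(() -> "served", () -> "rejected"));
    }
}
```

The wait budget trades latency for fewer rejections: a value of 0 behaves like a plain non-blocking tryAcquire(), while a larger value absorbs longer bursts at the cost of holding caller threads.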

Thread pools offer more flexibility but introduce context‑switch overhead and slower scaling during spikes.
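The thread-pool approach can be sketched with the standard ThreadPoolExecutor. The class name BurstPool, the pool sizes, and the counting handler are assumptions made for illustration; a real handler might log, return a fallback, or shed load differently.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder for a pool that buffers bursts in a bounded queue.
class BurstPool {
    final AtomicInteger rejected = new AtomicInteger();
    final ThreadPoolExecutor executor;

    BurstPool(int coreThreads, int maxThreads, int queueCapacity) {
        // Called only when both the queue and the pool are full.
        RejectedExecutionHandler countOverflow =
                (task, pool) -> rejected.incrementAndGet();
        this.executor = new ThreadPoolExecutor(
                coreThreads,
                maxThreads,
                60, TimeUnit.SECONDS,                    // idle timeout for threads above core
                new ArrayBlockingQueue<>(queueCapacity), // bounded buffer for bursts
                countOverflow);
    }

    public static void main(String[] args) throws InterruptedException {
        BurstPool pool = new BurstPool(2, 4, 8);
        // Simulate a burst of 20 short tasks arriving at once.
        for (int i = 0; i < 20; i++) {
            pool.executor.execute(() -> {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.executor.shutdown();
        pool.executor.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("rejected tasks: " + pool.rejected.get());
    }
}
```

Note that ThreadPoolExecutor fills the queue before it grows beyond the core size, so extra threads start only once the queue is full. This is one reason a pool can react more slowly to a spike than a semaphore.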

Conclusion

We resolved a long-standing hidden issue by recognizing the impact of GC pauses on semaphore-based rate limiting and raising the semaphore count to at least 95. Fixing the JDK bug alone would not remove the need for proper concurrency sizing.

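For reference, the snippet below appears to be the non-blocking tryAcquire() of Hystrix's semaphore implementation. It never waits: once the in-flight count exceeds the configured permits, the call is rejected immediately, which is why a burst after a GC pause surfaces as REJECTED_SEMAPHORE_EXECUTION.

@Override
public boolean tryAcquire() {
    // Optimistically claim a slot; count tracks in-flight executions.
    int currentCount = count.incrementAndGet();
    if (currentCount > numberOfPermits.get()) {
        // Over the limit: give the slot back and reject without waiting.
        count.decrementAndGet();
        return false;
    } else {
        return true;
    }
}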
Written by

Java Interview Crash Guide
