
Improving Service Startup Latency with a Warmup Mechanism for Dubbo Services

To eliminate the brief latency spike that occurs when new Dubbo services start, the authors introduce a @Warmup annotation and early‑startup listeners that generate mock traffic and trigger a forced GC before publishing, cutting initial response times from seconds to tens of milliseconds and improving reliability for latency‑sensitive applications.

Didi Tech

When a new version of a service is released, the interfaces often experience a spike in latency for a short period after startup, as shown in the chart where the newly deployed machines have higher response times.

This latency burst can be critical in sensitive scenarios such as real‑time fraud detection. Attackers may exploit the brief window of increased timeout probability to launch malicious actions, for example by targeting a marketing coupon‑grab activity that coincides with a service deployment.

The root cause is that the service is still initializing resources (JIT compilation, connection pools, caches, etc.) during the warm‑up phase, causing the downstream gateway to receive delayed or timed‑out responses.

Problem Analysis

The latency jitter after startup is usually attributed to class loading, object creation, GC, connection‑pool initialization, cache loading, JIT compilation, etc. Many components use lazy loading, which defers resource allocation until traffic arrives. While the JVM performs optimizations (C2 compilation, dynamic memory adjustments), these actions consume CPU and may cause pauses.

The existing Dubbo framework (v2.7.9) provides a warm‑up mechanism based on real traffic weight. The logic is:

Consumer reads the provider’s weight (default 100) from the URL.

Consumer reads the provider’s start‑up timestamp and calculates uptime in milliseconds.

Consumer reads the configured warm‑up period (default 10 minutes).

Weight is computed as (uptime / warmupPeriod) × configured weight, floored at 1 and capped at the configured weight.
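The four steps above boil down to a single formula. The following is a minimal sketch mirroring the weight ramp in Dubbo 2.7.9's AbstractLoadBalance.calculateWarmupWeight (the standalone class name here is ours, for illustration):

```java
/** Sketch of Dubbo 2.7.9's warm-up weight ramp; standalone class for illustration. */
public class WarmupWeight {
    // uptime and warmup are in milliseconds; weight is the configured provider weight (default 100)
    public static int calculateWarmupWeight(int uptime, int warmup, int weight) {
        // scale the configured weight linearly with uptime: (uptime / warmup) * weight
        int ww = (int) (uptime / ((float) warmup / weight));
        // floor at 1 so the new instance still receives some traffic, cap at the configured weight
        return ww < 1 ? 1 : Math.min(ww, weight);
    }
}
```

With the defaults (weight 100, 10-minute warm-up), an instance that has been up for 3 minutes is routed roughly 30% of its steady-state share.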

This linear increase works for most backend services, but for latency‑sensitive scenarios like fraud detection, the initial “slow” period can cause timeouts and financial loss.

Solution Idea

Instead of relying on real traffic, we can mock a load before exposing the Dubbo port, eliminating the impact on genuine requests.

Provide a unified annotation and listener in a common component to keep intrusion low.

Configure warm‑up intensity and duration via annotation parameters, making it a reusable capability for all micro‑services.

Implementation Details

1. Introduce a new annotation @Warmup in the internal component april that can be placed on interface methods. Attributes:

concurrent: number of parallel threads (default 1).

targetLatency: desired maximum latency in ms; warm-up stops once it is reached.

maxCounts: maximum warm-up executions (default 100).

mockArgs: arguments used for mock calls (should be test data).

clusters: target cluster environment; if set, warm-up runs only in that cluster.
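The attributes above can be sketched as a plain Java annotation. The attribute names follow the article; the attribute types and the targetLatency default are assumptions, since the original april component is internal:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Sketch of the @Warmup annotation; types and the targetLatency default are assumed. */
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface Warmup {
    int concurrent() default 1;        // parallel warm-up threads
    long targetLatency() default 50L;  // stop once a call completes within this many ms (assumed default)
    int maxCounts() default 100;       // upper bound on mock invocations
    String[] mockArgs() default {};    // serialized test arguments for the mock calls (assumed type)
    String[] clusters() default {};    // if set, warm up only in these clusters (assumed type)
}
```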

2. Add a Spring WarmupApplicationListener with priority LOWEST_PRECEDENCE - 2, which runs before Dubbo's DubboBootstrapApplicationListener, ensuring warm-up occurs before the service is published.

3. Execution flow:

During bean post‑processing, a custom PeacefulTargetPostProcessor scans for @Warmup methods, wraps them into WarmTarget objects, and stores them in a list.

After the Spring context is refreshed, WarmupApplicationListener iterates the list, creates a temporary thread pool based on concurrent, and invokes the target methods via reflection using mockArgs.

The loop stops when either targetLatency or maxCounts is reached.

After all warm‑up tasks finish, the temporary thread pool is shut down, then Dubbo’s startup listener publishes the service.
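The execution flow above can be sketched in plain Java. Class and method names here are ours, not Didi's actual implementation; the sketch shows the core loop (temporary pool, reflective mock calls, the two stop conditions, pool teardown):

```java
import java.lang.reflect.Method;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of the warm-up loop: invoke the @Warmup target via reflection from a
 *  temporary pool until targetLatency or maxCounts is reached, then shut down. */
public class WarmupRunner {

    public static int warmUp(Object target, Method method, Object[] mockArgs,
                             int concurrent, long targetLatencyMs, int maxCounts) {
        ExecutorService pool = Executors.newFixedThreadPool(concurrent); // temporary pool
        AtomicInteger calls = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(concurrent);
        for (int i = 0; i < concurrent; i++) {
            pool.execute(() -> {
                try {
                    while (calls.getAndIncrement() < maxCounts) {        // stop condition 1: maxCounts
                        long start = System.nanoTime();
                        method.invoke(target, mockArgs);                 // mock invocation
                        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                        if (elapsedMs <= targetLatencyMs) break;         // stop condition 2: warm enough
                    }
                } catch (Exception ignored) {
                    // a failing mock call must never block service startup
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            done.await();             // wait for every warm-up worker to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();              // tear the temporary pool down before Dubbo publishes
        return calls.get();
    }
}
```

In the real component this loop runs from the early listener, so it completes before DubboBootstrapApplicationListener exposes the port.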

4. To mitigate GC overhead from the warm-up objects, a forced full GC (System.gc()) is triggered after warm-up but before exposing Dubbo.
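A minimal sketch of this step (class and method names are ours): request a collection so the warm-up garbage is reclaimed before real traffic arrives, keeping in mind that System.gc() is only a hint and is a no-op under -XX:+DisableExplicitGC.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Sketch of the post-warm-up forced collection; illustrative names. */
public class PostWarmupGc {
    public static long forceGcAndCountCollections() {
        System.gc(); // ask for a full collection now, rather than paying for it mid-request later
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += gc.getCollectionCount(); // collections observed so far, summed across collectors
        }
        return total;
    }
}
```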

Effect Verification

Using a data‑feature service as a test case, the average response time dropped from ~2.7 s (no warm‑up) to ~330 ms after warm‑up, and further to ~180 ms after the forced GC.

Additional optimization involved warming up the Dubbo SPI and filter chain itself. A new listener DubboWarmupListener repeatedly calls a mock service before Dubbo is exposed, enabled via @EnableDubboWarmup . This reduced the startup latency to ~36 ms.

While still higher than the steady‑state 10 ms, 36 ms is acceptable for most business scenarios.

Further Thoughts

Observations show that warm‑up can trigger Young GC spikes due to the temporary objects created. Adding a forced full GC helps, but deeper investigation into JVM tuning and pre‑loading of Dubbo’s internal pipelines may yield further gains.

The warm‑up component has been added to the public development library of the Didi Transaction Security team, and contributors are invited to discuss and help open‑source it.

Written by Didi Tech, the official Didi technology account.