How Adaptive K‑Value Backoff Locks Boost RocketMQ Performance by Up to 38%

A paper accepted at a CCF‑A conference shows that an adaptive K‑value backoff lock, derived from queueing theory and implemented in Apache RocketMQ, can replace both the spin lock and the mutex lock. It reports performance gains of up to 37.58% on x86 CPUs and 32.82% on ARM, together with lower CPU usage.

Alibaba Cloud Native

Paper Overview

The paper, titled "Beyond the Bottleneck: Enhancing High‑Concurrency Systems with Lock Tuning", was accepted to FM 2024, a CCF‑A conference. The authors (Ji Juntao, Gu Yinyou, Fu Yubao, Lin Qingshan) present a lock‑tuning technique that grew out of work on optimizing RocketMQ performance on Alibaba Cloud CPUs.

Problem Statement

RocketMQ historically employed two types of locks during message sending: a spin lock and a mutex lock. Different CPUs exhibit distinct optimal lock behaviours; a mismatched lock can cause severe performance degradation and unnecessary resource consumption.

Proposed Adaptive K‑Value Backoff Lock

The authors model spin‑lock behaviour using queueing theory, establishing a relationship between the spin count K and system load P. The expected lock acquisition time consists of two components:

T_s: expected time spent spinning

T_c: expected time spent in a context switch

By substituting the expressions for T_s and T_c as functions of K and P, they derive a formula for the overall expected lock time. The formula appears as an image in the original article and is not reproduced here.
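As a notational sketch of the two-component decomposition described above, and assuming the components simply add (the paper's own formula is not available in this text), the expected lock time and the tuning goal could be written as follows. The closed forms of T_s and T_c are deliberately left unspecified, since the article does not give them, and K^* is a symbol introduced here for illustration.

```latex
\[
  \mathbb{E}\left[T_{\text{lock}}\right](K, P) \;=\; T_s(K, P) \;+\; T_c(K, P),
  \qquad
  K^{*}(P) \;=\; \arg\min_{K \ge 1} \; \mathbb{E}\left[T_{\text{lock}}\right](K, P)
\]
```

Under this reading, a larger K raises the spinning term and lowers the context-switch term, so the best K for a given load P is where the sum is smallest.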

The adaptive lock works as follows: a thread spins on the lock, and after K attempts without acquiring it, the thread invokes Thread.yield(), handing the CPU back to the operating system. When the wait is short, the lock is acquired while spinning and no context switch is needed; when the wait is long, the cap of K attempts bounds the CPU time wasted on spinning.
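The spin-then-yield behaviour described above can be sketched in Java roughly as follows. This is a minimal illustration, not RocketMQ's actual implementation: the class name KBackoffSpinLock, the method runDemo, and the use of an AtomicBoolean as the lock word are assumptions made for the example, and K is fixed at construction time rather than adapted at runtime.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Illustrative K-value backoff spin lock (hypothetical class, not RocketMQ code).
 * A thread retries a compare-and-set up to k times, then yields the CPU and starts over.
 */
public class KBackoffSpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);
    private final int k; // failed spin attempts allowed before yielding

    public KBackoffSpinLock(int k) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be >= 1");
        }
        this.k = k;
    }

    public void lock() {
        int spins = 0;
        // Each failed compareAndSet counts as one spin attempt.
        while (!locked.compareAndSet(false, true)) {
            if (++spins >= k) {
                Thread.yield(); // hand the CPU back to the scheduler
                spins = 0;      // begin a fresh round of k attempts
            }
        }
    }

    public void unlock() {
        locked.set(false);
    }

    /** Demo helper: several threads increment a shared counter under the lock. */
    public static long runDemo(int threads, int incrementsPerThread, int k) {
        KBackoffSpinLock lock = new KBackoffSpinLock(k);
        long[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        // K = 1000 mirrors the K = 10^3 setting reported in the evaluation.
        System.out.println(runDemo(4, 100_000, 1000)); // prints 400000
    }
}
```

With k = 1 the lock yields after every failed attempt, and with a very large k it behaves like a plain spin lock, so a single parameter covers the range between the two extremes.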

Experimental Evaluation

Tests were conducted on both x86 and ARM CPUs using Apache RocketMQ with synchronous disk flushing. The key findings include:

When K = 10^3, the system reaches its peak throughput (TPS) of 155,019.20 on x86, while CPU utilization drops to its minimum.

Performance improvements of 37.58% on x86 and 32.82% on ARM were observed compared with the original lock implementation.

CPU usage decreased from over 1000% to around 750% at the optimal K value, a reduction of roughly a quarter.

Additional measurements of broker resource consumption showed that the K value yielding maximum TPS also corresponded to the lowest CPU usage.

Conclusion

The adaptive K‑value backoff lock provides a single, self‑tuning lock that achieves optimal performance across varying contention levels, reduces CPU waste, and simplifies deployment for high‑concurrency systems such as RocketMQ. The approach is validated on multiple CPU architectures and I/O strategies, demonstrating its broad applicability.

Paper Details

Title: Beyond the Bottleneck: Enhancing High‑Concurrency Systems with Lock Tuning

Authors: Ji Juntao, Gu Yinyou, Fu Yubao, Lin Qingshan

Abstract: High‑concurrency systems often hit performance bottlenecks due to intense lock contention, leading to waiting and costly context switches. By refining a lightweight spin lock and introducing a concise parameter‑tuning strategy, the authors achieve up to 37.58% (x86) and 32.82% (ARM) throughput gains in Apache RocketMQ, while maintaining low resource overhead across code versions and I/O flush modes.
