Using async-profiler to Optimize CPU Usage in a Dynamic QPS Test Case
The article details how the author used async-profiler to analyze a Java dynamic QPS test case, identified a CPU hotspot in a time‑checking method, replaced it with a timestamp check, and achieved a modest 0.1% reduction in overall CPU usage, illustrated with flame‑graph images and code snippets.
I have long heard about the powerful capabilities of CPU flame graphs and various flame‑graph tools, and today I finally started trying a CPU flame‑graph generation tool.
Unfortunately, the flame‑graph plugin bundled with IntelliJ could not be used for various reasons, so I turned to the async‑profiler analysis tool as a replacement.
While testing random-number performance, I used a dynamic QPS model case to learn async-profiler and unexpectedly discovered an optimization opportunity that reduced CPU usage by 0.1%, my first concrete result with the tool.
async-profiler
The installation and usage guide for this tool can be found online; I recommend checking the GitHub repository Wiki for details.
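For orientation, a typical command-line invocation looks like the following. This is an illustrative sketch, not the exact commands used in this article; it assumes async-profiler is unpacked locally and uses a placeholder PID.

```shell
# List running JVMs to find the target process id.
jps

# Profile CPU for 30 seconds and write an HTML flame graph
# (profiler.sh ships with async-profiler; 12345 is a placeholder PID).
./profiler.sh -e cpu -d 30 -f flame.html 12345
```

The resulting flame.html can be opened in any browser; wide frames near the top of a stack are the hotspots worth inspecting first.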
Case code
Below is the case code that uses a dynamic QPS model.
    import java.util.concurrent.atomic.AtomicInteger

    class T extends SourceCode {

        static void main(String[] args) {
            def total = 1000_0000
            def index = new AtomicInteger()
            int i = 0
            def test = {
                i++ % total
                // index.getAndIncrement() % total
                getRandomInt(total)
                sleep(0.01)
            }
            new FunQpsConcurrent(test, "test random performance").start()
        }

    }

The method that executes the task is com.okcoin.hickwall.presses.funtester.frame.execute.FunQpsConcurrent#start; its code is shown below:
    void start() {
        if (executor == null) executor = ThreadPoolUtil.createCachePool(Constant.THREADPOOL_MAX, "Q")
        if (Common.PERF_PLATFORM) controller = new RedisController(this)
        if (controller == null) controller = new FunTester()
        new Thread(controller, "receiver").start()
        while (key) {
            ThreadPoolUtil.executeTask(executor, qps, produce, total, name)
        }
        stop()
    }

Optimization Process
The entire main thread spends most of its time in the while loop. I first generated a flame graph of the main thread, shown below:
The pre-optimization flame graph shows that com.okcoin.hickwall.presses.funtester.frame.execute.ThreadPoolUtil#executeTask consumes 0.53% CPU, and that within it the getSecond method takes the largest share because it creates a Calendar object on every call. The relevant code is:
    if (Time.getSecond() % COUNT_INTERVAL == 0) {
        int real = total.sumThenReset() / COUNT_INTERVAL as int
        def active = executor.getActiveCount()
        def count = active == 0 ? 1 : active
        log.info("{} design QPS:{},actual QPS:{} active thread:{} per thread efficiency:{}",
                name, qps, real, active, real / count as int)
    }

The original intention was to log the design QPS, actual QPS, and active thread count every few seconds. I suspected a raw timestamp check would be cheaper, so I replaced the code as follows:
    if (SourceCode.getMark() % COUNT_INTERVAL == 0) {
        int real = total.sumThenReset() / COUNT_INTERVAL as int
        def active = executor.getActiveCount()
        def count = active == 0 ? 1 : active
        log.info("{} design QPS:{},actual QPS:{} active thread:{} per thread efficiency:{}",
                name, qps, real, active, real / count as int)
    }

After rebuilding and running the test, I captured another flame graph of the main thread, shown below:
Post‑optimization, the CPU usage of com.okcoin.hickwall.presses.funtester.frame.execute.ThreadPoolUtil#executeTask dropped to 0.29%, while the com.okcoin.hickwall.presses.funtester.frame.execute.FunQpsConcurrent#start method now consumes 0.44% CPU, a reduction of 0.09% compared with the original 0.53%.
Rounded, the overall improvement is about 0.1%, which I consider a successful optimization. I also noticed that most of the remaining CPU time is spent in the sleep method, suggesting that the earlier conclusions about random‑number performance may need revisiting.
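To see why the swap helps, here is a minimal, self-contained Java sketch. The class and method names are my own stand-ins, not the FunTester sources: calendarSecond mirrors what a Calendar-based Time.getSecond plausibly does, while timestampSecond mirrors a raw-timestamp check in the spirit of SourceCode.getMark, using plain arithmetic with no object allocation.

```java
import java.util.Calendar;
import java.util.TimeZone;

public class SecondGate {
    // Calendar-based second-of-minute: allocates and populates a Calendar
    // on every call (an assumed sketch of the original Time.getSecond()).
    static int calendarSecond(long epochMillis) {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        cal.setTimeInMillis(epochMillis);
        return cal.get(Calendar.SECOND);
    }

    // Timestamp-based second-of-minute: pure integer arithmetic,
    // no allocation (a stand-in for a SourceCode.getMark()-style check).
    static long timestampSecond(long epochMillis) {
        return (epochMillis / 1000) % 60;
    }

    public static void main(String[] args) {
        long millis = 1_000_000_000_000L; // a fixed instant, for a deterministic demo
        System.out.println(calendarSecond(millis));  // prints 40
        System.out.println(timestampSecond(millis)); // prints 40
        // Either value gates the periodic log line the same way:
        int COUNT_INTERVAL = 5;
        System.out.println(timestampSecond(millis) % COUNT_INTERVAL == 0); // prints true
    }
}
```

Both forms fire on the same seconds, so the logging behavior is unchanged while the per-iteration cost of the check in the hot while loop drops.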
-- By FunTester