
Boost Cloud Application Speed by 36% Using Baidu’s Btune Performance Diagnostic Tool

Unexpected performance regressions can occur after workloads are migrated to a new CPU platform. Baidu Cloud's Btune tool provides automated, multi‑dimensional analysis and actionable optimization suggestions; in the test described here, a glibc upgrade and NUMA tuning cut a test program's execution time by 36.8%.

Baidu Geek Talk

Background

Developers often encounter surprising performance drops after moving services to newer CPU platforms or launching new workloads, with latency sometimes doubling despite the hardware upgrade.

Btune Overview

Btune, Baidu Intelligent Cloud's Application Performance Diagnostic Tool, offers one‑click performance tuning for cloud workloads. It leverages Baidu’s extensive experience across Intel, AMD, and ARM CPUs and various business scenarios (recommendation, search, advertising, big data, databases, video transcoding) to automatically identify bottlenecks and generate optimization recommendations.

Analysis Dimensions

CPU, memory, disk, network, and concurrency are examined.

Analysis spans application, runtime, system, and hardware layers.

Test Case Description

The test case is a simple C program that repeatedly calls memset and memcpy on two large arrays (1,000,000,000 ints each, about 4 GB per array with a 4‑byte int). It is run under numactl to simulate cross‑NUMA memory access. The source code is:

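#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ARRAY_SIZE 1000000000

int main(void) {
    /* Two buffers of ARRAY_SIZE ints each. */
    int *a = malloc(sizeof(int) * ARRAY_SIZE);
    int *b = malloc(sizeof(int) * ARRAY_SIZE);
    if (a == NULL || b == NULL) {
        fprintf(stderr, "malloc failed\n");
        return 1;
    }
    /* Loop forever: the process stays busy until it is killed. */
    while (1) {
        memset(a, 0, sizeof(int) * ARRAY_SIZE);
        memset(b, 0, sizeof(int) * ARRAY_SIZE);
        memcpy(b, a, sizeof(int) * ARRAY_SIZE);
    }
    return 0;
}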

Step‑by‑Step Usage of Btune

Log in to the Baidu Cloud console and create a cloud server instance.

Upload and start the test program on the instance.

Open the “Self‑service Diagnosis” tool, select “Performance Detection”, choose the server and the test process, and start data collection.

After a few minutes, view the analysis summary report, which lists bottlenecks and optimization suggestions.

Open the detailed report to explore CPU, memory, network, disk, and concurrency metrics, as well as hotspot functions and flame graphs.

Analysis Findings

The summary report identified two key suggestions:

Upgrade the glibc library (hotspot functions memset and memcpy benefit from glibc 2.33).

Reduce cross‑NUMA memory accesses (currently 100% of the program's memory accesses are cross‑NUMA).

Detailed diagnostics showed:

CPU: No issues with kernel networking, storage, or scheduling; primary risk lies in glibc hotspot functions.

Memory: No leaks; the process uses anonymous huge pages, but cross‑NUMA access is high.

Concurrency: Single thread, no split‑lock or context‑switch problems.

Optimization Results

Applying only the NUMA suggestion (eliminating cross‑NUMA access while keeping glibc 2.17) reduced execution time from 2.576 s to 1.821 s, a 29.3% improvement. Applying both suggestions (upgrading to glibc 2.33 and eliminating cross‑NUMA access) reduced it further to 1.626 s, a total gain of 36.8%.

Conclusion

Btune enables even junior operations engineers to perform high‑level performance tuning by automatically locating bottlenecks across multiple dimensions and providing concrete, actionable recommendations, delivering significant speedups for cloud applications.
