Debugging Rust Memory Leaks in Frontend Services: Tools, Techniques, and Real-World Fixes
This article walks through a real-world investigation of memory leaks in a Rust-based frontend service, detailing common leak scenarios, profiling tools like tokio-console and jemalloc, load testing with k6, and the step-by-step analysis that uncovered regex misuse and cache bugs, ultimately stabilizing memory usage.
Problem Description
After each deployment of the frontend gray-release (canary) service, memory usage grows continuously until the process runs out of memory (OOM).
Typical Rust Memory Leak Scenarios
Circular references that prevent objects from being released.
Misuse of Box or Rc, e.g. deliberately leaking with Box::leak or accumulating Rc clones that are never dropped.
Forgetting to release resources such as memory, files, or network connections.
These scenarios guide the initial investigation direction.
Investigation Tools
tokio-console
The service uses the tokio async runtime. tokio-console inspects the runtime and shows the state of asynchronous tasks.
# Install the console client; the service itself must also initialize
# the console-subscriber instrumentation at startup
cargo install --locked tokio-console
# tokio-console's default port is 6669; this service exposes it on 5555
tokio-console http://127.0.0.1:5555
Running the command displays most tasks in the expected state, with no obvious async task leaks.
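The initialization code mentioned above is roughly the following setup fragment. It assumes the console-subscriber crate is in Cargo.toml and that the binary is built with RUSTFLAGS="--cfg tokio_unstable", which tokio-console requires:

```rust
// main.rs — must run before any tasks are spawned
fn main() {
    // Starts the instrumentation server that tokio-console connects to.
    // The listen address can be overridden with the TOKIO_CONSOLE_BIND
    // environment variable (default 127.0.0.1:6669).
    console_subscriber::init();

    // ... build the tokio runtime and start the service as usual
}
```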
jemalloc Profiling
Since async tasks are not leaking, we observe memory allocation using jemalloc. Two crates are added:
tikv-jemallocator – replaces the default allocator.
jemalloc_pprof – provides functions to export profile files.
Export the profile via an HTTP endpoint, and enable profiling with an environment variable (lg_prof_interval:28 dumps a profile every 2^28 bytes, roughly 256 MiB, of allocation activity):
export _RJEM_MALLOC_CONF=prof:true,lg_prof_interval:28
Analyze the profile with jeprof or flamegraph.pl to generate flame graphs.
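Swapping in the allocator is a one-line setup fragment. This sketch assumes the two crates above are declared in Cargo.toml; the HTTP wiring that serves the dump from jemalloc_pprof is omitted:

```rust
// Replace the default global allocator with jemalloc so the heap
// profiling enabled via _RJEM_MALLOC_CONF sees all allocations.
#[global_allocator]
static ALLOC: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc;

fn main() {
    // ... service startup; a jemalloc_pprof dump handler can be mounted
    // on an HTTP route (e.g. the /debug/profile/prof endpoint used below)
}
```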
# Export profile and analyze with jeprof
dump-profile() {
  curl http://127.0.0.1:3000/debug/profile/prof > "$1.prof"
  jeprof --svg ./target/release/page-server "$1.prof" > "$1.svg"
  # Generate a flame graph using flamegraph.pl
  jeprof ./target/release/page-server "$1.prof" --collapse | perl flamegraph.pl > "$1.flamegraph.svg"
}
dump-profile 1
k6 Load Testing
k6 is used to simulate real-world requests with a JavaScript test script, which makes it friendly for frontend performance testing.
import http from 'k6/http';
import { sleep } from 'k6';
export const options = {
  vus: 10, // concurrent virtual users
  duration: '30s',
};

export default function () {
  // `path` is not defined in the original script; a placeholder route is
  // used here so the snippet runs as-is.
  const path = 'index.html';
  const headers = {
    host: `some-project.huolala.cn`,
  };
  // In k6, `timeout` is a request parameter, not a header.
  http.get(`http://127.0.0.1:3001/${path}`, { headers, timeout: '1s' });
  sleep(1);
}
Investigation Steps
Export online profile files for analysis.
Use k6 to reproduce the issue and isolate a minimal demo.
Locate the problematic code and fix the bug.
Findings and Solutions
Regex Misuse in Async Functions
Profile analysis showed high memory usage in a regex match inside an async function. Memory grew when the regex was used together with await: holding the match state across await points prevented reuse of the compiled regex's internal buffers, so new memory was allocated on each call.
Adding a Drop implementation confirmed that the future’s memory is released when cancelled, but the regex itself still caused growth.
// Allocate a zero-filled 10 MiB buffer inside the future
// (Vec::with_capacity alone sets len = 0, so fill(0) would write nothing)
let buf: Vec<u8> = vec![0u8; 1024 * 1024 * 10];
for caps in HTML_TAG_REGEX.captures_iter(content) {
    // ...
}
After testing, the memory footprint stabilized at about 8.1 MiB, indicating that cancelling the future frees its memory.
By separating the regex capture from the async await, memory growth stopped:
// Before: the regex Captures borrow is held across every await point
for caps in HTML_TAG_REGEX.captures_iter(content) {
    async_fn().await;
}

// After: extract the needed positions up front, so nothing borrowed from
// the regex match lives across the await
let positions: Vec<_> = HTML_TAG_REGEX.captures_iter(content).map(extract_position).collect();
for position in positions {
    async_fn().await;
}
Cache Module Bug
Another memory hotspot was the third‑party rust-s3 library, specifically the hyper::body::to_bytes call. Investigation revealed that the cache module failed to clean up file data, causing continuous memory growth.
After refactoring the cache logic to clear files promptly, memory usage stabilized in production.
Ownership Transfer in Hyper
The hyper::body::to_bytes implementation retains ownership of buffers while awaiting data, which can keep memory alive if not handled correctly.
// Simplified excerpt of hyper::body::to_bytes (the small-chunk fast path
// is omitted): `vec` lives inside the future across every data().await,
// so pending responses hold their accumulated buffers in memory.
pub async fn to_bytes<T>(body: T) -> Result<Bytes, T::Error>
where
    T: HttpBody,
{
    futures_util::pin_mut!(body);
    let mut vec = Vec::new();
    while let Some(buf) = body.data().await {
        vec.put(buf?); // memory concentrates here in the profile
    }
    Ok(vec.into())
}
Conclusion
Establish a stable reproduction method to verify fixes; analyze monitoring data before debugging.
Reduce business code to its core logic to eliminate interference during investigation.
Memory profile files and flame graphs are essential, but a function appearing prominently in a stack trace does not automatically mean it is the source of the leak.