Why Async Can Consume More Memory Than Threads: A Multi‑Language Benchmark
This article benchmarks async versus multithreaded programming across Rust, Go, Java, C#, Node.js, Python, and Elixir, measuring peak memory usage for 1, 10 000, 100 000 and 1 000 000 concurrent tasks to reveal surprising scalability trade‑offs.
In this article I explore the memory consumption differences between asynchronous and multithreaded programming, covering popular languages such as Rust, Go, Java, C#, Python, Node.js, and Elixir.
After noticing more than a 20× memory gap between programs handling 10 000 network connections, I created a comprehensive benchmark.
Benchmark
The benchmark launches N concurrent tasks, each waiting 10 seconds, then exits. The task count is controlled via a command‑line argument. All source code is available on my GitHub.
Rust
Three Rust programs were written: a traditional thread version and two async versions (Tokio and async‑std).
let mut handles = Vec::new();
for _ in 0..num_threads {
let handle = thread::spawn(|| {
thread::sleep(Duration::from_secs(10));
});
handles.push(handle);
}
for handle in handles {
handle.join().unwrap();
}Async‑Tokio version:
let mut tasks = Vec::new();
for _ in 0..num_tasks {
tasks.push(task::spawn(async {
time::sleep(Duration::from_secs(10)).await;
}));
}
for task in tasks {
task.await.unwrap();
}Go
var wg sync.WaitGroup
for i := 0; i < numRoutines; i++ {
wg.Add(1)
go func() {
defer wg.Done()
time.Sleep(10 * time.Second)
}()
}
wg.Wait()Java
Traditional thread version:
List<Thread> threads = new ArrayList<>();
for (int i = 0; i < numTasks; i++) {
Thread thread = new Thread(() -> {
try {
Thread.sleep(Duration.ofSeconds(10));
} catch (InterruptedException e) {}
});
thread.start();
threads.add(thread);
}
for (Thread thread : threads) {
thread.join();
}Virtual‑thread (preview) version, almost identical:
List<Thread> threads = new ArrayList<>();
for (int i = 0; i < numTasks; i++) {
Thread thread = Thread.startVirtualThread(() -> {
try {
Thread.sleep(Duration.ofSeconds(10));
} catch (InterruptedException e) {}
});
threads.add(thread);
}
for (Thread thread : threads) {
thread.join();
}C#
List<Task> tasks = new List<Task>();
for (int i = 0; i < numTasks; i++) {
Task task = Task.Run(async () => {
await Task.Delay(TimeSpan.FromSeconds(10));
});
tasks.Add(task);
}
await Task.WhenAll(tasks);Node.js
const delay = util.promisify(setTimeout);
const tasks = [];
for (let i = 0; i < numTasks; i++) {
tasks.push(delay(10000));
}
await Promise.all(tasks);Python
async def perform_task():
await asyncio.sleep(10)
tasks = []
for _ in range(num_tasks):
task = asyncio.create_task(perform_task())
tasks.append(task)
await asyncio.gather(*tasks)Elixir
tasks =
for _ <- 1..num_tasks do
Task.async(fn ->
:timer.sleep(10000)
end)
end
Task.await_many(tasks, :infinity)Test Environment
Hardware: Intel(R) Xeon(R) CPU E3-1505M v6 @ 3.00GHz
OS: Ubuntu 22.04 LTS
Rust 1.69
Go 1.18.1
Java OpenJDK 21‑ea
.NET 6.0.116
Node.js v12.22.9
Python 3.10.6
Elixir 1.12.2 (Erlang/OTP 24)
All programs were run in release mode; other options remained default.
Results
Minimal Memory Usage
With a single task, Go and Rust (statically compiled) use the least memory, while managed runtimes consume more. .NET shows the worst baseline, though the gap can be tuned.
Figure 1: Peak memory for one task.
10 000 Tasks
Figure 2: Peak memory for 10 000 tasks.
Threads were expected to lose, but Java threads consumed ~250 MB, while native Rust threads stayed below many runtimes’ idle memory. Go’s goroutines used ~50 % more memory than Rust threads, contrary to expectations.
100 000 Tasks
Figure 3: Peak memory for 100 000 tasks.
Thread‑based benchmarks could not run 100 000 native threads, so they were omitted. Go fell behind Rust, Java, C#, and Node.js, and .NET’s memory remained almost unchanged, likely due to pre‑allocation.
1 000 000 Tasks
Figure 4: Peak memory for 1 000 000 tasks.
C#’s memory grew but stayed competitive, even beating one Rust runtime. Go’s memory usage exploded, becoming over 12× the best performer and more than twice Java’s, contradicting the common belief that Go is lightweight. Rust’s Tokio remained the most memory‑efficient at this scale.
Final Thoughts
Massive concurrency can consume large amounts of memory even without heavy computation. Different runtimes have distinct trade‑offs: some excel with few tasks, others scale better under massive load. Not all runtimes handle extreme concurrency out‑of‑the‑box.
This comparison focuses solely on memory; task start‑up time and communication speed are also important. At one million tasks, most programs took over 12 seconds to finish. Future benchmarks will examine those aspects.
Comments from the community provide additional insights and tips, such as avoiding unnecessary Task.Run(...) in C#, understanding Go’s stack allocation, and tuning Erlang/Elixir process limits.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
