Fundamentals 14 min read

Why Async Can Consume More Memory Than Threads: A Multi‑Language Benchmark

This article benchmarks async versus multithreaded programming across Rust, Go, Java, C#, Node.js, Python, and Elixir, measuring peak memory usage for 1, 10 000, 100 000 and 1 000 000 concurrent tasks to reveal surprising scalability trade‑offs.

21CTO
21CTO
21CTO
Why Async Can Consume More Memory Than Threads: A Multi‑Language Benchmark

In this article I explore the memory consumption differences between asynchronous and multithreaded programming, covering popular languages such as Rust, Go, Java, C#, Python, Node.js, and Elixir.

After noticing more than a 20× memory gap between programs handling 10 000 network connections, I created a comprehensive benchmark.

Benchmark

The benchmark launches N concurrent tasks, each waiting 10 seconds, then exits. The task count is controlled via a command‑line argument. All source code is available on my GitHub.

Rust

Three Rust programs were written: a traditional thread version and two async versions (Tokio and async‑std).

let mut handles = Vec::new();
for _ in 0..num_threads {
    let handle = thread::spawn(|| {
        thread::sleep(Duration::from_secs(10));
    });
    handles.push(handle);
}
for handle in handles {
    handle.join().unwrap();
}

Async‑Tokio version:

let mut tasks = Vec::new();
for _ in 0..num_tasks {
    tasks.push(task::spawn(async {
        time::sleep(Duration::from_secs(10)).await;
    }));
}
for task in tasks {
    task.await.unwrap();
}

Go

var wg sync.WaitGroup
for i := 0; i < numRoutines; i++ {
    wg.Add(1)
    go func() {
        defer wg.Done()
        time.Sleep(10 * time.Second)
    }()
}
wg.Wait()

Java

Traditional thread version:

List<Thread> threads = new ArrayList<>();
for (int i = 0; i < numTasks; i++) {
    Thread thread = new Thread(() -> {
        try {
            Thread.sleep(Duration.ofSeconds(10));
        } catch (InterruptedException e) {}
    });
    thread.start();
    threads.add(thread);
}
for (Thread thread : threads) {
    thread.join();
}

Virtual‑thread (preview) version, almost identical:

List<Thread> threads = new ArrayList<>();
for (int i = 0; i < numTasks; i++) {
    Thread thread = Thread.startVirtualThread(() -> {
        try {
            Thread.sleep(Duration.ofSeconds(10));
        } catch (InterruptedException e) {}
    });
    threads.add(thread);
}
for (Thread thread : threads) {
    thread.join();
}

C#

List<Task> tasks = new List<Task>();
for (int i = 0; i < numTasks; i++) {
    Task task = Task.Run(async () => {
        await Task.Delay(TimeSpan.FromSeconds(10));
    });
    tasks.Add(task);
}
await Task.WhenAll(tasks);

Node.js

const delay = util.promisify(setTimeout);
const tasks = [];
for (let i = 0; i < numTasks; i++) {
    tasks.push(delay(10000));
}
await Promise.all(tasks);

Python

async def perform_task():
    await asyncio.sleep(10)

tasks = []
for _ in range(num_tasks):
    task = asyncio.create_task(perform_task())
    tasks.append(task)
await asyncio.gather(*tasks)

Elixir

tasks =
    for _ <- 1..num_tasks do
        Task.async(fn ->
            :timer.sleep(10000)
        end)
    end
Task.await_many(tasks, :infinity)

Test Environment

Hardware: Intel(R) Xeon(R) CPU E3-1505M v6 @ 3.00GHz

OS: Ubuntu 22.04 LTS

Rust 1.69

Go 1.18.1

Java OpenJDK 21‑ea

.NET 6.0.116

Node.js v12.22.9

Python 3.10.6

Elixir 1.12.2 (Erlang/OTP 24)

All programs were run in release mode; other options remained default.

Results

Minimal Memory Usage

With a single task, Go and Rust (statically compiled) use the least memory, while managed runtimes consume more. .NET shows the worst baseline, though the gap can be tuned.

Figure 1: Peak memory for one task.

10 000 Tasks

Figure 2: Peak memory for 10 000 tasks.

Threads were expected to lose, but Java threads consumed ~250 MB, while native Rust threads stayed below many runtimes’ idle memory. Go’s goroutines used ~50 % more memory than Rust threads, contrary to expectations.

100 000 Tasks

Figure 3: Peak memory for 100 000 tasks.

Thread‑based benchmarks could not run 100 000 native threads, so they were omitted. Go fell behind Rust, Java, C#, and Node.js, and .NET’s memory remained almost unchanged, likely due to pre‑allocation.

1 000 000 Tasks

Figure 4: Peak memory for 1 000 000 tasks.

C#’s memory grew but stayed competitive, even beating one Rust runtime. Go’s memory usage exploded, becoming over 12× the best performer and more than twice Java’s, contradicting the common belief that Go is lightweight. Rust’s Tokio remained the most memory‑efficient at this scale.

Final Thoughts

Massive concurrency can consume large amounts of memory even without heavy computation. Different runtimes have distinct trade‑offs: some excel with few tasks, others scale better under massive load. Not all runtimes handle extreme concurrency out‑of‑the‑box.

This comparison focuses solely on memory; task start‑up time and communication speed are also important. At one million tasks, most programs took over 12 seconds to finish. Future benchmarks will examine those aspects.

Comments from the community provide additional insights and tips, such as avoiding unnecessary Task.Run(...) in C#, understanding Go’s stack allocation, and tuning Erlang/Elixir process limits.
Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

multithreadingBenchmarkAsyncMemory Usage
21CTO
Written by

21CTO

21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.