In‑Depth Analysis of dlmalloc, jemalloc, Scudo, and PartitionAlloc for Virtual‑Machine Memory Management

This article examines the design goals, key implementation details, strengths and weaknesses of four widely used memory allocators—dlmalloc, jemalloc, Scudo, and PartitionAlloc—highlighting how they address fragmentation, performance, and security in virtual‑machine runtimes and offering guidance for building efficient, safe allocators.

ByteDance Web Infra

Aug 19, 2022

Memory is a core resource in modern computer architectures, and a virtual machine (VM) must manage it efficiently through allocation, garbage collection, and reclamation. This article explores the design philosophy behind common memory allocators and provides guidance for implementing high‑performance VM memory management.

1. dlmalloc

Key Points

Obtains large memory segments from the OS via sbrk or mmap, linking them in a list of Segments.

Divides each Segment into chunks; small chunks are managed by small bins (simple doubly‑linked lists) and larger chunks by tree bins (size‑based tries).

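Allocation strategy depends on request size: small requests (under 0x100, i.e. 256 bytes) are served from the small bins, large requests from the tree bins, and huge requests (over 64 KB) are mapped directly with mmap.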
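The size-based routing described above can be sketched as follows. This is a simplified illustration rather than dlmalloc's actual code: the constants (16-byte alignment, 8 bytes of per-chunk overhead, 32-byte minimum chunk, 8-byte bin spacing) are assumptions chosen to resemble a 64-bit build, the 0x100 and 64 KB thresholds come from the text, and the names `padRequest`, `smallBinIndex`, and `classify` are hypothetical.

```cpp
#include <cstddef>

// Assumed constants, loosely modeled on a 64-bit dlmalloc build.
constexpr std::size_t kChunkAlign = 16;            // chunk alignment
constexpr std::size_t kChunkOverhead = 8;          // size field stored with each chunk
constexpr std::size_t kMinChunkSize = 32;          // smallest chunk handed out
constexpr std::size_t kSmallBinShift = 3;          // small bins are spaced 8 bytes apart
constexpr std::size_t kMinLargeSize = 0x100;       // chunks this big go to tree bins
constexpr std::size_t kMmapThreshold = 64 * 1024;  // larger requests are mapped directly

enum class Path { SmallBin, TreeBin, DirectMmap };

// Round a user request up to a chunk size: add overhead, then align.
constexpr std::size_t padRequest(std::size_t request) {
  const std::size_t padded =
      (request + kChunkOverhead + kChunkAlign - 1) & ~(kChunkAlign - 1);
  return padded < kMinChunkSize ? kMinChunkSize : padded;
}

// Small bins hold chunks of exactly one size, so the index is a shift.
constexpr std::size_t smallBinIndex(std::size_t chunkSize) {
  return chunkSize >> kSmallBinShift;
}

// Pick the allocation path for a request.
constexpr Path classify(std::size_t request) {
  if (request > kMmapThreshold) return Path::DirectMmap;
  return padRequest(request) < kMinLargeSize ? Path::SmallBin : Path::TreeBin;
}
```

With these assumed constants a 100-byte request becomes a 112-byte chunk in small bin 14, a 300-byte request goes to the tree bins, and a 100,000-byte request bypasses the bins entirely.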

Weaknesses

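Not thread-safe by default; the only protection available is an optional global lock, which serializes every allocation. Android's Bionic has replaced it with more modern heap implementations.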

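Adjacent free chunks are coalesced using boundary tags, which limits external fragmentation, but per-chunk headers and size rounding still waste space on small allocations (internal fragmentation).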

2. jemalloc (Android 5.0+)

Designed to improve multithreaded performance and reduce fragmentation.

Key Points

Allocates a large Chunk (typically 512 KB) via mmap, split into a header and data region.

The data region is divided into Runs, each managed by a bin (bucket) for a specific size class.

Each thread has a tcache (thread‑local free list) to avoid contention; arenas are locked only when necessary.

Three allocation paths: Small Object (from Run), Large Object (single Run), Huge Object (>4 MB, direct mmap).
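The interaction between a tcache and an arena bin can be illustrated with a toy model. This is a sketch under simplifying assumptions, not jemalloc's implementation: `ArenaBin` and `ThreadCache` are hypothetical names, the run is a plain byte vector, and the batch size of 8 and cache limit of 16 are arbitrary. In a real allocator the cache would be a `thread_local` object, one per thread.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Shared bin for one size class; every access takes the lock.
class ArenaBin {
 public:
  ArenaBin(std::size_t slotSize, std::size_t slotCount)
      : run_(slotSize * slotCount) {
    // Carve the run into equally sized slots.
    for (std::size_t i = 0; i < slotCount; ++i)
      free_.push_back(run_.data() + i * slotSize);
  }

  // Move up to `want` free slots into a thread cache with one lock acquisition.
  std::size_t refill(std::vector<void*>& cache, std::size_t want) {
    std::lock_guard<std::mutex> lock(mu_);
    ++lockAcquisitions;
    std::size_t moved = 0;
    while (moved < want && !free_.empty()) {
      cache.push_back(free_.back());
      free_.pop_back();
      ++moved;
    }
    return moved;
  }

  // Return up to `count` slots from a thread cache to the bin.
  void flush(std::vector<void*>& cache, std::size_t count) {
    std::lock_guard<std::mutex> lock(mu_);
    ++lockAcquisitions;
    while (count-- > 0 && !cache.empty()) {
      free_.push_back(cache.back());
      cache.pop_back();
    }
  }

  std::size_t lockAcquisitions = 0;  // instrumentation for the example only

 private:
  std::mutex mu_;
  std::vector<unsigned char> run_;
  std::vector<void*> free_;
};

// Per-thread front end: most operations never touch the arena lock.
class ThreadCache {
 public:
  explicit ThreadCache(ArenaBin& bin) : bin_(bin) {}

  void* allocate() {
    if (slots_.empty() && bin_.refill(slots_, kBatch) == 0) return nullptr;
    void* p = slots_.back();
    slots_.pop_back();
    return p;
  }

  void deallocate(void* p) {
    slots_.push_back(p);
    if (slots_.size() > kMaxCached) bin_.flush(slots_, kBatch);
  }

 private:
  static constexpr std::size_t kBatch = 8;
  static constexpr std::size_t kMaxCached = 16;
  ArenaBin& bin_;
  std::vector<void*> slots_;
};
```

In this model eight consecutive allocations cost a single lock acquisition, because the first one refills the cache in a batch and the next seven are served locally.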

Summary

jemalloc’s core is a bin allocator; each arena has logical bins for size classes, minimizing fragmentation.

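Threads are distributed across arenas, each guarded by its own lock, so threads assigned to different arenas do not contend with one another.

Per-thread caches (tcache) serve most small requests without taking any arena lock, which reduces contention further.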
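To see why size-class bins bound internal fragmentation, consider a rounding rule with four classes per power-of-two doubling. The sketch below assumes the small size-class spacing used by later jemalloc releases (8, 16, 32, 48, ..., 128, 160, 192, ...); the jemalloc version shipped with a given Android release may use different classes, and `roundToSizeClass` is a hypothetical name.

```cpp
#include <cstddef>

// Round a request up to its size class (assumed spacing, see lead-in).
constexpr std::size_t roundToSizeClass(std::size_t size) {
  if (size <= 8) return 8;
  if (size <= 128) return (size + 15) & ~std::size_t{15};  // 16-byte steps
  // Above 128 bytes: four classes per doubling, so the step is a quarter
  // of the largest power of two strictly below `size`.
  std::size_t lg = 0;  // floor(log2(size - 1))
  for (std::size_t v = size - 1; v > 1; v >>= 1) ++lg;
  const std::size_t delta = std::size_t{1} << (lg - 2);
  return (size + delta - 1) & ~(delta - 1);
}
```

Under this rule a 129-byte request lands in the 160-byte class, so at most 31 of 160 bytes (about 19%) are wasted, and the same proportional bound holds for every larger class.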

3. Scudo (Android 11+)

Scudo (named after “escudo”, meaning shield) is the default allocator for native code on Android 11, focusing on security.

Key Points

Scudo defines Primary and Secondary allocators. Requests < 64 KB use the Primary allocator; larger requests use the Secondary allocator.

struct AndroidConfig {
  using SizeClassMap = AndroidSizeClassMap;
#if SCUDO_CAN_USE_PRIMARY64
  // 256 MB regions (2^28); the two 1000s are the min/max
  // release-to-OS intervals in milliseconds.
  typedef SizeClassAllocator64<SizeClassMap, 28U, 1000, 1000,
                               /*MaySupportMemoryTagging=*/true> Primary;
#else
  // 256 KB regions (2^18)
  typedef SizeClassAllocator32<SizeClassMap, 18U, 1000, 1000> Primary;
#endif
  // Cache up to 32 freed blocks of at most 2 MB each
  typedef MapAllocator<MapAllocatorCache<32U, 2UL << 20, 0, 1000>> Secondary;
  template <class A>
  using TSDRegistryT = TSDRegistrySharedT<A, 2U>; // shared, max 2 TSDs
};

The Primary allocator reserves a virtual address space, splits it into size classes, and uses random offsets to mitigate address‑space attacks. Each allocated chunk carries a header with class ID, state, origin, size, and a checksum for integrity verification.

NOINLINE void* allocate(uptr Size, Chunk::Origin Origin,
                         uptr Alignment = MinAlignment,
                         bool ZeroContents = false) {
  initThreadMaybe();
  // ... fast path using thread‑local cache ...
  if (PrimaryT::canAllocate(NeededSize)) {
    // allocate from Primary
  } else {
    // fallback to Secondary
  }
  // Align user pointer, set chunk header, compute checksum
  return TaggedPtr;
}
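The header integrity check mentioned above can be sketched as follows. This is an illustrative model, not Scudo's code: the `ChunkHeader` layout, the CRC-16 routine, and the names `sealHeader` and `verifyHeader` are assumptions. The point it shows is that the checksum covers a per-process secret cookie, the chunk address, and the header fields, so a header that is overwritten or copied to another address fails verification.

```cpp
#include <cstdint>

enum ChunkState : std::uint8_t { kAvailable = 0, kAllocated = 1, kQuarantined = 2 };

// Simplified header; the field set follows the text, the layout is assumed.
struct ChunkHeader {
  std::uint8_t classId;
  std::uint8_t state;
  std::uint8_t origin;
  std::uint32_t size;
  std::uint16_t checksum;
};

// Bitwise CRC-16 (polynomial 0x1021) over the eight bytes of `word`.
inline std::uint16_t crc16Update(std::uint16_t crc, std::uint64_t word) {
  for (int byte = 0; byte < 8; ++byte) {
    crc ^= static_cast<std::uint16_t>(((word >> (8 * byte)) & 0xFF) << 8);
    for (int bit = 0; bit < 8; ++bit) {
      crc = (crc & 0x8000)
                ? static_cast<std::uint16_t>((crc << 1) ^ 0x1021)
                : static_cast<std::uint16_t>(crc << 1);
    }
  }
  return crc;
}

// Checksum over cookie, chunk address, and every header field except the checksum.
inline std::uint16_t headerChecksum(std::uint32_t cookie, std::uintptr_t userPtr,
                                    const ChunkHeader& h) {
  const std::uint64_t packed = std::uint64_t{h.classId} |
                               (std::uint64_t{h.state} << 8) |
                               (std::uint64_t{h.origin} << 16) |
                               (std::uint64_t{h.size} << 32);
  std::uint16_t crc = 0xFFFF;
  crc = crc16Update(crc, cookie);
  crc = crc16Update(crc, static_cast<std::uint64_t>(userPtr));
  return crc16Update(crc, packed);
}

// Called when a chunk is handed out or its state changes.
inline void sealHeader(std::uint32_t cookie, std::uintptr_t userPtr, ChunkHeader& h) {
  h.checksum = headerChecksum(cookie, userPtr, h);
}

// Called on deallocation; a mismatch indicates corruption or a forged header.
inline bool verifyHeader(std::uint32_t cookie, std::uintptr_t userPtr,
                         const ChunkHeader& h) {
  return h.checksum == headerChecksum(cookie, userPtr, h);
}
```

A typical use is to seal the header in `allocate` and verify it at the start of `deallocate`, aborting the process on mismatch. Checking that the state is "allocated" at the same point is also what lets an allocator of this kind detect double frees.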

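Scudo also uses thread-specific data (TSD) structures, each holding a local cache of free blocks, to accelerate multithreaded allocations; in the Android configuration above, at most two TSDs are shared among all threads. The Secondary allocator places guard pages around each large block.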
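Guard pages around a large block can be illustrated with plain POSIX calls. The sketch below is a minimal model of the idea, not the Secondary allocator itself: `GuardedBlock`, `mapGuarded`, and `unmapGuarded` are hypothetical names, and it assumes a POSIX system with `mmap` and `mprotect`. The whole range is reserved as inaccessible, then only the interior is made readable and writable, so a linear overflow past either end faults immediately.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>

struct GuardedBlock {
  void* base;              // start of the mapping (front guard page)
  std::size_t mappedSize;  // total bytes mapped, guards included
  void* user;              // first usable byte
  std::size_t userSize;    // usable bytes, rounded up to whole pages
};

inline std::size_t pageSize() {
  return static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
}

// Map `size` bytes with one inaccessible page before and after.
inline bool mapGuarded(std::size_t size, GuardedBlock& out) {
  const std::size_t page = pageSize();
  const std::size_t body = (size + page - 1) / page * page;
  const std::size_t total = body + 2 * page;

  // Reserve everything with no access rights.
  void* base = mmap(nullptr, total, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base == MAP_FAILED) return false;

  // Open up only the interior; the first and last pages stay PROT_NONE.
  char* user = static_cast<char*>(base) + page;
  if (mprotect(user, body, PROT_READ | PROT_WRITE) != 0) {
    munmap(base, total);
    return false;
  }
  out = GuardedBlock{base, total, user, body};
  return true;
}

inline void unmapGuarded(const GuardedBlock& block) {
  munmap(block.base, block.mappedSize);
}
```

The cost of this scheme is two extra pages of address space per block plus a system call on each map and unmap, which is one reason it is reserved for large allocations.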

Summary

Scudo improves security through randomization, guard pages, and checksum‑based integrity checks, at the cost of some memory overhead and reduced allocation speed compared to jemalloc.

4. PartitionAlloc

Chromium’s cross‑platform allocator, optimized for client‑side workloads and security.

Key Points

Memory is reserved in 2 MB Super Pages. Each Super Page is divided into Partition Pages, and one or more contiguous Partition Pages form a Slot Span, which holds equally sized slots.

Each bucket manages a collection of slot spans; small allocations are served from per‑thread caches, larger ones from a shared free‑list, and huge allocations via direct mapping.

The first and last Partition Pages of each Super Page serve as guard regions, and metadata is stored apart from the objects it describes.

Four main partitions in Chromium: Buffer, Node, LayoutObject, and FastMalloc, each with tailored allocation strategies.
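Because Super Pages are aligned to their own size, locating the Super Page and Partition Page for any address reduces to masking and dividing. The sketch below illustrates that arithmetic; the 2 MB Super Page size comes from the text, while the 16 KB Partition Page size and the function names are assumptions for illustration.

```cpp
#include <cstdint>

constexpr std::uint64_t kSuperPageSize = 2ull << 20;       // 2 MB, from the text
constexpr std::uint64_t kPartitionPageSize = 16ull << 10;  // 16 KB, assumed
constexpr std::uint64_t kPagesPerSuperPage = kSuperPageSize / kPartitionPageSize;

// Super Pages are size-aligned, so the base is found by clearing the low bits.
constexpr std::uint64_t superPageBase(std::uint64_t addr) {
  return addr & ~(kSuperPageSize - 1);
}

// Index of the Partition Page that contains `addr` within its Super Page.
constexpr std::uint64_t partitionPageIndex(std::uint64_t addr) {
  return (addr & (kSuperPageSize - 1)) / kPartitionPageSize;
}

// The first and last Partition Pages are reserved as guard regions.
constexpr bool isGuardPartitionPage(std::uint64_t addr) {
  const std::uint64_t index = partitionPageIndex(addr);
  return index == 0 || index == kPagesPerSuperPage - 1;
}
```

With these sizes a Super Page contains 128 Partition Pages, of which indices 0 and 127 are guards. The same masking trick lets an allocator find the metadata for a pointer from the pointer value alone, without storing a header next to the object.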

Security Features

Linear overflows are caught by guard pages.

Metadata resides in dedicated regions, so a linear overflow from an object cannot reach and corrupt it.

Large allocations have guard pages at both ends.

Buckets isolate size classes, reducing cross‑size attacks.

Summary

PartitionAlloc offers a small code footprint, high allocation efficiency, and strong security guarantees through address‑space isolation and guard pages, making it suitable for performance‑critical, security‑sensitive applications like Chromium.

5. Overview

Allocator performance and space efficiency depend on trade‑offs among caching, allocation strategies, and security mechanisms. The table below summarizes the main characteristics of each allocator.

| Allocator | Notes |
| --- | --- |
| dlmalloc | Non-thread-safe, high fragmentation, low performance, low security. |
| jemalloc | Large code size, high memory usage, good performance, moderate security. |
| Scudo | Security-focused, moderate performance, low memory efficiency due to metadata and randomization. |
| PartitionAlloc | Small code size, high performance, strong security via isolation and guard pages, cross-platform. |

Benchmark results show varying QPS and RSS footprints across allocators, illustrating the trade‑offs between speed and memory consumption.
