
Scudo’s Evolution in Android 15: Optimizing Speed, Safety, and Memory Efficiency

This article examines how Android’s Scudo memory allocator has been refined over the past three and a half years, detailing its four design goals, the cache and thread‑specific data mechanisms that make allocation fast, the strategies that balance safe memory release against performance, and the group‑based approach that reduces fragmentation while preserving security.


Three and a half years ago, when Scudo was first introduced in Android 11, I wrote an introduction to it. Even today, many manufacturers still avoid using it.

Allocator Goals

A good memory allocator should be invisible to applications, which translates into three classic goals: fast allocation, fast deallocation, and efficient memory usage. Scudo adds a fourth goal: safe memory access, meaning detection of illegal actions such as use‑after‑free or double‑free, and increasing address randomness to thwart attackers.

The safety goal imposes overhead that hurts performance and memory efficiency. For example, Scudo shuffles memory on allocation, which is slower than sequential allocation. This tension—system vendors prioritizing safety while app developers prioritize speed and memory—explains Scudo’s limited adoption, but the allocator has continued to evolve.

1. Fast Allocation

Two factors affect allocation speed: finding free memory and handling contention when multiple threads allocate simultaneously. Scudo addresses these with a Cache hierarchy and Thread‑Specific Data (TSD).

The Cache uses a multi‑level design. The first‑level cache is an array that holds a batch of memory blocks; allocation simply pops the last element, which is fast and cache‑friendly. When the first level is empty, the second level, a linked list of arrays, supplies more blocks. If the second level is also empty, Scudo mmaps a new region from the OS. This hierarchy reduces the time spent searching for free memory.
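A minimal sketch of the idea in C++ (hypothetical names, not Scudo’s actual classes): the first‑level cache is just a fixed array of block pointers, and allocation pops the last element.

// Hypothetical sketch of a first-level cache: a fixed array of free blocks.
// Allocation pops the last entry; when the array is empty, it refills a batch
// from the second level (a linked list of such arrays).
struct PerClassCache {
  static constexpr int kMaxCached = 64;
  void *Blocks[kMaxCached];
  int Count = 0;

  bool RefillFromSecondLevel();  // pulls a batch from the second-level list (not shown)

  void *Allocate() {
    if (Count == 0 && !RefillFromSecondLevel())
      return nullptr;            // second level empty: a new region must be mmapped
    return Blocks[--Count];      // pop the last element: O(1) and cache-friendly
  }
};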

TSD assigns each thread its own first‑level cache, eliminating contention on the shared cache. By creating multiple first‑level caches and binding them to threads, Scudo reduces allocation wait time caused by lock contention.
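Conceptually this amounts to giving each thread its own cache, as in the simplified sketch below (Scudo’s real TSD layer is more elaborate and also supports a shared pool of TSDs):

// Simplified sketch: each thread owns a first-level cache (the PerClassCache
// sketched above), so the common allocation path takes no lock at all.
thread_local PerClassCache TlsCache;

void *FastAlloc() {
  return TlsCache.Allocate();    // no contention with other threads
}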

2. Fast Deallocation

Deallocation mirrors allocation. Freed memory is first returned to the first‑level cache, then to the second level, and finally to the OS. Returning pages to the OS is costly, so the fastest path is to keep pages cached, but that can waste memory because the process’s RSS does not shrink.
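The free path can be sketched the same way, reusing the hypothetical PerClassCache from above: push the block back into the thread’s cache, and drain a batch to the second level only when the cache is full.

// Sketch of the mirrored free path (hypothetical names).
void DrainBatchToSecondLevel(PerClassCache &Cache);   // hands a batch back to the linked list

void Deallocate(PerClassCache &Cache, void *Block) {
  if (Cache.Count == PerClassCache::kMaxCached)
    DrainBatchToSecondLevel(Cache);     // slow path: move a batch to the second level
  Cache.Blocks[Cache.Count++] = Block;  // fast path: no syscall, the process RSS does not shrink
}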

Two main costs make page return expensive:

Fragmentation requires scanning memory to identify fully free pages.

The madvise system call used to release pages adds latency.
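The second cost is concrete: on Linux/Android the release itself is an madvise call per range, roughly as sketched below, so every fully free run of pages handed back costs a syscall.

#include <sys/mman.h>
#include <cstddef>

// Sketch: return a page-aligned, fully free range to the OS.
// The pages stay mapped, but the kernel may reclaim their physical memory,
// so the process RSS drops; each call is a syscall, hence the latency cost.
void ReleasePagesToOS(void *PageAlignedStart, size_t Length) {
  madvise(PageAlignedStart, Length, MADV_DONTNEED);
}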

Scudo therefore applies three mitigation measures.

First, it avoids returning pages in regions of small blocks unless the freed fraction of the region exceeds a high threshold. Google’s experiment showed that for a 32‑byte block region, returning pages becomes worthwhile only once roughly 97 % of the region has been freed. The data are shown below:

Size: 32
92% freed -> 0% released
93% freed -> 0% released
94% freed -> 0% released
95% freed -> 1% released
96% freed -> 3% released
97% freed -> 7% released
98% freed -> 17% released
99% freed -> 41% released
Size: 48
92% freed -> 0% released
93% freed -> 0% released
94% freed -> 1% released
95% freed -> 3% released
96% freed -> 7% released
97% freed -> 13% released
98% freed -> 27% released
99% freed -> 52% released

From this data Scudo derives a threshold formula:

threshold = (100 - 1 - BlockSize/16) / 100

For 32‑byte blocks this evaluates to (100 - 1 - 2) / 100 = 97 %, and for 48‑byte blocks to (100 - 1 - 3) / 100 = 96 %, matching the data above.

Second, even when free memory exceeds the threshold, Scudo limits page‑return frequency to once per second, preventing short‑term spikes that cause frame drops. Third, it requires a minimum increase in free bytes (delta‑free‑bytes) between returns, ensuring each return is worthwhile.
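Putting the three measures together, the decision to even attempt a release for one region can be sketched as follows (field names are hypothetical; the percentage threshold is the formula shown above):

#include <cstdint>

// Sketch of the "is it worth trying to release pages?" check for one region.
struct RegionReleaseState {
  uint64_t LastReleaseNs = 0;           // time of the last release attempt
  uint64_t FreeBytesAtLastRelease = 0;  // free bytes observed at that attempt
};

bool ShouldTryRelease(const RegionReleaseState &S, uint64_t NowNs,
                      uint64_t FreeBytes, uint64_t AllocatedBytes,
                      uint64_t BlockSize, uint64_t MinDeltaFreeBytes) {
  // Measure 1: only bother when the free fraction exceeds the size-dependent threshold.
  const uint64_t ThresholdPercent = 100 - 1 - BlockSize / 16;
  if (FreeBytes * 100 < AllocatedBytes * ThresholdPercent)
    return false;
  // Measure 2: rate-limit release attempts to at most once per second.
  if (NowNs - S.LastReleaseNs < 1000000000ULL)
    return false;
  // Measure 3: require enough newly freed bytes since the last attempt.
  if (FreeBytes - S.FreeBytesAtLastRelease < MinDeltaFreeBytes)
    return false;
  return true;
}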

Beyond these measures, I suggested two further improvements to Google: using a bitmap to accelerate the scan for free pages, and offloading page‑return work to a dedicated thread so that performance‑sensitive threads such as the UI thread are not affected. Google indicated that the bitmap idea had already been considered and that the threading solution is under development.

3. Efficient Memory Use

Efficient use means minimizing memory consumption while satisfying demand, which largely hinges on fragmentation control. Scudo lacks compaction or copying, so it cannot reorganize existing allocations. Instead, it tries to concentrate new allocations in contiguous regions, but this conflicts with its security goal of randomizing addresses.

Google’s compromise splits each 256 MiB region into many 256 KiB groups. Allocation always prefers the group at the head of the free list, which concentrates new allocations and reduces fragmentation, while the block order within a group remains random to preserve security.
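In terms of sizes, the layout is easy to picture (a sketch using the numbers above; names are hypothetical):

#include <cstdint>

// Sketch: a 256 MiB region is divided into 256 KiB groups, i.e. 1024 groups.
constexpr uint64_t kRegionSize = 256ULL << 20;                   // 256 MiB
constexpr uint64_t kGroupSize  = 256ULL << 10;                   // 256 KiB
constexpr uint64_t kGroupsPerRegion = kRegionSize / kGroupSize;  // 1024

// A block's group is determined by its offset inside the region; allocation
// prefers the group at the head of the free list, while block order inside a
// group stays randomized.
uint64_t GroupIndex(uint64_t BlockOffsetInRegion) {
  return BlockOffsetInRegion / kGroupSize;
}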

This grouping also benefits page return: each group can be reclaimed independently, allowing Scudo to target high‑efficiency groups without scanning the entire region.

Conclusion

From a broader perspective, the optimizations mirror everyday problem‑solving: prioritize the main conflict (Cache sacrifices memory for speed), apply cost‑benefit analysis (choose when to return pages), and use divide‑and‑conquer (group segmentation). Observing source code through such lenses reveals the design’s practical roots and makes the allocator’s behavior more approachable.

Tags: Cache, Android, Security, Thread Specific Data, Memory Allocator, Scudo
Written by Linux Code Review Hub

A professional Linux technology community and learning platform covering the kernel, memory management, process management, file system and I/O, performance tuning, device drivers, virtualization, and cloud computing.
