
How HWAddressSanitizer Leverages AArch64 TBI to Detect Memory Errors Efficiently

This article explains the principles behind HWAddressSanitizer (HWASAN), how it uses the AArch64 Top‑Byte‑Ignore hardware feature to provide lower‑overhead memory error detection than ASAN, and walks through practical examples, implementation details, and LLVM integration for C/C++ developers.

ByteDance SYS Tech

Introduction

HWASAN (Hardware‑assisted AddressSanitizer) is a memory‑error detection tool for C/C++ that builds on the Top‑Byte‑Ignore (TBI) feature of AArch64, offering lower memory overhead and broader error coverage compared to the traditional AddressSanitizer (ASAN).

Background

At ByteDance, C++ is widely used, leading to many memory‑related bugs. Existing sanitizers like ASAN have helped fix hundreds of defects, but their performance cost remains a concern. The TBI hardware feature enables a more efficient solution, prompting the STE team to evaluate HWASAN on large‑scale services.

HWASAN Overview

HWASAN is only partially hardware-assisted: it relies on TBI to keep a tag in the top byte of each pointer, while software maintains a matching shadow tag for every 16-byte memory granule and performs the checks.

TBI (Top Byte Ignore) feature of AArch64: bits [63:56] are ignored in address translation and can be used to store a tag.

The following example sets the top byte of a pointer and shows that the program still reads the same value through it.

<code>// $ cat tbi.cpp
#include <cstdint>
#include <cstdio>
#include <cstdlib>

int main(int argc, char **argv) {
  int * volatile x = (int *)malloc(sizeof(int));
  *x = 666;
  printf("address: %p, value: %d\n", (void *)x, *x);
  // Store the tag 0xfe in the top byte; TBI makes the hardware ignore it.
  x = reinterpret_cast<int*>(reinterpret_cast<uintptr_t>(x) | (0xfeULL << 56));
  printf("address: %p, value: %d\n", (void *)x, *x);
  free(x);
  return 0;
}
// $ clang++ tbi.cpp && ./a.out
address: 0xaaab1845fe70, value: 666
address: 0xfe00aaab1845fe70, value: 666</code>

HWASAN detects memory errors by comparing the tag stored in a pointer’s top byte with the tag in shadow memory. If they differ, a violation is reported.
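The tag handling behind this comparison is plain integer arithmetic on the pointer value. The sketch below shows one way to write it; the helper names (`GetTag`, `WithTag`, `StripTag`) are hypothetical and are not taken from the HWASAN runtime.

```cpp
#include <cstdint>

// Hypothetical helpers, not HWASAN runtime API: the tag lives in bits [63:56].
constexpr unsigned kTagShift = 56;
constexpr std::uint64_t kAddressMask = (std::uint64_t{1} << kTagShift) - 1;

// Returns the 8-bit tag stored in the top byte of an address.
constexpr std::uint8_t GetTag(std::uint64_t addr) {
  return static_cast<std::uint8_t>(addr >> kTagShift);
}

// Returns the address with its top byte replaced by `tag`.
constexpr std::uint64_t WithTag(std::uint64_t addr, std::uint8_t tag) {
  return (addr & kAddressMask) | (std::uint64_t{tag} << kTagShift);
}

// Returns the address with the tag cleared, the form used for shadow lookups.
constexpr std::uint64_t StripTag(std::uint64_t addr) {
  return addr & kAddressMask;
}
```

Applied to the addresses printed by the TBI example, `GetTag(0xfe00aaab1845fe70)` yields `0xfe` and `StripTag` recovers `0xaaab1845fe70`.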

Detecting a Heap‑Buffer‑Overflow

<code>// $ cat test.c
#include <stdlib.h>
int main() {
    int * volatile x = (int *)malloc(sizeof(int)*10);
    x[10] = 0; // out‑of‑bounds write
    free(x);
}
</code>

Running the program with HWASAN:

<code>$ clang -fuse-ld=lld -g -fsanitize=hwaddress ./test.c && ./a.out
==3581920==ERROR: HWAddressSanitizer: tag-mismatch on address 0xec2bfffe0028 at pc 0xaaad830db1a4
WRITE of size 4 at 0xec2bfffe0028 ...
SUMMARY: HWAddressSanitizer: tag-mismatch ./test.c:4:11 in main
</code>

The report format is similar to ASAN, but HWASAN’s memory overhead is lower.

Technical Comparison: ASAN vs HWASAN

ASAN uses shadow memory at a 1‑byte‑per‑8‑bytes ratio and relies on redzones and quarantine to detect overflows and use‑after‑free.

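HWASAN also uses shadow memory, but at a ratio of 1 shadow byte per 16 bytes, and it needs neither redzones nor quarantine: overflows and use-after-free are caught because the pointer tag no longer matches the shadow tag. Detection is therefore probabilistic; with 8-bit tags, two unrelated objects end up with the same tag in roughly 1 of 256 cases.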
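To put the two shadow ratios side by side, the arithmetic can be sketched as follows. The function and constant names are hypothetical, and the calculation counts shadow bytes only; it leaves out ASAN's redzones and quarantine as well as any allocator bookkeeping.

```cpp
#include <cstdint>

// Hypothetical names for illustration: bytes of application memory described
// by one shadow byte under each tool.
constexpr std::uint64_t kAsanGranule = 8;     // ASAN: 1 shadow byte per 8 bytes
constexpr std::uint64_t kHwasanGranule = 16;  // HWASAN: 1 shadow byte per 16 bytes

// Shadow bytes needed to cover `app_bytes` of application memory, rounded up
// to whole granules.
constexpr std::uint64_t ShadowBytes(std::uint64_t app_bytes,
                                    std::uint64_t granule) {
  return (app_bytes + granule - 1) / granule;
}
```

For 1 GiB of application memory this gives 128 MiB of shadow at the ASAN ratio and 64 MiB at the HWASAN ratio.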

Algorithm Details

Shadow memory maps each 16‑byte application block to one shadow byte.

All heap/stack/global objects are aligned to 16 bytes.

A random tag is generated for each object and stored both in the pointer’s top byte and in the corresponding shadow byte.

Before every memory access, instrumentation checks that the pointer tag matches the shadow tag.
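The four steps above can be modeled in a few dozen lines. The sketch below is a toy, not the HWASAN runtime: `ToyHeap` and its members are hypothetical names, "pointers" are offsets into a small arena, and retagging on free is an assumption of this model. To keep the example deterministic, the toy never issues the same tag twice in a row, whereas the real scheme is purely random.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <new>
#include <random>

// Toy model only: ToyHeap and its members are hypothetical names, and the
// "pointers" are offsets into a small arena rather than real addresses.
class ToyHeap {
 public:
  static constexpr std::size_t kGranule = 16;
  static constexpr std::size_t kArenaSize = 1024;
  static constexpr unsigned kTagShift = 56;
  static constexpr std::uint64_t kOffsetMask =
      (std::uint64_t{1} << kTagShift) - 1;

  // Rounds `size` up to whole granules, writes a fresh tag into every covered
  // shadow byte, and returns the arena offset with that tag in the top byte.
  std::uint64_t Allocate(std::size_t size) {
    const std::size_t granules = (size + kGranule - 1) / kGranule;
    if (next_granule_ + granules > shadow_.size()) throw std::bad_alloc();
    const std::size_t first = next_granule_;
    next_granule_ += granules;
    const std::uint8_t tag = NewTag();
    for (std::size_t g = first; g < first + granules; ++g) shadow_[g] = tag;
    return (std::uint64_t{tag} << kTagShift) | (first * kGranule);
  }

  // Retags the object's granules so that stale pointers stop matching
  // (an assumption of this model).
  void Free(std::uint64_t ptr, std::size_t size) {
    const std::size_t first = (ptr & kOffsetMask) / kGranule;
    const std::size_t granules = (size + kGranule - 1) / kGranule;
    const std::uint8_t old_tag = static_cast<std::uint8_t>(ptr >> kTagShift);
    std::uint8_t tag = NewTag();
    while (tag == old_tag) tag = NewTag();
    for (std::size_t g = first; g < first + granules; ++g) shadow_[g] = tag;
  }

  // The check performed before each access: pointer tag versus shadow tag.
  bool AccessOk(std::uint64_t ptr) const {
    const std::uint64_t offset = ptr & kOffsetMask;
    const std::uint8_t ptr_tag = static_cast<std::uint8_t>(ptr >> kTagShift);
    return offset < kArenaSize && shadow_[offset / kGranule] == ptr_tag;
  }

 private:
  // Picks a tag in 1..255 that differs from the previously issued one.
  // The "differs from the previous one" rule is a simplification for the toy.
  std::uint8_t NewTag() {
    std::uint8_t tag;
    do {
      tag = static_cast<std::uint8_t>(rng_() % 255 + 1);
    } while (tag == last_tag_);
    last_tag_ = tag;
    return tag;
  }

  std::array<std::uint8_t, kArenaSize / kGranule> shadow_{};
  std::size_t next_granule_ = 0;
  std::uint8_t last_tag_ = 0;
  std::mt19937 rng_{12345};
};
```

With `a = heap.Allocate(40)` followed by a second allocation, `heap.AccessOk(a + 48)` fails because the access lands in the neighbor's granule, and `heap.AccessOk(a)` fails after `heap.Free(a, 40)`. Accesses to bytes 40 through 47 still pass in this model, since they fall inside the object's last granule; that gap is what the short granule scheme addresses.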

Implementation Highlights

HWASAN uses dynamic shadow mapping on Linux/AArch64. The shadow region is allocated with mmap based on the maximum user virtual address, and the mapping layout (HighMem, HighShadow, ShadowGap, LowShadow, LowMem) is computed at runtime.

<code>uptr MemToShadow(uptr addr) { return (addr >> 4) + __hwasan_shadow_memory_dynamic_address; }
uptr ShadowToMem(uptr shadow_addr) { return (shadow_addr - __hwasan_shadow_memory_dynamic_address) << 4; }
</code>
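The mapping can be exercised outside the runtime with a stand-in base address. In the sketch below, the `shadow_base` parameter replaces `__hwasan_shadow_memory_dynamic_address`, and the name `kShadowScale` for the shift amount is an assumption of this example. Addresses are expected to be untagged before they are mapped.

```cpp
#include <cstdint>

using uptr = std::uint64_t;

// 2^4 = 16 application bytes per shadow byte (name assumed for this sketch).
constexpr unsigned kShadowScale = 4;

// `shadow_base` stands in for __hwasan_shadow_memory_dynamic_address.
constexpr uptr MemToShadow(uptr untagged_addr, uptr shadow_base) {
  return (untagged_addr >> kShadowScale) + shadow_base;
}

// Inverse mapping: returns the first application byte of the granule that
// `shadow_addr` describes.
constexpr uptr ShadowToMem(uptr shadow_addr, uptr shadow_base) {
  return (shadow_addr - shadow_base) << kShadowScale;
}
```

All 16 bytes of a granule map to the same shadow byte, so a round trip through `MemToShadow` and `ShadowToMem` returns the granule's start address rather than the original address.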

Instrumentation inserts calls to the intrinsic llvm.hwasan.check.memaccess.shortgranules before each load/store. The intrinsic is lowered to assembly that reads the shadow tag, compares it with the pointer tag, and invokes the runtime error handler on mismatch.

<code>__hwasan_check_x0_18_short_v2:
  sbfx    x16, x0, #4, #52    // shadow offset
  ldrb    w16, [x20, x16]    // load shadow tag
  cmp     x16, x0, lsr #56   // compare tags
  b.ne    .Ltmp0            // mismatch -> handler
  ...
</code>
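For readers less familiar with AArch64 assembly, the four instructions shown above can be paraphrased in C++. This is a reading aid under stated assumptions rather than generated code: `ShadowOffset` and `FastPathMatches` are hypothetical names, `shadow_base` plays the role of `x20`, and the mismatch branch is not modeled.

```cpp
#include <cstdint>

// Equivalent of `sbfx x16, x0, #4, #52`: take bits [55:4] of the tagged
// pointer and sign-extend the 52-bit field to 64 bits.
inline std::int64_t ShadowOffset(std::uint64_t tagged_ptr) {
  constexpr std::uint64_t kFieldMask = (std::uint64_t{1} << 52) - 1;
  std::uint64_t field = (tagged_ptr >> 4) & kFieldMask;
  if (field & (std::uint64_t{1} << 51)) field |= ~kFieldMask;  // sign-extend
  return static_cast<std::int64_t>(field);
}

// Equivalent of `ldrb w16, [x20, x16]` followed by `cmp x16, x0, lsr #56`.
// `shadow_base` stands in for x20 (hypothetical parameter of this sketch).
inline bool FastPathMatches(std::uint64_t tagged_ptr,
                            const std::uint8_t* shadow_base) {
  const std::uint8_t shadow_tag = shadow_base[ShadowOffset(tagged_ptr)];
  return shadow_tag == (tagged_ptr >> 56);
}
```

Because the bitfield extract starts at bit 4 and stops below bit 56, the pointer tag never leaks into the shadow offset, so no separate untagging step is needed before the load.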

Short Granules

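When an object's size is not a multiple of 16 bytes, its last granule is only partly used and is called a "short granule". For such a granule, the shadow byte stores the number of valid bytes (1 to 15) instead of the tag, and the real tag is kept in the last byte of the granule. This allows HWASAN to detect accesses that exceed the valid portion of the last 16-byte block.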

HWASAN short granule illustration
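The short granule check can be sketched as a single function. The sketch follows the scheme described in the LLVM HWASAN design documentation, where a shadow value from 1 to 15 is the count of valid bytes and the real tag sits in the granule's last byte. The function name and parameter layout are hypothetical and chosen for readability.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kGranule = 16;

// Models the check for one access of `size` bytes that starts `offset` bytes
// into a granule. `granule` points at that granule's 16 bytes of memory.
// Hypothetical helper, not the runtime's actual function.
inline bool ShortGranuleAccessOk(std::uint8_t ptr_tag, std::uint8_t shadow,
                                 const std::uint8_t* granule,
                                 std::size_t offset, std::size_t size) {
  if (ptr_tag == shadow) return true;        // ordinary granule, tags agree
  if (shadow >= kGranule) return false;      // ordinary tag that differs
  if (offset + size > shadow) return false;  // runs past the valid bytes
  return granule[kGranule - 1] == ptr_tag;   // real tag is in the last byte
}
```

For a 20-byte object with tag `0xab`, the second granule has shadow value 4 and `0xab` in its last byte: a 4-byte access at offset 0 of that granule passes, while a 1-byte access at offset 4 is rejected.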

Open‑Source Contributions

The STE team fixed several HWASAN bugs (multithreaded race conditions, match‑all‑tag false positives, etc.) and contributed patches to the LLVM project, which have been merged into the mainline.

https://reviews.llvm.org/D147215

https://reviews.llvm.org/D147121

https://reviews.llvm.org/D149252

https://reviews.llvm.org/D148909

https://reviews.llvm.org/D149580

https://reviews.llvm.org/D149943

References

Hardware‑assisted AddressSanitizer Design Documentation: https://arxiv.org/pdf/1802.09517.pdf

HWASAN source repository: https://github.com/google/sanitizers/tree/master/hwaddress-sanitizer
