
How Android’s Linker Loads and Links Native .so Files (and What It Means for Packing)

This article explains Android's linker workflow for loading and linking native .so libraries, detailing the do_dlopen sequence, ELF parsing, memory mapping, soinfo allocation, relocation handling, and constructor calls, and concludes with a brief overview of common SO packing techniques.

Tencent TDS Service

1. Introduction

Android system security matters more every year. As with executable hardening on PCs, application hardening, and native .so protection in particular, is a crucial part of it. Native protection targets the SO files in the native layer and uses packing, anti-debugging, obfuscation, and VM-based protection to make reverse engineering harder. Security researchers and developers alike therefore need to understand the linker and its loading and linking mechanism.

This article analyses the linker's loading and linking process for SO files and briefly introduces key packing technologies.

The discussion is limited to how dlopen("libxx.so") is handled in the Android 5.0 AOSP source on the ARM platform; source snippets have been trimmed for readability.

P.S.: Readers should have a basic understanding of ELF file structure.

2. SO Loading and Linking

2.1 Overall Process

1. do_dlopen

After dlopen is called, the flow passes through dlopen_ext and reaches the main function, do_dlopen. do_dlopen calls two important functions: find_library, which continues the loading and linking, and the soinfo member function CallConstructors, which invokes the SO's initialization functions.

2. find_library_internal

find_library directly invokes find_library_internal, which first checks via find_loaded_library_by_name whether the target SO is already loaded. If it is not, find_library_internal calls load_library to continue the loading process.

3. load_library

load_library implements the whole SO loading and linking flow in three steps:

Loading: create an ElfReader object and call its Load method to map the SO into memory.

Allocating soinfo: call soinfo_alloc to allocate a new soinfo structure and fill it with the loading results.

Linking: call soinfo_link_image to complete the linking.

The remainder of this section examines the ElfReader class and soinfo_link_image in detail.
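
The control flow described in this section can be sketched as a toy model. This is not the AOSP code: toy_entry, toy_find_loaded, toy_load_library, toy_find_library, and toy_do_dlopen are hypothetical stand-ins that only mirror the order of the calls (look up an already loaded library, load it otherwise, then run constructors once).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_LIBS 8

/* Hypothetical, heavily reduced stand-in for the linker's soinfo. */
typedef struct {
    char name[64];
    int  constructors_called;
} toy_entry;

static toy_entry g_libs[TOY_MAX_LIBS];
static size_t    g_lib_count;

/* Plays the role of find_loaded_library_by_name. */
static toy_entry *toy_find_loaded(const char *name) {
    for (size_t i = 0; i < g_lib_count; ++i)
        if (strcmp(g_libs[i].name, name) == 0) return &g_libs[i];
    return NULL;
}

/* Plays the role of load_library. The real function would run
 * ElfReader::Load, soinfo_alloc, and soinfo_link_image here. */
static toy_entry *toy_load_library(const char *name) {
    if (g_lib_count == TOY_MAX_LIBS) return NULL;
    toy_entry *si = &g_libs[g_lib_count++];
    strncpy(si->name, name, sizeof(si->name) - 1);
    si->name[sizeof(si->name) - 1] = '\0';
    si->constructors_called = 0;
    return si;
}

/* Plays the role of find_library / find_library_internal. */
static toy_entry *toy_find_library(const char *name) {
    toy_entry *si = toy_find_loaded(name);
    return si != NULL ? si : toy_load_library(name);
}

/* Plays the role of do_dlopen: find or load, then run constructors once. */
toy_entry *toy_do_dlopen(const char *name) {
    assert(name != NULL);
    toy_entry *si = toy_find_library(name);
    if (si != NULL && !si->constructors_called) {
        si->constructors_called = 1; /* stands in for CallConstructors */
    }
    return si;
}
```

In this model, opening the same name twice returns the same entry and does not load the library a second time.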

2.2 Loading

In load_library, an ElfReader is constructed from the SO name and file descriptor, ElfReader elf_reader(name, fd), and then elf_reader.Load() is called.

The Load method reads the ELF header, validates it, reads the program header table, calculates the required memory size, reserves that much address space with mmap, and maps each PT_LOAD segment into memory.

2.2.1 Read & Verify ELF Header

ReadElfHeader reads the ELF header into header_ (type Elf32_Ehdr); VerifyElfHeader then validates the magic bytes, class (32/64-bit), endianness, file type, version, and target platform.
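
The checks listed above can be sketched as follows. This is a minimal illustration using the standard definitions from <elf.h>, not the linker's own code; the function name elf32_arm_header_ok is hypothetical, and the sketch assumes a 32-bit little-endian ARM shared object.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if the header looks like a 32-bit little-endian ARM
 * shared object, following the checks named in the text. */
bool elf32_arm_header_ok(const Elf32_Ehdr *h) {
    assert(h != NULL);
    if (memcmp(h->e_ident, ELFMAG, SELFMAG) != 0) return false; /* magic bytes */
    if (h->e_ident[EI_CLASS] != ELFCLASS32)       return false; /* 32-bit class */
    if (h->e_ident[EI_DATA] != ELFDATA2LSB)       return false; /* little-endian */
    if (h->e_type != ET_DYN)                      return false; /* shared object */
    if (h->e_version != EV_CURRENT)               return false; /* ELF version */
    if (h->e_machine != EM_ARM)                   return false; /* target platform */
    return true;
}
```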

2.2.2 Read Program Header

The program header table is temporarily mapped for parsing and released after the SO is fully loaded.

2.2.3 Reserve Space & Compute Load Size

phdr_table_get_load_size iterates over the PT_LOAD segments to find the minimum virtual address and the maximum end address, aligns them to page boundaries, and computes load_size as their difference. The loader then reserves a region of this size with an anonymous PROT_NONE mmap.

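About load_bias: an SO may request a load address, and the address at which the reserved region actually starts (load_start) need not match it. load_bias is the difference between load_start and the page-aligned minimum virtual address (min_vaddr), so the runtime address of any item in the SO is its p_vaddr plus load_bias. For ordinary SOs, min_vaddr = 0 and therefore load_bias = load_start, which is treated as the base address.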
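
The size and bias computation can be sketched as below. This is a simplified illustration, assuming a 4096-byte page (TOY_PAGE_SIZE) and 32-bit addresses held in uint32_t; toy_load_size and toy_load_bias are hypothetical names, not the linker's functions.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE     4096u /* assumed page size */
#define TOY_PAGE_START(x) ((x) & ~(TOY_PAGE_SIZE - 1u))
#define TOY_PAGE_END(x)   TOY_PAGE_START((x) + (TOY_PAGE_SIZE - 1u))

/* Returns the page-aligned span covered by all PT_LOAD segments and
 * stores the page-aligned minimum virtual address in *out_min_vaddr. */
uint32_t toy_load_size(const Elf32_Phdr *phdr, size_t count,
                       uint32_t *out_min_vaddr) {
    assert(phdr != NULL || count == 0);
    uint32_t min_vaddr = UINT32_MAX;
    uint32_t max_vaddr = 0;
    int found = 0;

    for (size_t i = 0; i < count; ++i) {
        if (phdr[i].p_type != PT_LOAD) continue;
        found = 1;
        if (phdr[i].p_vaddr < min_vaddr) min_vaddr = phdr[i].p_vaddr;
        uint32_t end = phdr[i].p_vaddr + phdr[i].p_memsz;
        if (end > max_vaddr) max_vaddr = end;
    }
    if (!found) min_vaddr = 0;

    min_vaddr = TOY_PAGE_START(min_vaddr);
    max_vaddr = TOY_PAGE_END(max_vaddr);
    if (out_min_vaddr != NULL) *out_min_vaddr = min_vaddr;
    return max_vaddr - min_vaddr;
}

/* load_bias is the offset to add to every p_vaddr at run time. */
uint32_t toy_load_bias(uint32_t load_start, uint32_t min_vaddr) {
    return load_start - min_vaddr;
}
```

For an SO whose first segment starts at virtual address 0, toy_load_bias simply returns load_start, which matches the "ordinary SO" case described above.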

2.2.4 Load Segments

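For each PT_LOAD segment, the loader:

Computes seg_start and seg_end from the segment's virtual address and load_bias, then aligns them to page boundaries.

Computes the page-aligned start offset of the segment in the file and the length to map.

Calls mmap with the calculated address, length, and file offset to map the segment at that fixed address.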
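
The address arithmetic in these steps can be sketched as below. The sketch only computes the arguments that would be handed to mmap and does not map anything; toy_seg_map and toy_segment_mapping are hypothetical names, and a 4096-byte page is assumed.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE     4096u /* assumed page size */
#define TOY_PAGE_START(x) ((x) & ~(TOY_PAGE_SIZE - 1u))

/* Arguments a loader would pass to mmap for one PT_LOAD segment. */
typedef struct {
    uint32_t map_addr;    /* page-aligned target address */
    uint32_t map_len;     /* bytes to map from the file */
    uint32_t file_offset; /* page-aligned offset into the file */
} toy_seg_map;

toy_seg_map toy_segment_mapping(const Elf32_Phdr *ph, uint32_t load_bias) {
    assert(ph != NULL && ph->p_type == PT_LOAD);

    /* Step 1: where the segment lives in memory, rounded down to a page. */
    uint32_t seg_start      = ph->p_vaddr + load_bias;
    uint32_t seg_page_start = TOY_PAGE_START(seg_start);

    /* Step 2: the matching page-aligned range in the file. */
    uint32_t file_start      = ph->p_offset;
    uint32_t file_end        = file_start + ph->p_filesz;
    uint32_t file_page_start = TOY_PAGE_START(file_start);

    /* Step 3: these three values are what mmap would receive. */
    toy_seg_map m = { seg_page_start, file_end - file_page_start,
                      file_page_start };
    return m;
}
```

Because p_vaddr and p_offset of a loadable segment are congruent modulo the page size, rounding both down keeps the file contents aligned with their virtual addresses.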

2.3 Allocating soinfo

After loading, load_library calls soinfo_alloc to allocate a soinfo structure for the SO. The soinfo holds loading, symbol, relocation, and initialization information used later by the linker and at runtime.

Key fields used during loading/linking include phdr, phnum, base, size, symbol tables, relocation tables, and init/fini arrays.

2.4 Linking

Linking is performed by soinfo_link_image in four steps:

Locate the dynamic section via phdr_table_get_dynamic_section.

Parse the dynamic section (an array of Elf32_Dyn entries) to obtain symbol, relocation, and init/fini information.

Load the dependent SOs listed in the DT_NEEDED entries by calling find_library.

Perform relocation, the most complex part, by fixing up references to imported symbols.
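
The dynamic-section parsing step can be sketched as a loop over Elf32_Dyn entries. This is a reduced illustration, not the linker's code: toy_dyn_info and toy_parse_dynamic are hypothetical names, only a handful of tags are handled, and the addresses are stored as found in the file, whereas the real linker adds the load address before using them.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical summary of the fields a linker pulls out of .dynamic. */
typedef struct {
    uint32_t strtab, symtab, hash;
    uint32_t rel, rel_count;
    uint32_t plt_rel, plt_rel_count;
    uint32_t init_func, init_array, init_array_count;
    size_t   needed_count;
} toy_dyn_info;

/* Walks the array until the DT_NULL terminator. */
void toy_parse_dynamic(const Elf32_Dyn *d, toy_dyn_info *out) {
    assert(d != NULL && out != NULL);
    memset(out, 0, sizeof *out);
    for (; d->d_tag != DT_NULL; ++d) {
        switch (d->d_tag) {
        case DT_STRTAB:   out->strtab = d->d_un.d_ptr; break;
        case DT_SYMTAB:   out->symtab = d->d_un.d_ptr; break;
        case DT_HASH:     out->hash = d->d_un.d_ptr; break;
        case DT_REL:      out->rel = d->d_un.d_ptr; break;
        case DT_RELSZ:    out->rel_count = d->d_un.d_val / sizeof(Elf32_Rel); break;
        case DT_JMPREL:   out->plt_rel = d->d_un.d_ptr; break;
        case DT_PLTRELSZ: out->plt_rel_count = d->d_un.d_val / sizeof(Elf32_Rel); break;
        case DT_INIT:     out->init_func = d->d_un.d_ptr; break;
        case DT_INIT_ARRAY:
            out->init_array = d->d_un.d_ptr; break;
        case DT_INIT_ARRAYSZ:
            out->init_array_count = d->d_un.d_val / sizeof(Elf32_Addr); break;
        case DT_NEEDED:   out->needed_count++; break; /* one per dependency */
        default: break;   /* tags not needed for this sketch */
        }
    }
}
```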

2.4.1 Relocation

On ARM, the linker processes two relocation tables: plt_rel (relocations for PLT entries) and rel. Both are handled by soinfo_relocate, which iterates over the relocation entries and, for each one, determines the type, symbol index, and target address, looks up the imported symbol if needed, and patches the target address accordingly.
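
A relocation loop of this shape can be sketched as below. It is a simplified model rather than soinfo_relocate itself: the image is a plain byte buffer, r_offset is treated as an offset into that buffer (which assumes min_vaddr = 0), only four common ARM relocation types are handled, and toy_relocate, toy_sym_lookup, and toy_fixed_lookup are hypothetical names. The lookup callback stands in for the linker's symbol search across loaded libraries.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Resolves a symbol index to an absolute address (stand-in for the
 * linker's symbol lookup). */
typedef uint32_t (*toy_sym_lookup)(uint32_t sym_index);

/* Dummy resolver for demonstration: invents an address per symbol. */
uint32_t toy_fixed_lookup(uint32_t sym_index) {
    return 0x40000000u + sym_index * 0x10u;
}

/* Applies `count` REL-style relocations to `image`. Returns 0 on
 * success, -1 on an out-of-range offset or an unhandled type. */
int toy_relocate(uint8_t *image, uint32_t image_size, uint32_t load_bias,
                 const Elf32_Rel *rel, size_t count, toy_sym_lookup lookup) {
    assert(image != NULL && lookup != NULL);
    for (size_t i = 0; i < count; ++i) {
        uint32_t type = ELF32_R_TYPE(rel[i].r_info);
        uint32_t sym  = ELF32_R_SYM(rel[i].r_info);
        uint32_t off  = rel[i].r_offset;
        if (image_size < 4 || off > image_size - 4) return -1;

        uint32_t word;
        memcpy(&word, image + off, sizeof word); /* current value (addend) */
        switch (type) {
        case R_ARM_JUMP_SLOT:                    /* PLT slot */
        case R_ARM_GLOB_DAT:                     /* GOT entry */
            word = lookup(sym);
            break;
        case R_ARM_ABS32:                        /* symbol + addend */
            word += lookup(sym);
            break;
        case R_ARM_RELATIVE:                     /* load address + addend */
            word += load_bias;
            break;
        default:
            return -1;
        }
        memcpy(image + off, &word, sizeof word);
    }
    return 0;
}
```

The same function could be called once for the plt_rel table and once for the rel table, mirroring the two passes described above.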

2.5 CallConstructors

When compiling an SO, the -init linker option or the __attribute__((constructor)) attribute can designate initialization functions. These functions are invoked after the SO is loaded and linked, before the dlopen call returns.
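
As a small illustration of the attribute form, the following sketch marks a function as a constructor using the GCC/Clang extension. The names g_initialized, my_init, and toy_is_initialized are hypothetical. When the code is built into a shared library, my_init is recorded in the library's initialization array and runs during loading; when built into an executable, it runs before main.

```c
#include <assert.h>

static int g_initialized;

/* Runs automatically at load time, before any caller can use the
 * library's exported functions. */
__attribute__((constructor))
static void my_init(void) {
    g_initialized = 1;
}

/* Lets callers observe that the constructor has already run. */
int toy_is_initialized(void) {
    assert(g_initialized == 0 || g_initialized == 1);
    return g_initialized;
}
```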

After do_dlopen obtains the soinfo of the newly loaded SO, it finally calls CallConstructors, which recursively invokes constructors of dependent SOs and then runs the SO’s own init functions and init_array entries.
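
The call order described above can be sketched as a recursive function. This is a toy model, not the linker's implementation: toy_lib, toy_call_constructors, and the logging helpers are hypothetical, and dependencies are represented as a plain pointer array. The flag is set before any function is invoked, so a library is initialized at most once even if the dependency graph contains a cycle.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*toy_init_fn)(void);

/* Hypothetical reduced view of a loaded library. */
typedef struct toy_lib {
    struct toy_lib **needed;      /* dependent libraries */
    size_t           needed_count;
    toy_init_fn      init_func;   /* DT_INIT */
    toy_init_fn     *init_array;  /* DT_INIT_ARRAY */
    size_t           init_array_count;
    int              constructors_called;
} toy_lib;

/* Tiny log so the call order can be inspected. */
static char   toy_log[16];
static size_t toy_log_len;
static void toy_record(char c) {
    if (toy_log_len < sizeof toy_log - 1) toy_log[toy_log_len++] = c;
}
static void dep_init(void)  { toy_record('d'); } /* dependency's init */
static void main_init(void) { toy_record('i'); } /* library's DT_INIT */
static void main_ctor(void) { toy_record('a'); } /* init_array entry */

void toy_call_constructors(toy_lib *lib) {
    assert(lib != NULL);
    if (lib->constructors_called) return;
    lib->constructors_called = 1;               /* guard against re-entry */

    for (size_t i = 0; i < lib->needed_count; ++i)
        toy_call_constructors(lib->needed[i]);  /* dependencies first */

    if (lib->init_func != NULL) lib->init_func();
    for (size_t i = 0; i < lib->init_array_count; ++i)
        if (lib->init_array[i] != NULL) lib->init_array[i]();
}
```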

3. Packing Techniques

In the malware and DRM fields, “packing” (or “shelling”) is used to compress and encrypt code, often combined with virtualization, obfuscation, and anti‑debugging to hinder static and dynamic analysis.

For Android native libraries, packing targets the SO file. The typical architecture consists of three components:

SO: the protected target library.

Loader: a small SO that is loaded first, restores the encrypted/compressed SO in memory, loads it, and performs linking so that the protected SO can be used.

Packing tool: creates the encrypted/compressed payload and merges it with the loader to produce a packed SO.

3.1 Loader Execution Timing

The loader must run before the protected SO is used. This can be achieved via the SO’s init/init_array functions or via JNI_OnLoad.

3.2 Loader Performs Loading and Linking

After restoring the SO in memory, the loader repeats the linker's loading and linking steps, with the main difference being that it reads from memory instead of a file descriptor.

3.2.1 Loading

The loader follows the same two‑step process as the linker for PT_LOAD segments, adjusting for the fact that the source data resides in memory.
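
Loading from a memory buffer instead of a file descriptor can be sketched as below. This is an illustrative sketch under several assumptions, not any particular packer's code: the restored ELF image is already decrypted in a buffer, min_vaddr is 0 so the returned base doubles as load_bias, and the region is left readable and writable (a real loader would apply per-segment protections afterwards). toy_load_from_memory is a hypothetical name.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* exposes MAP_ANONYMOUS under strict -std modes */
#endif
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Reserves load_size bytes and copies every PT_LOAD segment from the
 * in-memory ELF image into place. Returns the base, or NULL on error. */
uint8_t *toy_load_from_memory(const uint8_t *elf, size_t elf_size,
                              const Elf32_Phdr *phdr, size_t count,
                              size_t load_size) {
    assert(elf != NULL && phdr != NULL);

    /* The linker maps from a file; here an anonymous region is used. */
    uint8_t *base = mmap(NULL, load_size, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) return NULL;

    for (size_t i = 0; i < count; ++i) {
        const Elf32_Phdr *ph = &phdr[i];
        if (ph->p_type != PT_LOAD) continue;

        /* Reject segments that fall outside the source or the target. */
        if (ph->p_offset > elf_size || ph->p_filesz > elf_size - ph->p_offset ||
            ph->p_filesz > ph->p_memsz ||
            ph->p_vaddr > load_size || ph->p_memsz > load_size - ph->p_vaddr) {
            munmap(base, load_size);
            return NULL;
        }
        /* memcpy replaces the file-backed mmap of the system linker. */
        memcpy(base + ph->p_vaddr, elf + ph->p_offset, ph->p_filesz);
        /* Bytes between p_filesz and p_memsz (.bss) are already zero
         * because the anonymous mapping starts out zero-filled. */
    }
    return base;
}
```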

3.2.2 Allocating soinfo

The loader can reuse the linker's soinfo structure definition to hold intermediate information in its own instance, then copy the relevant fields into the soinfo maintained by the linker once loading is complete.

3.2.3 Linking

Linking is identical to the linker's process; after linking, the loader must also invoke the SO’s init functions.

3.3 soinfo Repair

Because the system’s linker maintains a soinfo for the loader, the loader must patch this structure with the real SO’s information (base, size, load_bias, symbol tables, bucket/chains, ARM exception tables, etc.) so that subsequent dlsym lookups work correctly.
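
The repair step amounts to copying a set of fields from the loader's private record into the record the system linker holds. The sketch below shows the idea with a hypothetical, reduced structure: toy_soinfo_fields is not layout-compatible with the real soinfo, and a real loader would have to use the exact structure definition of the target Android version. The field names follow those listed in the text.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of soinfo; NOT the real layout. */
typedef struct {
    char             name[16];   /* identity the linker knows; left alone */
    uint32_t         base, size, load_bias;
    const char      *strtab;
    const Elf32_Sym *symtab;
    uint32_t         nbucket, nchain;
    const uint32_t  *bucket, *chain;
    const uint32_t  *ARM_exidx;
    uint32_t         ARM_exidx_count;
} toy_soinfo_fields;

/* Overwrites the linker-side record with the protected SO's values so
 * that later symbol lookups search the real library. */
void toy_repair_soinfo(toy_soinfo_fields *linker_si,
                       const toy_soinfo_fields *real_si) {
    assert(linker_si != NULL && real_si != NULL);
    linker_si->base            = real_si->base;
    linker_si->size            = real_si->size;
    linker_si->load_bias       = real_si->load_bias;
    linker_si->strtab          = real_si->strtab;
    linker_si->symtab          = real_si->symtab;
    linker_si->nbucket         = real_si->nbucket;
    linker_si->nchain          = real_si->nchain;
    linker_si->bucket          = real_si->bucket;
    linker_si->chain           = real_si->chain;
    linker_si->ARM_exidx       = real_si->ARM_exidx;
    linker_si->ARM_exidx_count = real_si->ARM_exidx_count;
}
```

Copying field by field, rather than overwriting the whole structure, leaves the linker's own bookkeeping (represented here by name) untouched.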
