
Why Memory Alignment Matters in Linux: Boost Performance and Prevent Bugs

This article explains the concept of memory alignment in Linux, how alignment rules affect struct layout and CPU access, why proper alignment improves performance and portability, and provides practical compiler directives and code examples for implementing and testing aligned data structures.

Deepin Linux

Memory alignment is not unique to Linux, but it plays a crucial role in Linux programming by organizing how data is stored in memory. Proper alignment can greatly improve performance, while misalignment can slow programs and cause hard‑to‑debug errors.

1. Introduction to Memory Alignment

1.1 What Is Memory Alignment?

Modern computers divide memory into bytes, but many CPUs prefer to read/write data at addresses that are multiples of the data size (e.g., 4‑byte or 8‑byte boundaries). Aligning data ensures that its address matches the CPU’s preferred boundaries, reducing the number of memory accesses required.

struct Data {
    char a;
    int b;
    short c;
};

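Although the members add up to only 7 bytes, sizeof(struct Data) is typically 12 on platforms with a 4-byte int. The compiler inserts 3 padding bytes after a so that b starts on a 4-byte boundary, and 2 more after c so that the total size is a multiple of 4.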
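To see the padding directly, a minimal sketch like the following prints each member's offset with offsetof. The function name print_data_layout is only illustrative, and the exact numbers depend on the target ABI.

```c
#include <stddef.h>
#include <stdio.h>

struct Data {
    char a;
    int b;
    short c;
};

// Prints where the compiler placed each member and the padded total size.
void print_data_layout(void) {
    printf("offsetof(a) = %zu\n", offsetof(struct Data, a));
    printf("offsetof(b) = %zu\n", offsetof(struct Data, b));
    printf("offsetof(c) = %zu\n", offsetof(struct Data, c));
    printf("sizeof(struct Data) = %zu\n", sizeof(struct Data));
}
```

On x86-64 Linux with GCC this is expected to report offsets 0, 4, and 8 and a size of 12.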

1.2 Why Is Memory Alignment Needed?

(1) Platform Compatibility

Some architectures (for example SPARC, and ARM on older cores or with certain instructions) fault when data is accessed at an unaligned address. The result is a crash (typically SIGBUS on Linux) or a slow trap-and-emulate path. Aligning data ensures code runs correctly across different hardware platforms.
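When data may legitimately sit at an unaligned address (a field inside a network packet, for instance), one portable approach is to copy it out with memcpy instead of dereferencing a cast pointer. The sketch below assumes a hypothetical helper name, load_u32.

```c
#include <stdint.h>
#include <string.h>

// Reads a 32-bit value from any address, aligned or not.
// memcpy has no alignment requirement, so this avoids the undefined
// behavior of dereferencing a misaligned uint32_t pointer.
uint32_t load_u32(const void *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Compilers usually reduce this to a single load on targets that tolerate unaligned access and to byte-wise loads on targets that do not.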

(2) Performance Optimization

When a 4-byte integer is aligned on a 4-byte boundary, the CPU can fetch it in a single memory access. If it is misaligned and straddles a word or cache-line boundary, the hardware must perform two accesses and combine the results, which increases latency and reduces cache efficiency.
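The cost of a misaligned access can be estimated by counting how many aligned blocks it touches. The helper below is an illustrative sketch (blocks_touched is a made-up name), and it assumes word is a power of two and size is at least 1.

```c
#include <stddef.h>
#include <stdint.h>

// Number of aligned `word`-byte blocks touched by an access of `size`
// bytes starting at `addr`. Assumes word is a power of two and size > 0.
size_t blocks_touched(uintptr_t addr, size_t size, size_t word) {
    uintptr_t mask = ~(uintptr_t)(word - 1);
    uintptr_t first = addr & mask;              // block holding the first byte
    uintptr_t last = (addr + size - 1) & mask;  // block holding the last byte
    return (size_t)((last - first) / word) + 1;
}
```

For a 4-byte access with 4-byte blocks, address 0x1000 touches one block while address 0x1002 touches two. Passing 64 as word gives the same answer for cache lines.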

2. Linux Memory Alignment Rules

Alignment on Linux is defined by the platform ABI and implemented by the compiler (GCC or Clang). The rules for basic types and structs are simple and predictable.

2.1 Alignment Rules for Basic Types

On x86-64 Linux, char aligns to 1 byte, int to 4 bytes, and double to 8 bytes (the 32-bit x86 ABI aligns a double inside a struct to only 4 bytes). Example:

char ch = 'a';
int num = 100;
double d = 3.14;

Each variable is placed at an address that is a multiple of its alignment size.
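The alignment of a type can be queried with the C11 _Alignof operator rather than assumed. The following sketch prints the values for the three variables above and checks their addresses; basic_types_are_aligned and is_aligned are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// Returns 1 if the address p is a multiple of align.
static int is_aligned(const void *p, size_t align) {
    return (uintptr_t)p % align == 0;
}

// Prints size and alignment of each type, then checks the variables.
int basic_types_are_aligned(void) {
    char ch = 'a';
    int num = 100;
    double d = 3.14;
    printf("char:   size %zu, align %zu\n", sizeof ch, _Alignof(char));
    printf("int:    size %zu, align %zu\n", sizeof num, _Alignof(int));
    printf("double: size %zu, align %zu\n", sizeof d, _Alignof(double));
    return is_aligned(&ch, _Alignof(char)) &&
           is_aligned(&num, _Alignof(int)) &&
           is_aligned(&d, _Alignof(double));
}
```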

2.2 Alignment Rules for Structs

(1) Member Offsets

The first member starts at offset 0. Subsequent members start at the smallest offset that is a multiple of their own alignment size, inserting padding as needed.

(2) Total Struct Size

The overall size of a struct must be a multiple of the largest member’s alignment. This ensures that arrays of the struct maintain proper alignment for each element.

(3) Example Analysis

struct Example {
    char c;
    int i;
    double d;
};

Here, c occupies 1 byte, three padding bytes are added so i starts at offset 4, and d starts at offset 8, making the total size 16 bytes.
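The two rules (round each offset up to the member's alignment, then round the total up to the largest alignment) can be sketched as a small calculator. The struct member descriptor and the function names are assumptions made for illustration; a real compiler also handles bit-fields, packing, and nested types.

```c
#include <stddef.h>

// Hypothetical descriptor for one struct member.
struct member {
    size_t size;
    size_t align;
};

// Rounds x up to the next multiple of a (a must be non-zero).
static size_t round_up(size_t x, size_t a) {
    return (x + a - 1) / a * a;
}

// Fills offsets[0..n-1] and returns the padded struct size.
size_t compute_layout(const struct member *m, size_t n, size_t *offsets) {
    size_t off = 0;
    size_t max_align = 1;
    for (size_t k = 0; k < n; k++) {
        off = round_up(off, m[k].align);  // rule 1: pad up to member alignment
        offsets[k] = off;
        off += m[k].size;
        if (m[k].align > max_align) max_align = m[k].align;
    }
    return round_up(off, max_align);      // rule 2: pad total to largest alignment
}
```

For members of size 1, 4, and 8 with matching alignments (the char, int, and double of struct Example on x86-64) it yields offsets 0, 4, and 8 and a total of 16, in line with the analysis above.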

2.3 Impact of Alignment on Code Performance

(1) Theoretical Analysis

Aligned data can be fetched in a single memory‑controller transaction, while misaligned data may span two transactions, requiring extra cycles and merging logic.

(2) Practical Test

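The following C program compares the access time of a packed struct, whose int and double members sit at misaligned offsets, with a naturally aligned struct declared with __attribute__((aligned(8))). Both objects are volatile so that the compiler cannot optimize the loops away:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

// Unaligned struct: packed removes padding, so i is at offset 1 and d at offset 5
struct __attribute__((packed)) UnalignedStruct {
    char c;
    int i;
    double d;
};

// Aligned struct: members at offsets 0, 4, 8; the whole struct on an 8-byte boundary
struct __attribute__((aligned(8))) AlignedStruct {
    char c;
    int i;
    double d;
};

void testUnaligned(void) {
    // volatile forces a real memory access on every iteration
    volatile struct UnalignedStruct us = {'a', 100, 3.14};
    double sum = 0.0;
    clock_t start = clock();
    for (int i = 0; i < 100000000; i++) {
        sum += us.c + us.i + us.d;
    }
    double time_spent = (double)(clock() - start) / CLOCKS_PER_SEC;
    printf("Unaligned struct time: %f seconds (sum=%f)\n", time_spent, sum);
}

void testAligned(void) {
    volatile struct AlignedStruct as = {'a', 100, 3.14};
    double sum = 0.0;
    clock_t start = clock();
    for (int i = 0; i < 100000000; i++) {
        sum += as.c + as.i + as.d;
    }
    double time_spent = (double)(clock() - start) / CLOCKS_PER_SEC;
    printf("Aligned struct time: %f seconds (sum=%f)\n", time_spent, sum);
}

int main(void) {
    testUnaligned();
    testAligned();
    return 0;
}

Results depend on the CPU, compiler, and optimization level. On recent x86 processors such as an Intel i7 the gap may be small, because the hardware handles unaligned accesses within a cache line at little or no cost. The penalty grows when an access crosses a cache-line boundary, and on architectures that do not support unaligned access the packed version must fall back to byte-wise loads or a kernel trap.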

3. Implementing Memory Alignment in Code

3.1 Compiler Directives

Linux developers can use #pragma pack(n) or __attribute__((aligned(n))) to control alignment.

#pragma pack(4)
struct Example {
    char c;
    int i;
    double d;
};
#pragma pack()

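#pragma pack(4) caps the alignment of every member at 4 bytes. Members with a smaller natural alignment, such as char, are unaffected, while the double now only needs a 4-byte boundary. In this struct the offsets stay at 0, 4, and 8, but the alignment of the struct as a whole drops from 8 to 4.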
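The effect of #pragma pack is easier to see on a struct where the cap actually moves a member. The following sketch compares a char/double pair with and without pack(4); the struct names and print_pack_effect are illustrative, and the push/pop form is used so the setting does not leak into later declarations.

```c
#include <stddef.h>
#include <stdio.h>

// Natural layout: d is placed on an 8-byte boundary on x86-64.
struct NaturalPair {
    char c;
    double d;
};

// Same members, but member alignment is capped at 4 bytes.
#pragma pack(push, 4)
struct PackedPair {
    char c;
    double d;
};
#pragma pack(pop)

void print_pack_effect(void) {
    printf("natural: offsetof(d)=%zu sizeof=%zu align=%zu\n",
           offsetof(struct NaturalPair, d), sizeof(struct NaturalPair),
           _Alignof(struct NaturalPair));
    printf("pack(4): offsetof(d)=%zu sizeof=%zu align=%zu\n",
           offsetof(struct PackedPair, d), sizeof(struct PackedPair),
           _Alignof(struct PackedPair));
}
```

On x86-64 the natural version is expected to place d at offset 8 with a size of 16, while the packed version places it at offset 4 with a size of 12.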

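__attribute__((aligned(n))) works in the opposite direction: it raises the minimum alignment of the whole struct to n bytes and leaves the member offsets unchanged.

struct __attribute__((aligned(8))) AlignedExample {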
    char c;
    int i;
    double d;
};
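Because the attribute can only raise the alignment of a struct (lowering it requires packed as well), a value larger than the natural alignment also grows sizeof, since the size must stay a multiple of the alignment. The sketch below checks this at compile time; WideAligned and wide_aligned_size are hypothetical names.

```c
#include <stddef.h>

struct __attribute__((aligned(8))) AlignedExample {
    char c;
    int i;
    double d;
};

// Same members, but forced onto a 32-byte boundary.
struct __attribute__((aligned(32))) WideAligned {
    char c;
    int i;
    double d;
};

// Compile-time checks: the build fails if an assumption does not hold.
_Static_assert(_Alignof(struct AlignedExample) >= 8, "at least 8-byte aligned");
_Static_assert(_Alignof(struct WideAligned) == 32, "32-byte aligned");
_Static_assert(sizeof(struct WideAligned) % 32 == 0, "size padded to alignment");

// Returns the padded size so callers can inspect it at run time.
size_t wide_aligned_size(void) {
    return sizeof(struct WideAligned);
}
```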

3.2 Optimization Techniques

(1) Order Members Wisely

Group members of the same size together instead of interleaving small and large ones, so the compiler needs fewer padding gaps. Example:

struct S1 { char c1; int i; char c2; };
struct S2 { char c1; char c2; int i; };

S2 occupies 8 bytes versus 12 bytes for S1.
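A quick way to confirm the saving is to let the compiler report both sizes. This is a small sketch with an illustrative function name, padding_saved; the figures in the text assume a 4-byte int.

```c
#include <stddef.h>
#include <stdio.h>

struct S1 { char c1; int i; char c2; };  // padding after c1 and after c2
struct S2 { char c1; char c2; int i; };  // one padding gap before i

// Prints both sizes and returns the number of bytes saved by reordering.
size_t padding_saved(void) {
    printf("sizeof(S1) = %zu, sizeof(S2) = %zu\n",
           sizeof(struct S1), sizeof(struct S2));
    return sizeof(struct S1) - sizeof(struct S2);
}
```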

(2) Use Alignment Macros

Linux kernel code often uses macros such as ALIGN to round a value up to a given alignment. The _ALIGN macro below follows the same pattern and requires size to be a power of two:

#define _ALIGN(addr, size) (((addr) + (size) - 1) & ~((size) - 1))

Example: _ALIGN(10, 8) yields 16.
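The same bit trick gives rounding down and an alignment test. The functions below are a sketch with hypothetical names (align_up, align_down, is_aligned_to) written as inline functions instead of macros so the arguments are evaluated once; all of them assume the alignment is a power of two.

```c
#include <stdint.h>

// Rounds addr up to the next multiple of a (a must be a power of two).
static inline uintptr_t align_up(uintptr_t addr, uintptr_t a) {
    return (addr + a - 1) & ~(a - 1);
}

// Rounds addr down to the previous multiple of a.
static inline uintptr_t align_down(uintptr_t addr, uintptr_t a) {
    return addr & ~(a - 1);
}

// Returns 1 if addr is already a multiple of a.
static inline int is_aligned_to(uintptr_t addr, uintptr_t a) {
    return (addr & (a - 1)) == 0;
}
```

With these, align_up(10, 8) is 16, align_down(10, 8) is 8, and a value that is already aligned, such as 16, is returned unchanged by both.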

Custom allocation functions can apply this macro to ensure returned pointers meet alignment requirements while storing the original pointer for proper deallocation.

3.3 Sample Aligned Allocation Functions

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

// Round ptr up to the next multiple of alignment (a power of two)
static void* align_address(void* ptr, size_t alignment) {
    uintptr_t addr = (uintptr_t)ptr;
    size_t offset = (alignment - (addr % alignment)) % alignment;
    return (void*)(addr + offset);
}

void* allocate_memory(size_t size, size_t alignment) {
    // Reject 0 explicitly: it passes the bit test but is not a power of two
    if (alignment == 0 || (alignment & (alignment - 1)) != 0) {
        fprintf(stderr, "Alignment must be a power of two\n");
        return NULL;
    }
    // Keep the slot that stores the raw pointer itself pointer-aligned
    if (alignment < sizeof(void*)) alignment = sizeof(void*);
    size_t extra = alignment + sizeof(void*);
    if (size > SIZE_MAX - extra) return NULL;  // size + extra would overflow
    void* raw = malloc(size + extra);
    if (!raw) return NULL;
    // Leave room for the saved pointer, then round up to the alignment
    void* aligned = align_address((char*)raw + sizeof(void*), alignment);
    // Store the original malloc pointer just before the aligned block
    *((void**)((char*)aligned - sizeof(void*))) = raw;
    return aligned;
}

void deallocate_memory(void* aligned) {
    if (!aligned) return;
    // Recover the pointer that malloc returned and free that
    void* raw = *((void**)((char*)aligned - sizeof(void*)));
    free(raw);
}

int main(void) {
    size_t sizes[] = {10, 100, 512};
    size_t aligns[] = {4, 8, 16, 32};
    for (size_t s = 0; s < sizeof(sizes)/sizeof(sizes[0]); s++) {
        for (size_t a = 0; a < sizeof(aligns)/sizeof(aligns[0]); a++) {
            void* mem = allocate_memory(sizes[s], aligns[a]);
            if (!mem) continue;
            printf("Allocated size=%zu align=%zu address=%p %s\n",
                   sizes[s], aligns[a], mem,
                   ((uintptr_t)mem % aligns[a]) == 0 ? "aligned" : "misaligned");
            deallocate_memory(mem);
        }
    }
    return 0;
}

These functions allocate extra space, compute an aligned address, store the original pointer before the aligned block, and correctly free the memory later.
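Where a C11 library is available, the standard aligned_alloc function (or POSIX posix_memalign) can replace a hand-written allocator, and the result is released with plain free. The wrapper below is a minimal sketch; alloc_aligned is a hypothetical name, and it rounds the size up because C11 originally required the size to be a multiple of the alignment.

```c
#include <stdint.h>
#include <stdlib.h>

// Returns size bytes aligned to `alignment`, or NULL on failure.
// alignment must be a power of two; release the block with free().
void *alloc_aligned(size_t size, size_t alignment) {
    if (alignment == 0 || (alignment & (alignment - 1)) != 0) {
        return NULL;  // not a power of two
    }
    // Round the size up to a multiple of the alignment.
    size_t rounded = (size + alignment - 1) & ~(alignment - 1);
    if (rounded < size) {
        return NULL;  // rounding overflowed
    }
    if (rounded == 0) {
        rounded = alignment;  // avoid a zero-byte request
    }
    return aligned_alloc(alignment, rounded);
}
```

A call such as alloc_aligned(100, 64) returns a block that starts on a 64-byte boundary, which is a typical cache-line size on x86-64.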

By following alignment rules, using compiler directives, ordering struct members, and employing alignment utilities, developers can write portable, high‑performance Linux code that avoids hardware faults and maximizes cache efficiency.
