
Master Core Dumps: From Generation to Debugging with GDB on Linux

This article explains what a Core Dump is, how Linux generates the ELF‑based snapshot when a program crashes, common causes such as memory errors or signal mishandling, essential system configurations, and step‑by‑step GDB techniques for analyzing and fixing the underlying bugs.

Deepin Linux

Ever had a program that runs fine locally but crashes in production with nothing more than a vague "Segmentation fault"? The key to diagnosing such crashes is the Core Dump: a snapshot of the program’s memory, registers, and stack at the moment of failure, which lets you locate the bug precisely instead of guessing.

1. What is a Core Dump?

1.1 Core Dump file introduction

On Linux, a Core Dump (core file) is created when a process receives a fatal signal (e.g., SIGSEGV, SIGABRT) and the kernel writes the process’s address space and state to disk. The file acts like a "snapshot" of the crash, essential for post‑mortem debugging.

The Core Dump file consists of four parts: ELF header, program header table, NOTE segment, and LOAD segment.

1.2 Kernel view of Core Dump generation

The generation process involves four steps:

Step 1: A hardware exception (for example, an invalid memory access) or an explicit call such as abort() causes the kernel to send the process a fatal signal.

Step 2: When the signal is delivered and its action is the default "dump core", the kernel stops all threads of the process and begins the dump.

Step 3: The Core Dump file is written, containing the process’s mapped memory, CPU registers, per-thread state, and signal details.

Step 4: The kernel terminates the process and releases all of its resources.

(1) Signal handling phase: do_signal

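Before returning to user space, the kernel checks for pending signals and calls do_signal to process them. The listing below is a simplified sketch in the style of older 2.6-era kernels; current kernels are structured differently.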

static void fastcall do_signal(struct pt_regs *regs) {
    siginfo_t info;
    int signr;
    struct k_sigaction ka;
    sigset_t *oldset;

    // Get the signal to deliver
    signr = get_signal_to_deliver(&info, &ka, regs, NULL);
    if (signr > 0) {
        // Signal‑specific handling, possibly generating a Core Dump
    }
}

If the signal requires a Core Dump, the kernel prepares the dump.

(2) Signal acquisition phase: get_signal_to_deliver

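This function dequeues a pending signal from the process’s queues and decides whether a Core Dump should be generated. The listing is simplified pseudocode rather than verbatim kernel source: helpers such as find_signal_queue, should_generate_coredump, and prepare_coredump are illustrative names, and in the real kernel the dump itself has traditionally been performed by do_coredump().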

int get_signal_to_deliver(siginfo_t *info, struct k_sigaction *return_ka,
                         struct pt_regs *regs, void *cookie) {
    sigset_t *mask = &current->blocked;
    int signr = 0;

    while ((signr = dequeue_signal(mask, &current->pending)) ||
           (signr = dequeue_signal(mask, &current->shared_pending))) {
        struct sigpending *pending;
        struct sigqueue *q;
        q = find_signal_queue(signr, &current->pending);
        if (!q)
            q = find_signal_queue(signr, &current->shared_pending);
        *info = q->info;
        *return_ka = current->sigaction[signr - 1];
        if (should_generate_coredump(signr)) {
            prepare_coredump();
        }
        return signr;
    }
    return 0;
}

(3) Memory information recording phase

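After deciding to generate a dump, the kernel records the process’s memory layout. The Core Dump is an ELF file of type ET_CORE containing a PT_NOTE segment (per-thread register state and signal information in NT_PRSTATUS, process information in NT_PRPSINFO, the auxiliary vector, and the list of mapped files) and PT_LOAD segments (heap, stack, data, etc.). Kernel-internal structures such as task_struct are not written out, and VMCOREINFO belongs to kernel crash dumps (kdump), not to process core files.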

2. Core Dump generation mechanism

2.1 Trigger conditions

Core Dumps are typically triggered by the following signals:

SIGSEGV (signal 11) : illegal memory access such as null‑pointer dereference.

#include <stdio.h>
#include <stdlib.h>
int main() {
    int *ptr = NULL;
    *ptr = 10; // triggers SIGSEGV
    return 0;
}

SIGABRT (signal 6) : abort() or failed assert.

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
int main() {
    int num = 0;
    assert(num > 0); // triggers SIGABRT
    return 0;
}

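SIGFPE (signal 8) : fatal arithmetic error such as integer division by zero. This applies to x86; some architectures, such as ARM64, do not trap integer division by zero.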

#include <stdio.h>
int main() {
    int a = 10;
    int b = 0;
    int c = a / b; // triggers SIGFPE
    return 0;
}

2.2 Configuration points

Use ulimit -c unlimited to allow unlimited Core Dump size.

Ensure the program’s working directory is writable.

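If the program changes its effective UID/GID (for example, a setuid binary), the kernel disables core dumps for it by default. Set /proc/sys/fs/suid_dumpable to 1 to allow them while debugging, or to 2 to have such dumps written so that only root can read them. Because these dumps can contain privileged data, restore the kernel default of 0 afterwards.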

Adjust /proc/sys/kernel/core_pattern to control dump location and naming.

3. Common causes of Core Dumps

3.1 Memory access errors

Null or wild pointer dereference – accessing a null pointer triggers SIGSEGV.

#include <iostream>
int main() {
    int *ptr = nullptr;
    *ptr = 10; // SIGSEGV
    return 0;
}

Buffer overflow – writing past an array’s bounds can corrupt memory and cause a crash.

#include <stdio.h>
#include <string.h> // declares strcpy
int main() {
    char buffer[10];
    strcpy(buffer, "123456789012345"); // overflow: 16 bytes written into a 10-byte array
    return 0;
}

3.2 Improper signal handling

SIGSEGV, SIGABRT, SIGFPE – if not caught, the default action is to terminate and dump core.

3.3 Resource limits and configuration

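If ulimit -c is 0 or disk space is insufficient, no Core Dump will be written. On many distributions core_pattern pipes the dump to a helper such as systemd-coredump or apport, in which case no core file appears in the working directory and the dump has to be retrieved through that helper.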

3.4 Multithreading issues

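Race conditions – unsynchronized access to shared data is undefined behavior. With a plain counter, as below, the usual symptom is a wrong result; races on pointers or containers can corrupt memory and lead to crashes.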

#include <iostream>
#include <thread>
int sharedVariable = 0;
void increment() {
    for (int i = 0; i < 1000; ++i) {
        sharedVariable++; // no lock → race
    }
}
int main() {
    std::thread t1(increment), t2(increment);
    t1.join(); t2.join();
    std::cout << "Final: " << sharedVariable << std::endl;
    return 0;
}

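Deadlock – two threads waiting on each other’s mutex stall forever. A deadlocked process hangs rather than crashes, so no Core Dump is produced automatically; one can be captured from the live process with gcore or by sending it SIGABRT.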

#include <chrono>   // std::chrono::milliseconds
#include <iostream>
#include <thread>
#include <mutex>
std::mutex m1, m2;
// f1 takes m1 then m2, f2 takes m2 then m1: the opposite lock order is what deadlocks.
void f1() { m1.lock(); std::this_thread::sleep_for(std::chrono::milliseconds(100)); m2.lock(); m2.unlock(); m1.unlock(); }
void f2() { m2.lock(); std::this_thread::sleep_for(std::chrono::milliseconds(100)); m1.lock(); m1.unlock(); m2.unlock(); }
int main() { std::thread t1(f1), t2(f2); t1.join(); t2.join(); return 0; }

3.5 Dynamic memory management errors

Double free – freeing the same pointer twice is undefined behavior. glibc usually detects it and aborts the process, which raises SIGABRT.

#include <stdio.h>
#include <stdlib.h>
int main() {
    int *ptr = (int *)malloc(sizeof(int));
    free(ptr);
    free(ptr); // double free → crash
    return 0;
}

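Memory leak – continuous allocation without release can exhaust memory. In C++ the failing new then throws std::bad_alloc, which aborts the program if uncaught; alternatively the Linux OOM killer may end the process with SIGKILL, which does not produce a Core Dump.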

#include <iostream>
void leak() { while (true) { int *p = new int; } }
int main() { leak(); return 0; }

3.6 Program logic errors

Infinite recursion – eventually overflows the stack.

#include <stdio.h>
void recur() { recur(); }
int main() { recur(); return 0; }

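Uncaught exception – if no handler catches the exception, std::terminate() is called, which by default calls abort() and raises SIGABRT.

#include <iostream>
void thrower() { throw 1; }
int main() {
    thrower(); // no try/catch: std::terminate() → abort() → SIGABRT
    return 0;
}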

3.7 Hardware problems

Faulty RAM can cause SIGSEGV‑like crashes.

CPU overheating or defects may also trigger crashes.

4. Core Dump analysis and debugging

4.1 Using GDB to analyze a Core Dump

Load the executable and core file:

gdb ./my_program core.1234

Use bt to view the backtrace, p to inspect variables, and info registers to see the CPU state.

(gdb) bt
#0  func3 (arg1=0x7fffffffde10, arg2=42) at my_file.c:123
#1  0x00005555555552b5 in func2 (arg=0x7fffffffde10) at main.c:234
#2  0x0000555555555350 in main () at main.c:345

4.2 Check system configuration

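Ensure ulimit -c is not zero:

ulimit -c

Set it to unlimited if needed:

ulimit -c unlimited

If core_pattern points at a dedicated directory such as /var/core, verify that the crashing process can write to it. Mode 1777 (sticky bit, as on /tmp) is a safer choice than 777 because it stops users from deleting each other’s files:

ls -ld /var/core
sudo chmod 1777 /var/core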

4.3 GDB loading Core Dump

gdb my_program core.1234

After loading, use where / bt, p, and info registers as needed.

4.4 Useful GDB commands

where / bt – display the call stack.

p VAR – print a variable’s value.

info registers – show the CPU registers at the time of the crash.

5. Core Dump practical cases

5.1 Simple case analysis

#include <stdio.h>
#include <stdlib.h>
void func() { int *ptr = NULL; *ptr = 10; }
int main() { func(); return 0; }

Compile with -g (for example, gcc -g -o test test.c), run the program to produce the core file, then analyze it with GDB:

gdb test core.12345
(gdb) bt
#0  func () at test.c:3
#1  main () at test.c:4
(gdb) p ptr
$1 = (int *) 0x0

The backtrace shows the crash inside func at line 3 of the listing, and ptr is null.

5.2 Complex multithreaded scenario

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#define ARRAY_SIZE 100
int shared_array[ARRAY_SIZE];
pthread_mutex_t mutex;
void *write_thread(void *arg) {
    for (int i = 0; i < ARRAY_SIZE; i++) {
        pthread_mutex_lock(&mutex);
        shared_array[i] = i;
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}
void *read_thread(void *arg) {
    for (int i = 0; i < ARRAY_SIZE; i++) {
        pthread_mutex_lock(&mutex);
        int value = shared_array[i];
        printf("Read value: %d at index %d\n", value, i);
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}
int main() {
    pthread_t w, r;
    pthread_mutex_init(&mutex, NULL);
    pthread_create(&w, NULL, write_thread, NULL);
    pthread_create(&r, NULL, read_thread, NULL);
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    pthread_mutex_destroy(&mutex);
    return 0;
}

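After a crash, load the core file. As listed, both loops keep i below ARRAY_SIZE, so this exact program should not crash; the session below illustrates the workflow for a variant whose read loop runs past the end of the array. The frames beneath read_thread are typical for a pthread and vary with the glibc version:

gdb multi_thread_test core.67890
(gdb) info threads
(gdb) thread 2   # switch to the read thread
(gdb) bt
#0  read_thread (arg=0x0) at multi_thread_test.c:18
#1  start_thread ()
#2  clone ()
(gdb) p i
$1 = 120

The index i exceeds ARRAY_SIZE, revealing an out‑of‑bounds read that caused the Core Dump.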

Written by

Deepin Linux

Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.
