Why Linux Threads Crash: Uncovering Thread‑Stack Pitfalls and Fixes
This article explains how Linux thread stacks work and why an ill-chosen stack size or unchecked recursion can cause crashes and memory leaks. It then covers practical debugging tools, stack‑size tuning, and optimization techniques, including thread pools and a real‑world case study, to keep concurrent programs stable and efficient.
1. Linux Thread‑Stack Basics
Each thread in Linux has a private memory region called a thread stack. The stack stores function parameters, local variables, return addresses and thread‑control data. Unlike the process stack, which is created at program start and can grow dynamically, a thread stack is allocated on demand with mmap and has a fixed size unless explicitly changed.
Process (main) stack : set up by the kernel at execve() and grows downward on demand. A fault on a page just below the current stack does not immediately segfault; the kernel extends the mapping until the RLIMIT_STACK limit is reached.
Child thread stack : created by pthread_create() as an anonymous mapping of fixed size. It cannot grow; running past its end hits the guard page and triggers a segmentation fault.
Thread creation has two parts: in user space, glibc's pthread_create prepares the stack and the thread control block; it then invokes the kernel's clone system call, which creates a lightweight process sharing the caller's address space.
When no cached stack can be reused, glibc allocates a new anonymous region with mmap. On architectures where the stack grows downward, the struct pthread control block is placed at the high end of the region and the usable stack space lies below it.
2. Linux Thread‑Stack Working Principle
2.1 Memory Layout
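The stack grows from high addresses toward low addresses on common architectures such as x86-64 and ARM. When a function is called, its return address, its local variables, and any arguments that are not passed in registers are typically stored on the stack.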
int add_numbers(int a, int b) {
    int sum = a + b;
    return sum;
}
During the call, a, b and sum occupy stack slots; after the function returns they are popped.
2.2 Creation Process
Threads are created with pthread_create:
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg);
If attr is NULL, the default stack size is used. With glibc this is taken from the soft RLIMIT_STACK limit in effect when the program starts, which is commonly 8 MiB on x86-64. The current limit can be queried with:
$ ulimit -s
8192
To set a custom size for the current shell session:
$ ulimit -s 4096   # 4 MiB
Or configure it programmatically with pthread_attr_setstacksize:
#include <pthread.h>
#include <stdio.h>
#include <string.h>

void *thread_function(void *arg) { return NULL; }

int main(void) {
    pthread_t t;
    pthread_attr_t attr;
    size_t stack_size = 2 * 1024 * 1024; // 2 MiB

    pthread_attr_init(&attr);
    pthread_attr_setstacksize(&attr, stack_size);
    // pthread functions return the error number instead of setting errno
    int rc = pthread_create(&t, &attr, thread_function, NULL);
    pthread_attr_destroy(&attr);
    if (rc != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(rc));
        return 1;
    }
    pthread_join(t, NULL);
    return 0;
}
3. Common Thread‑Stack Problems & Troubleshooting
3.1 Stack Overflow
Typical causes are deep recursion or large local arrays. The following program crashes with a segmentation fault because the recursive function never terminates and allocates a huge local array:
#include <stdio.h>

void recursive_function() {
    int large_array[1000000]; // about 3.8 MiB with 4-byte int
    recursive_function();
}

int main() { recursive_function(); return 0; }
3.2 Thread‑Stack Memory Leak
If a joinable thread finishes without being joined with pthread_join or detached, its stack and control block remain allocated. Repeated often enough, this can exhaust the process's address space or system memory. Example:
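#include <pthread.h>
#include <stdio.h>
#include <string.h>

void *thread_func(void *arg) { return NULL; }

int main(void) {
    pthread_t t;
    for (int i = 0; i < 1000; ++i) {
        // pthread_create returns the error number; it does not set errno
        int rc = pthread_create(&t, NULL, thread_func, NULL);
        if (rc != 0) {
            fprintf(stderr, "pthread_create: %s\n", strerror(rc));
            return 1;
        }
        // No join or detach → leak
    }
    return 0;
}
Valgrind typically reports the per-thread storage that pthread_create allocated for unjoined threads as possibly lost: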
valgrind --leak-check=full --show-leak-kinds=all ./your_program
3.3 Diagnostic Tools
top -H -p <PID> : per‑thread CPU and memory usage.
pstack <PID> : prints stack traces of all threads.
gdb -p <PID>, then info threads and bt : detailed backtraces.
4. Thread‑Stack Optimization Strategies
4.1 Adjust Stack Size
Temporary change:
ulimit -s 16384   # 16 MiB (value is in KiB)
Permanent per‑user limits can be set in /etc/security/limits.conf:
username hard stack 16384
username soft stack 16384
4.2 Reduce Stack Usage
Avoid large stack allocations; allocate big buffers on the heap instead:
#include <stdio.h>
#include <stdlib.h>

int main() {
    int *large_array = malloc(1000000 * sizeof(int));
    if (!large_array) { perror("malloc"); return 1; }
    // use array
    free(large_array);
    return 0;
}
Static or global storage can be used for long‑lived data, but must be protected with proper synchronization.
4.3 Use a Thread Pool
Thread pools reuse a fixed set of threads, eliminating the overhead of frequent creation and destruction. Java’s ThreadPoolExecutor constructor:
public ThreadPoolExecutor(int corePoolSize, int maximumPoolSize,
                          long keepAliveTime, TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler)
Typical usage:
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadPoolExample {
    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(5);
        for (int i = 0; i < 10; i++) {
            int task = i;
            executor.submit(() -> {
                System.out.println("Task " + task + " executed by " +
                        Thread.currentThread().getName());
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    // restore the interrupt flag so the pool can see it
                    Thread.currentThread().interrupt();
                }
            });
        }
        executor.shutdown(); // stop accepting tasks; queued tasks still run
    }
}
5. Real‑World Case Study: Game‑Server Crash
5.1 Background
A multiplayer game server crashed frequently under load. Logs showed segmentation faults in threads that performed complex game‑logic processing.
5.2 Investigation
The default stack size was 8 MiB (as reported by ulimit -s). Using pstack and gdb, developers observed extremely deep call stacks and recursive functions without proper termination conditions, confirming a stack overflow.
5.3 Fixes Implemented
Key changes:
Add base cases to recursive algorithms.
Move large temporary buffers from stack to heap.
Introduce recursion depth limits.
Sample corrected code:
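#include <cstddef>
#include <iostream>

void safeRecursion(int depth) {
    if (depth <= 0) return;  // base case
    if (depth > 100) {       // depth limit
        std::cout << "Warning: recursion depth too large" << std::endl;
        return;
    }
    int smallBuffer[128] = {0};  // tiny stack allocation (512 bytes)
    smallBuffer[0] = depth;
    safeRecursion(depth - 1);
}

class BigDataProcessor {
    int *heapData;
    std::size_t dataSize;
public:
    explicit BigDataProcessor(std::size_t size) : heapData(new int[size]), dataSize(size) {}
    ~BigDataProcessor() { delete[] heapData; }
    // Copying would free heapData twice, so it is disabled.
    BigDataProcessor(const BigDataProcessor &) = delete;
    BigDataProcessor &operator=(const BigDataProcessor &) = delete;
    void process() {
        for (std::size_t i = 0; i < dataSize; ++i)
            heapData[i] = static_cast<int>(i * i);
        safeRecursion(10);
    }
};
After redeploying with these changes and running stress tests, the server no longer crashed, which indicates that the stack overflows were the cause and that proper stack management resolved the issue.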
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.
