Mastering iOS Crash Monitoring: From Mach Exceptions to KSCrash Implementation
This article explains why iOS apps crash after release, outlines common coding mistakes that cause crashes, introduces a layered iOS exception architecture, and provides detailed implementation steps for Mach exception, Unix signal, runtime, and application‑level monitoring using KSCrash with code samples and diagrams.
Background
After an app goes live, developers often encounter crashes that never appeared during pre-release testing. Understanding why crashes occur and how crash reports are collected is essential for reliable releases.
Common Crash Causes
Array out‑of‑bounds access.
Multithreading issues, such as UI updates on background threads or data races.
Main‑thread unresponsiveness leading to Watchdog termination.
Dangling pointers (wild pointers) accessing deallocated objects.
iOS Exception Architecture
The iOS exception system is organized into four layers:
Hardware layer : CPU exceptions like illegal instructions or memory errors.
System layer : Mach exceptions and their conversion to Unix signals (e.g., EXC_BAD_ACCESS delivered as SIGSEGV or SIGBUS).
Runtime layer : Objective‑C NSException and C++ exceptions.
Application layer : Business‑logic errors, performance problems (deadlocks, memory leaks), and zombie‑object accesses.
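The Mach-to-signal conversion in the system layer can be pictured as a small lookup. The sketch below is an illustration, not kernel or KSCrash code: the `kExc...` constants copy the values found in `<mach/exception_types.h>` and `<mach/kern_return.h>` but are redefined locally (an assumption made so the snippet compiles off-device), and `signal_for_mach_exception` is a hypothetical helper name.

```c
#include <signal.h>
#include <stdint.h>

/* Local copies of Mach constants (assumption: same values as the Mach headers). */
enum {
    kExcBadAccess       = 1, /* EXC_BAD_ACCESS */
    kExcBadInstruction  = 2, /* EXC_BAD_INSTRUCTION */
    kExcArithmetic      = 3, /* EXC_ARITHMETIC */
    kExcBreakpoint      = 6, /* EXC_BREAKPOINT */
    kKernInvalidAddress = 1  /* KERN_INVALID_ADDRESS */
};

/* Returns the Unix signal that usually accompanies a Mach exception,
   or 0 when this sketch has no mapping for it. */
int signal_for_mach_exception(int exception_type, int64_t code) {
    switch (exception_type) {
        case kExcBadAccess:
            /* An unmapped address is reported as SIGSEGV, other access faults as SIGBUS. */
            return code == kKernInvalidAddress ? SIGSEGV : SIGBUS;
        case kExcBadInstruction:
            return SIGILL;
        case kExcArithmetic:
            return SIGFPE;
        case kExcBreakpoint:
            return SIGTRAP;
        default:
            return 0;
    }
}
```

A crash reporter can use such a mapping to fill in the signal field of a report when the crash was caught at the Mach level, before any signal was delivered.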
The following diagram illustrates the layered relationship:
Monitoring Strategy
System‑level monitoring : Capture all low‑level exceptions via Mach exceptions and Unix signals.
Runtime‑level monitoring : Install NSUncaughtExceptionHandler and a C++ std::terminate handler.
Application‑level monitoring : Detect deadlocks with a watchdog thread and identify zombie objects through dealloc hooks.
Mach Exception Capture
Mach exceptions are the lowest-level exception-delivery mechanism on macOS and iOS; they originate in the Mach layer of the kernel. To capture them:
Create a new exception port:
// Create a new exception handling port
mach_port_allocate(mach_task_self(), MACH_PORT_RIGHT_RECEIVE, &g_exceptionPort);
// Insert send right
mach_port_insert_right(mach_task_self(), g_exceptionPort, g_exceptionPort, MACH_MSG_TYPE_MAKE_SEND);
Register the port for all exception masks:
// Set exception ports to capture all types
task_set_exception_ports(mach_task_self(), EXC_MASK_ALL, g_exceptionPort, EXCEPTION_DEFAULT, MACHINE_THREAD_STATE);
Spawn two handler threads (primary and secondary) so that a crash inside the primary handler can still be caught by the secondary one:
// Primary exception handling thread
pthread_create(&g_primaryPThread, &attr, handleExceptions, kThreadPrimary);
// Secondary backup thread
pthread_create(&g_secondaryPThread, &attr, handleExceptions, kThreadSecondary);
When an exception occurs, the handler suspends all threads, marks the exception as captured, activates the backup thread, reads the faulting thread’s machine state, builds an exception context (type, address, stack cursor), and finally resumes the threads.
Unix Signal Capture
Signals complement Mach exception handling. Install handlers for fatal signals:
// Get list of fatal signals
const int *fatal_signals = signal_fatal_signals();
// Configure signal action
struct sigaction action = {0};
action.sa_flags = SA_SIGINFO | SA_ONSTACK;
action.sa_sigaction = &signal_handle_signals;
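// Install the handler for each fatal signal, keeping the previous action so it can be restored.
// fatal_signal_count and previous_signal_handlers[] are assumed to be defined next to fatal_signals.
for (int i = 0; i < fatal_signal_count; i++) {
    sigaction(fatal_signals[i], &action, &previous_signal_handlers[i]);
}
The handler receives the signal number, siginfo_t, and CPU context, then processes the exception similarly to Mach handling.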
Runtime Exception Capture
NSException handling:
// Save previous handler
NSUncaughtExceptionHandler *previous = NSGetUncaughtExceptionHandler();
// Install custom handler
NSSetUncaughtExceptionHandler(&handle_uncaught_exception);
After processing, the previous handler is invoked and the process is terminated via abort().
C++ exception handling:
// Save original terminate handler
std::terminate_handler original = std::get_terminate();
// Install custom handler
std::set_terminate(cpp_exception_terminate_handler);
The custom handler records the exception context before delegating to the original handler.
Deadlock Detection
A watchdog thread periodically posts a heartbeat task to the main thread. If the main thread does not respond within a configurable timeout, a deadlock is reported.
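The heartbeat scheme described above can be sketched with pthreads and C11 atomics. This is a simplified model under stated assumptions: the monitored thread calls `watchdog_answer` from its own loop (on iOS this would typically be a block dispatched to the main queue), a single `timeout_ms` value stands in for both the timeout and the polling interval, and all names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

typedef struct {
    atomic_bool pending;  /* a heartbeat was posted and not yet answered */
    atomic_bool stalled;  /* set once a heartbeat went unanswered */
    atomic_bool running;  /* cleared to ask the watchdog thread to exit */
    long timeout_ms;
    pthread_t thread;
} watchdog_t;

static void sleep_ms(long ms) {
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Called by the monitored thread: answers the pending heartbeat. */
void watchdog_answer(watchdog_t *wd) {
    atomic_store(&wd->pending, false);
}

static void *watchdog_loop(void *arg) {
    watchdog_t *wd = arg;
    do {
        atomic_store(&wd->pending, true);      /* post a heartbeat */
        sleep_ms(wd->timeout_ms);              /* give the monitored thread time */
        if (atomic_load(&wd->pending)) {
            atomic_store(&wd->stalled, true);  /* unanswered: report a stall */
        }
    } while (atomic_load(&wd->running));
    return NULL;
}

/* Returns 0 on success, or the pthread_create error code. */
int watchdog_start(watchdog_t *wd, long timeout_ms) {
    atomic_init(&wd->pending, false);
    atomic_init(&wd->stalled, false);
    atomic_init(&wd->running, true);
    wd->timeout_ms = timeout_ms;
    return pthread_create(&wd->thread, NULL, watchdog_loop, wd);
}

/* Stops the watchdog and returns true if any heartbeat went unanswered. */
bool watchdog_stop(watchdog_t *wd) {
    atomic_store(&wd->running, false);
    pthread_join(wd->thread, NULL);
    return atomic_load(&wd->stalled);
}
```

A real monitor would capture the main thread's stack at the moment the stall is detected instead of only setting a flag, since the stack is what explains where the thread is blocked.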
// Monitor thread checks main‑thread responsiveness
while (true) {
postHeartbeat();
if (!heartbeatReceivedWithin(timeout)) {
reportDeadlock();
}
sleep(interval);
}
Zombie Object Detection
Zombie detection hooks the dealloc method of NSObject and NSProxy. When an object is deallocated, a hash of its address is stored together with its class information. When a crash later involves an object pointer, the handler looks the address up in this table; a match indicates that the crash touched an already deallocated (zombie) object.
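// Hook dealloc. sel_registerName is used because ARC rejects @selector(dealloc).
static void (*originalDealloc)(id, SEL);
static void hookedDealloc(id self, SEL _cmd) {
    recordZombieHash((uintptr_t)self); // remember the address before the memory is released
    originalDealloc(self, _cmd);
}
// At install time, swap the implementation and keep the original:
Method deallocMethod = class_getInstanceMethod([NSObject class], sel_registerName("dealloc"));
originalDealloc = (void (*)(id, SEL))method_setImplementation(deallocMethod, (IMP)hookedDealloc);
The table size is limited to 0x8000 entries (32768) to bound memory usage.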
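A fixed-size table of this kind can be sketched in plain C. The layout below is an assumption made for illustration: the slot function, the entry struct, and the names `zombie_record` and `zombie_lookup` are hypothetical, and colliding addresses simply overwrite each other, which is what keeps memory bounded at 0x8000 slots.

```c
#include <stddef.h>
#include <stdint.h>

#define ZOMBIE_TABLE_SIZE 0x8000u /* 32768 slots, as stated in the text */

typedef struct {
    uintptr_t address;      /* address of the deallocated object, 0 if unused */
    const char *class_name; /* must point to storage that outlives the object */
} zombie_entry;

static zombie_entry g_zombies[ZOMBIE_TABLE_SIZE];

static size_t zombie_slot(uintptr_t address) {
    /* Heap objects are typically 16-byte aligned, so the low 4 bits carry
       no information; drop them before masking to the table size. */
    return (size_t)((address >> 4) & (ZOMBIE_TABLE_SIZE - 1));
}

/* Called from the dealloc hook: remembers the object's address and class. */
void zombie_record(const void *object, const char *class_name) {
    zombie_entry *entry = &g_zombies[zombie_slot((uintptr_t)object)];
    entry->address = (uintptr_t)object;
    entry->class_name = class_name;
}

/* Called at crash time: returns the class name if the address was recorded
   and has not been evicted by a colliding entry, otherwise NULL. */
const char *zombie_lookup(const void *object) {
    const zombie_entry *entry = &g_zombies[zombie_slot((uintptr_t)object)];
    if (object == NULL || entry->address != (uintptr_t)object) {
        return NULL;
    }
    return entry->class_name;
}
```

Because newer entries evict older ones in the same slot, a lookup can miss a genuine zombie, and a match can be stale if the address has since been reused by a live object; the result is a hint for the crash report, not proof.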
Stack Unwinding and Symbolication
To reconstruct a crash stack, the handler walks the frame-pointer chain. On ARM64, each return address is converted to an address inside the calling instruction by clearing the low two bits and subtracting one, so that symbolication resolves to the call site rather than to the instruction after it:
uintptr_t address = (return_address & ~3UL) - 1;
Runtime symbolication uses dladdr() to obtain the image base, image name, symbol address, and symbol name. Full symbolication (with line numbers) requires the dSYM file.
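The frame-pointer walk mentioned above can be sketched as follows. The `frame_record` struct models the ARM64 convention in which each frame stores the caller's frame pointer followed by the return address; the function names are hypothetical, and the sketch assumes every pointer in the chain is readable. A real crash handler has to validate each read, because the stack of a crashed thread may be corrupted.

```c
#include <stddef.h>
#include <stdint.h>

/* One frame record: saved frame pointer of the caller, then the return address. */
typedef struct frame_record {
    const struct frame_record *previous;
    uintptr_t return_address;
} frame_record;

/* Same adjustment as in the text: clear the low two bits, then step back
   one byte so the address falls inside the call instruction. */
static uintptr_t call_site_from_return_address(uintptr_t return_address) {
    return (return_address & ~(uintptr_t)3) - 1;
}

/* Walks the chain starting at fp and stores up to max_depth call-site
   addresses in out. Returns the number of frames written. */
size_t walk_frames(const frame_record *fp, uintptr_t *out, size_t max_depth) {
    size_t depth = 0;
    while (fp != NULL && fp->return_address != 0 && depth < max_depth) {
        out[depth++] = call_site_from_return_address(fp->return_address);
        fp = fp->previous;
    }
    return depth;
}
```

The depth limit matters in practice: a corrupted chain can form a cycle, and the bound guarantees the walk terminates.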
Asynchronous Safety
All code executed inside Mach or signal handlers must be async-signal-safe. Functions that may allocate memory, acquire locks, or perform buffered I/O (e.g., malloc, free, NSLog, printf, Objective-C method calls) are prohibited, because the crash may have interrupted a thread that holds the same lock or left the heap in an inconsistent state.
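In practice this means replacing printf-style logging with hand-rolled formatting into a stack buffer followed by a single write(2) call. The sketch below shows one way to emit a fault address under those constraints; `format_hex` and `write_fault_address` are hypothetical names, not functions from the SDK.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Formats value as "0x..." into buf without printf or malloc.
   Returns the string length (excluding the NUL), or 0 if buf is too small. */
size_t format_hex(uintptr_t value, char *buf, size_t size) {
    static const char digits[] = "0123456789abcdef";
    char reversed[2 * sizeof(uintptr_t)];
    size_t count = 0;
    do {
        reversed[count++] = digits[value & 0xf];
        value >>= 4;
    } while (value != 0);
    if (size < count + 3) { /* "0x" + digits + NUL */
        return 0;
    }
    buf[0] = '0';
    buf[1] = 'x';
    for (size_t i = 0; i < count; i++) {
        buf[2 + i] = reversed[count - 1 - i];
    }
    buf[2 + count] = '\0';
    return count + 2;
}

/* Writes the address and a newline to fd using only a stack buffer and write(2). */
void write_fault_address(int fd, uintptr_t address) {
    char line[2 * sizeof(uintptr_t) + 4];
    size_t length = format_hex(address, line, sizeof(line) - 1);
    line[length++] = '\n';
    ssize_t ignored = write(fd, line, length);
    (void)ignored;
}
```

The file descriptor should be opened before any crash occurs, so the handler only has to call write.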
Conclusion and Outlook
The article presented a comprehensive iOS crash‑monitoring solution built on KSCrash, covering Mach exceptions, Unix signals, runtime exceptions, deadlock and zombie detection, stack unwinding, and async‑safe handling. The implementation is used in Alibaba Cloud RUM iOS SDK and can be extended with real‑time upload, log collection, and memory‑dump features. For integration details, refer to the official documentation and join the RUM user‑experience monitoring support group.
Alibaba Cloud Native
