How to Capture Precise Android Stack Traces: Native vs Instrumentation Methods
This article examines why a single Thread.currentThread().getStackTrace() call is insufficient for modern Android performance debugging, compares instrumentation and native stack‑capture approaches, and provides detailed step‑by‑step implementations, code snippets, and optimization tips for reliable stack tracing.
Background
Originally, a single call to Thread.currentThread().getStackTrace() was enough to retrieve a thread’s stack trace on demand. Modern Android applications instead need continuous stack sampling to diagnose ANRs, jank, and crashes, and the cost of that call makes it a poor fit for frequent sampling.
Stack‑capture approaches
Two main schemes are used:
Method‑level instrumentation (code injection)
Native stack capture
Method‑instrumentation stack capture
Concept
During compilation each method is instrumented with a unique identifier. When a performance issue occurs the collected IDs are aggregated and emitted.
Pros: simple, low overhead, no compatibility problems.
Cons: disables pre‑verification, increases class‑loading work and APK size, and can obscure line numbers.
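As a rough illustration of the concept, the sketch below shows the kind of recorder that injected code might call at run time. It is written in C++ only to stay consistent with the native sections of this article; in a real build the calls would be injected into Java bytecode. The names MethodIdRecorder, OnEnter, and Aggregate are hypothetical, and the class is deliberately not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical recorder that injected code would call with the method ID
// assigned at compile time. Not thread-safe: a real recorder would need
// per-thread buffers or atomics.
class MethodIdRecorder {
 public:
  explicit MethodIdRecorder(size_t capacity) : capacity_(capacity) {
    assert(capacity > 0);
    ids_.reserve(capacity);
  }

  // Called at method entry by the injected code. Records are dropped once
  // the buffer is full, so the hot path never allocates.
  void OnEnter(uint32_t method_id) {
    if (ids_.size() < capacity_) ids_.push_back(method_id);
  }

  // Aggregates the collected IDs into per-method hit counts.
  std::map<uint32_t, uint32_t> Aggregate() const {
    std::map<uint32_t, uint32_t> counts;
    for (uint32_t id : ids_) ++counts[id];
    return counts;
  }

 private:
  size_t capacity_;
  std::vector<uint32_t> ids_;
};
```

When a performance issue is detected, the result of Aggregate() can be mapped back to method names with the ID table produced at build time.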
Native stack capture
Java‑level stack‑trace flow
The Java API Thread.currentThread().getStackTrace() forwards to native code via VMStack.getThreadStackTrace. The native implementation performs three steps:
Suspend the target thread.
Execute a callback that builds an internal stack trace.
Resume the thread.
Key native function (simplified):
static jobjectArray VMStack_getThreadStackTrace(JNIEnv* env, jclass, jobject javaThread) { … }
The callback creates two visitors:
FetchStackTraceVisitor – shallow walk, max depth 256.
BuildInternalStackTraceVisitor – continues the walk and builds an ObjectArray of ArtMethod objects.
Stack walk implementation
StackVisitor::WalkStack iterates over the managed‑stack linked list and invokes VisitFrame() for each frame, extracting the ArtMethod and storing it.
void StackVisitor::WalkStack(bool include_transitions) { … }
Conversion to StackTraceElement
The internal ArtMethod array is transformed by Thread::InternalStackTraceToStackTraceElementArray into Java StackTraceElement objects, extracting method name, class name, source file, line number, etc.
jobjectArray Thread::InternalStackTraceToStackTraceElementArray(...){ … }
The most time‑consuming part is the string handling required for this conversion.
Optimised native stack capture
To reduce overhead we can re‑implement the critical native stages, omit string decoding, and store raw ArtMethod pointers in a circular buffer for later asynchronous processing.
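A minimal sketch of such a circular buffer is shown here, assuming a single writer and treating each frame as an opaque void* (on a device these would be ArtMethod pointers). The class name FrameRingBuffer and its methods are hypothetical and are not taken from any existing library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fixed-size circular buffer of raw frame pointers. Single-threaded sketch:
// a real sampler would need synchronization between writer and reader.
class FrameRingBuffer {
 public:
  explicit FrameRingBuffer(size_t capacity) : slots_(capacity, nullptr) {
    assert(capacity > 0);
  }

  // Stores one pointer, overwriting the oldest entry once the buffer is full.
  void Push(void* method) {
    slots_[head_] = method;
    head_ = (head_ + 1) % slots_.size();
    if (size_ < slots_.size()) ++size_;
  }

  // Copies the stored pointers out, oldest first, so that decoding into
  // names and line numbers can happen later on another thread.
  std::vector<void*> Snapshot() const {
    std::vector<void*> out;
    out.reserve(size_);
    size_t start = (head_ + slots_.size() - size_) % slots_.size();
    for (size_t i = 0; i < size_; ++i) {
      out.push_back(slots_[(start + i) % slots_.size()]);
    }
    return out;
  }

  size_t size() const { return size_; }

 private:
  std::vector<void*> slots_;
  size_t head_ = 0;  // next write position
  size_t size_ = 0;  // number of valid entries
};
```

Because Push only writes a pointer and advances an index, the work done while the target thread is suspended stays small; the expensive string handling is left to whoever consumes Snapshot().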
Implementation steps
Suspend the target thread and obtain its address (the native Thread pointer returned by the suspend call).
Invoke the stack‑walk routine and record raw ArtMethod pointers.
Resume the thread.
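One way to keep these three steps safe is to wrap them in an RAII guard so that the thread is resumed on every exit path. The sketch below is only an illustration under that assumption: ScopedSuspend and WalkWhileSuspended are hypothetical names, and the callables stand in for the resolved libart.so entry points, whose real signatures differ across Android versions.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// RAII guard around the suspend / walk / resume sequence. The callables are
// placeholders for the real entry points resolved from libart.so.
class ScopedSuspend {
 public:
  ScopedSuspend(std::function<void*()> suspend,
                std::function<void(void*)> resume)
      : thread_(suspend()), resume_(std::move(resume)) {
    assert(resume_ != nullptr);
  }

  // Resumes the thread only if the suspend call succeeded.
  ~ScopedSuspend() {
    if (thread_ != nullptr) resume_(thread_);
  }

  ScopedSuspend(const ScopedSuspend&) = delete;
  ScopedSuspend& operator=(const ScopedSuspend&) = delete;

  // Opaque thread pointer returned by the suspend call, or nullptr on failure.
  void* thread() const { return thread_; }

 private:
  void* thread_;
  std::function<void(void*)> resume_;
};

// Runs walk(thread) while the target is suspended. Returns false when the
// thread could not be suspended, in which case walk is never called.
inline bool WalkWhileSuspended(std::function<void*()> suspend,
                               std::function<void(void*)> resume,
                               const std::function<void(void*)>& walk) {
  ScopedSuspend guard(std::move(suspend), std::move(resume));
  if (guard.thread() == nullptr) return false;
  walk(guard.thread());
  return true;
}
```

The walk callback should do as little as possible, since the target thread stays frozen until the guard goes out of scope.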
Accessing the internal ThreadList
The ThreadList object is not publicly exposed. It can be located via the Runtime singleton obtained from JavaVMExt in JNI_OnLoad. Offsets to the thread_list_ field differ across Android versions.
JNIEXPORT jint JNICALL JNI_OnLoad(JavaVM *vm, void *reserved) { … }
Resolving required symbols
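The functions used to suspend a thread (SuspendThreadByPeer or, as in the snippet below, SuspendThreadByThreadId), ThreadList::Resume, and StackVisitor::WalkStack are not exposed through any public API. They are looked up in libart.so with dlopen / dlsym (e.g., using the helper library Nougat_dlfunctions).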
// Mangled names are version-specific; check each result for nullptr before use.
WalkStack_ = reinterpret_cast<void(*)(StackVisitor*, bool)>(dlsym_ex(handle,
    "_ZN3art12StackVisitor9WalkStackILNS0_16CountTransitionsE0EEEvb"));
// The first void* parameter is the ThreadList instance (the implicit this).
SuspendThreadByThreadId_ = reinterpret_cast<void*(*)(void*, uint32_t, SuspendReason, bool*)>(dlsym_ex(handle,
    "_ZN3art10ThreadList23SuspendThreadByThreadIdEjNS_13SuspendReasonEPb"));
Resume_ = reinterpret_cast<bool(*)(void*, void*, SuspendReason)>(dlsym_ex(handle,
    "_ZN3art10ThreadList6ResumeEPNS_6ThreadENS_13SuspendReasonE"));
Custom stack‑trace visitor
A lightweight visitor stores each ArtMethod pointer in a fixed‑size circular buffer without immediate decoding.
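class CustomFetchStackTraceVisitor : public StackVisitor {
    // Called by WalkStack for each frame; returning false stops the walk.
    bool VisitFrame() override {
        // Keep the raw ArtMethod pointer; decoding is deferred.
        void* method = GetMethod();
        if (CustomFetchStackTraceVisitorCallback != nullptr) {
            return CustomFetchStackTraceVisitorCallback(method);
        }
        return true;
    }
};
Considerations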
Version‑specific differences in Runtime layout and symbol names require runtime detection and offset calculation. Compatibility can be improved by filtering invalid addresses or maintaining a blacklist of unsupported devices.
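Runtime detection of symbol names could look roughly like the sketch below, which tries a list of candidate names and reports whether every required entry point was found. The lookup is passed in as a callback so the logic stays independent of the actual dlsym-style helper. ResolveFirst, ArtSymbols, and ResolveArtSymbols are hypothetical names; the second WalkStack candidate is simply the plain mangling of art::StackVisitor::WalkStack(bool) and is only an assumption about releases where the method is not a template.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Lookup callback returning a symbol address or nullptr. On a device this
// would wrap a dlsym-style call on the libart.so handle.
using SymbolLookup = std::function<void*(const char*)>;

// Tries each candidate name in order and returns the first address found.
inline void* ResolveFirst(const SymbolLookup& lookup,
                          const std::vector<const char*>& candidates) {
  assert(lookup != nullptr);
  for (const char* name : candidates) {
    if (void* addr = lookup(name)) return addr;
  }
  return nullptr;
}

// Entry points needed for native capture; all must resolve before use.
struct ArtSymbols {
  void* walk_stack = nullptr;
  void* suspend = nullptr;
  void* resume = nullptr;
  bool Ready() const { return walk_stack && suspend && resume; }
};

// Resolves every entry point. The first name in each list comes from the
// snippet earlier in this article; the extra WalkStack name is an assumption.
inline ArtSymbols ResolveArtSymbols(const SymbolLookup& lookup) {
  ArtSymbols s;
  s.walk_stack = ResolveFirst(lookup, {
      "_ZN3art12StackVisitor9WalkStackILNS0_16CountTransitionsE0EEEvb",
      "_ZN3art12StackVisitor9WalkStackEb"});
  s.suspend = ResolveFirst(lookup, {
      "_ZN3art10ThreadList23SuspendThreadByThreadIdEjNS_13SuspendReasonEPb"});
  s.resume = ResolveFirst(lookup, {
      "_ZN3art10ThreadList6ResumeEPNS_6ThreadENS_13SuspendReasonE"});
  return s;
}
```

If Ready() returns false, the sampler can disable itself on that device instead of calling through a null pointer, which has the same effect as a blacklist entry without needing to know the device in advance.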
References
Nougat_dlfunctions: https://github.com/avs333/Nougat_dlfunctions
Circular buffer concept: https://baike.baidu.com/item/环形缓冲区/22701730
Android method‑trace implementation analysis: https://zhuanlan.zhihu.com/p/526960193?utm_id=0
