
How ANRCanary Uncovers Hidden Android ANR Causes and Boosts App Performance

This article explains how the ANRCanary tool helps DingTalk developers identify and resolve various Android ANR scenarios—ranging from main‑thread tasks and synchronization blocks to message‑queue congestion, emoji handling, and cross‑process SharedPreferences—while sharing practical best‑practice guidelines for building reliable mobile performance tools.

Alibaba Terminal Technology

Introduction

Since ANRCanary went live in DingTalk, it has provided strong support for diagnosing ANRs. This article walks through typical DingTalk ANR cases, shows how ANRCanary helps locate their root causes, and summarizes DingTalk's thinking on ANR governance from two angles: tool construction and best practices.

1. Real‑World Cases

1.1 Thread‑type: Startup task runs on the main thread

Captured information

{
    "cpuDuration": 31,
    "messageStr": ">>>>> Dispatching to Handler (elz) {54931ac} itl$3@d29c599: 0",
    "threadStackList": [
        {
            "stackTrace": [
                "java.util.ArrayList.<init>(ArrayList.java:164)",
                "com.alibaba.dingtalk.android.XXXStatistic.i(SourceFile:???)",
                "com.alibaba.dingtalk.android.XXXStatistic.a(SourceFile:???)",
                "odj.execute(SourceFile:???)",
                "itl$3.run(SourceFile:???)",
                "android.os.Handler.handleCallback(Handler.java:900)",
                "android.os.Handler.dispatchMessage(Handler.java:103)",
                "android.os.Looper.loop(Looper.java:219)",
                "android.app.ActivityThread.main(ActivityThread.java:8668)",
                "java.lang.reflect.Method.invoke(Native Method)",
                "com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:513)",
                "com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1109)"
            ],
            "state": "RUNNABLE",
            "wallTime": 411
        }
    ],
    "type": "HUGE",
    "wallDuration": 488
}

Interpretation

The task is of type HUGE and lasted 488 ms; the stack trace shows execution up to 411 ms.

The task corresponds to a custom Handler class elz with an anonymous Runnable itl$3.

Conclusion

This is a statistics task that runs during startup and calls a system service across processes. Because it does not touch the UI, the fix is simply to move it to a background thread.
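As a rough illustration of that fix, the sketch below hands a UI-independent task to a background worker. It is plain Java rather than DingTalk's code: the class name `StartupTasks`, the thread name `startup-bg`, and the choice of a single-thread executor are all assumptions made for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class StartupTasks {
    // Hypothetical single background worker for UI-independent startup work.
    private static final ExecutorService BACKGROUND =
            Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "startup-bg");
                t.setDaemon(true); // do not keep the process alive for this worker
                return t;
            });

    /** Runs the task on the background worker; completes with the name of the thread it ran on. */
    static CompletableFuture<String> runInBackground(Runnable task) {
        return CompletableFuture.supplyAsync(() -> {
            task.run();
            return Thread.currentThread().getName();
        }, BACKGROUND);
    }
}
```

The caller returns immediately, so a slow statistics call no longer occupies the main thread's message loop.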

1.2 Synchronous: Main thread waits for network timeout in a child thread

Captured information

{
    "messageStr": ">>>>> Dispatching to Handler (hfs) {d16871b} cyk$5@a26c4b7: 0",
    "threadStackList": [
        {
            "stackTrace": [
                "sun.misc.Unsafe.park(Native Method)",
                "java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:230)",
                "java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1063)",
                "java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1358)",
                "java.util.concurrent.CountDownLatch.await(CountDownLatch.java:278)",
                "com.alibaba.dingtalk.android.xxx.waitTimeout(SourceFile:???)",
                "com.alibaba.dingtalk.android.xxx.rpc(SourceFile:??)",
                "cyk$5.run(SourceFile:???)",
                "android.os.Handler.handleCallback(Handler.java:938)",
                "android.os.Handler.dispatchMessage(Handler.java:99)",
                "android.os.Looper.loop(Looper.java:257)",
                "android.app.ActivityThread.main(ActivityThread.java:8399)",
                "java.lang.reflect.Method.invoke(Native Method)",
                "com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:631)",
                "com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1032)"
            ],
            "state": "TIMED_WAITING",
            "wallTime": 217
        },
        {
            "stackTrace": [
                "sun.misc.Unsafe.park(Native Method)",
                "java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:230)",
                "java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1063)",
                "java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1358)",
                "java.util.concurrent.CountDownLatch.await(CountDownLatch.java:278)",
                "com.alibaba.dingtalk.android.tpn.waitTimeout(SourceFile:???)",
                "com.alibaba.dingtalk.android.tpn.rpc(SourceFile:??)",
                "cyk$5.run(SourceFile:???)",
                "android.os.Handler.handleCallback(Handler.java:938)",
                "android.os.Handler.dispatchMessage(Handler.java:99)",
                "android.os.Looper.loop(Looper.java:257)",
                "android.app.ActivityThread.main(ActivityThread.java:8399)",
                "java.lang.reflect.Method.invoke(Native Method)",
                "com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:631)",
                "com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1032)"
            ],
            "state": "TIMED_WAITING",
            "wallTime": 11284
        }
    ],
    "type": "HUGE",
    "wallDuration": 12016
}

Interpretation

The task is also HUGE and lasted 12 016 ms; the main thread stayed in TIMED_WAITING for about 11 067 ms because it was blocked on CountDownLatch.await.

Conclusion

The main thread waited on a network request issued from a child thread, with a timeout of up to 30 seconds. Under weak network conditions this blocks the UI. The original design expected the wait to happen on a background thread, but a later change moved the call onto the main thread and introduced a hidden ANR risk. A defensive check that reports blocking waits on the main thread would surface this kind of regression early.
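One possible shape for such a defensive check is sketched below. It is an assumption-laden example, not ANRCanary or DingTalk code: `BlockingWaitGuard` is a hypothetical helper, and the main thread is passed in explicitly so the sketch runs on a plain JVM (on Android it would typically be `Looper.getMainLooper().getThread()`). A production build might log or report instead of throwing.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class BlockingWaitGuard {
    private final Thread mainThread;

    BlockingWaitGuard(Thread mainThread) {
        this.mainThread = mainThread;
    }

    /** Waits on the latch, but refuses to block the designated main thread. */
    boolean await(CountDownLatch latch, long timeout, TimeUnit unit) {
        if (Thread.currentThread() == mainThread) {
            // Fail fast so the misuse is caught in testing, not in the field.
            throw new IllegalStateException("blocking wait on main thread");
        }
        try {
            return latch.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }
}
```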

1.3 Dense scenario: Log module fills the message queue

Captured information

{
    "messageStr": "fakeIdle",
    "threadStackList": [
        {
            "stackTrace": [
                "android.os.MessageQueue.enqueueMessage(MessageQueue.java:656)",
                "- locked <185872167> (a android.os.MessageQueue)",
                "android.os.Handler.enqueueMessage(Handler.java:771)",
                "android.os.Handler.sendMessageAtTime(Handler.java:717)",
                "android.os.Handler.sendMessageDelayed(Handler.java:687)",
                "android.os.Handler.post(Handler.java:416)",
                "owy$a.a(SourceFile:???)",
                "owy.a(SourceFile:???)",
                "owy.k(SourceFile:???)",
                "com.alibaba.dingtalk.android.util.Log.d(SourceFile:???)",
                "com.alibaba.dingtalk.android.xxx.onLoad(SourceFile:???)",
                "org.chromium.android_webview.AwContentsClientBridge.onResourceFinishLoad(PG:???)",
                "android.os.MessageQueue.nativePollOnce(Native Method)",
                "android.os.MessageQueue.next(MessageQueue.java:363)",
                "android.os.Looper.loop(Looper.java:176)",
                "android.app.ActivityThread.main(ActivityThread.java:8349)",
                "java.lang.reflect.Method.invoke(Native Method)",
                "com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:513)",
                "com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1055)"
            ],
            "state": "RUNNABLE",
            "wallTime": 183
        },
        {
            "stackTrace": [
                "android.os.MessageQueue.enqueueMessage(MessageQueue.java:616)",
                "- waiting on <185872167> (a android.os.MessageQueue)",
                "android.os.Handler.enqueueMessage(Handler.java:771)",
                "android.os.Handler.sendMessageAtTime(Handler.java:717)",
                "android.os.Handler.sendMessageDelayed(Handler.java:687)",
                "android.os.Handler.post(Handler.java:416)",
                "owy$a.a(SourceFile:???)",
                "owy.a(SourceFile:???)",
                "owy.k(SourceFile:???)",
                "com.alibaba.dingtalk.android.util.Log.d(SourceFile:???)",
                "com.alibaba.dingtalk.android.mwc.onLoad(SourceFile:???)",
                "org.chromium.android_webview.AwContentsClientBridge.onResourceFinishLoad(PG:???)",
                "android.os.MessageQueue.nativePollOnce(Native Method)",
                "android.os.MessageQueue.next(MessageQueue.java:363)",
                "android.os.Looper.loop(Looper.java:176)",
                "android.app.ActivityThread.main(ActivityThread.java:8349)",
                "java.lang.reflect.Method.invoke(Native Method)",
                "com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:513)",
                "com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1055)"
            ],
            "state": "BLOCKED",
            "wallTime": 11260
        }
    ],
    "type": "HUGE",
    "wallDuration": 13642
}

Interpretation

The task is HUGE and lasted 13 642 ms. Early stacks show the main thread RUNNABLE; later stacks show it BLOCKED while waiting for the lock of MessageQueue, indicating contention caused by high‑frequency log calls.

Source code analysis

boolean enqueueMessage(Message msg, long when) {
    // 1. request lock
    synchronized (this) {
        // 2. lock acquired
        ...
        // 3. get head of queue
        Message p = mMessages;
        Message prev;
        // 4. insertion sort to find position
        for (;;) {
            prev = p;
            p = p.next;
            if (p == null || when < p.when) {
                // reached end or found spot
                break;
            }
            ...
        }
        // 5. insert message keeping order
        msg.next = p;
        prev.next = msg;
        ...
    }
    // 6. release lock
    return true;
}

Conclusion

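The high-frequency log wrapper posts a new Runnable on every call. Each enqueue walks the queue while holding the MessageQueue lock, so a flood of posts both lengthens the queue and increases lock contention, which ends up blocking the main thread. The fix is to reduce log frequency or to avoid posting Runnables from hot paths.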
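One way to avoid posting a Runnable per log call is to coalesce lines and post at most one flush task at a time. The sketch below illustrates that idea in plain Java; `BatchingLogger` is a hypothetical class, the `Executor` stands in for `Handler.post`, and nothing here is claimed to be how DingTalk's log module was actually changed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

final class BatchingLogger {
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean flushScheduled = new AtomicBoolean(false);
    private final Executor poster;               // stand-in for Handler.post
    private final Consumer<List<String>> sink;   // where a batch is finally written

    BatchingLogger(Executor poster, Consumer<List<String>> sink) {
        this.poster = poster;
        this.sink = sink;
    }

    void d(String line) {
        pending.add(line);
        // Post at most one flush task until that task has started running.
        if (flushScheduled.compareAndSet(false, true)) {
            poster.execute(this::flush);
        }
    }

    private void flush() {
        // Reset first: lines added from now on will schedule a fresh flush.
        flushScheduled.set(false);
        List<String> batch = new ArrayList<>();
        for (String s; (s = pending.poll()) != null; ) {
            batch.add(s);
        }
        if (!batch.isEmpty()) {
            sink.accept(batch);
        }
    }
}
```

With this shape, a burst of log calls results in a single queued task instead of one per line, at the cost of slightly delayed output.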

1.4 Extreme scenario: Too many emojis in chat box cause freeze

Captured information

{
    "cpuDuration": 4209,
    "messageStr": ">>>>> Dispatching to Handler (com.android.internal.view.IInputConnectionWrapper$MyHandler) {e9507e8} null: 50",
    "threadStackList": [
        {
            "stackTrace": [
                "android.text.DynamicLayout.reflow(DynamicLayout.java:612)",
                "android.text.DynamicLayout$ChangeWatcher.reflow(DynamicLayout.java:1091)",
                "android.text.DynamicLayout$ChangeWatcher.onSpanRemoved(DynamicLayout.java:1116)",
                "android.text.SpannableStringBuilder.sendSpanRemoved(SpannableStringBuilder.java:1296)",
                "android.text.SpannableStringBuilder.removeSpan(SpannableStringBuilder.java:501)",
                "ehk.a(SourceFile:???)",
                "com.alibaba.dingtalk.android.xxx.onTextChanged(SourceFile:???)",
                "android.widget.TextView.handleTextChanged(TextView.java:10722)",
                "android.widget.TextView$ChangeWatcher.onTextChanged(TextView.java:13477)",
                "android.text.SpannableStringBuilder.sendTextChanged(SpannableStringBuilder.java:1267)",
                "android.text.SpannableStringBuilder.replace(SpannableStringBuilder.java:576)",
                "android.view.inputmethod.BaseInputConnection.replaceText(BaseInputConnection.java:869)",
                "android.view.inputmethod.BaseInputConnection.commitText(BaseInputConnection.java:217)",
                "com.android.internal.widget.EditableInputConnection.commitText(EditableInputConnection.java:177)",
                "com.android.internal.view.IInputConnectionWrapper.executeMessage(IInputConnectionWrapper.java:344)",
                "com.android.internal.view.IInputConnectionWrapper$MyHandler.handleMessage(IInputConnectionWrapper.java:89)",
                "android.os.Handler.dispatchMessage(Handler.java:107)",
                "android.os.Looper.loop(Looper.java:237)",
                "android.app.ActivityThread.main(ActivityThread.java:7830)",
                "java.lang.reflect.Method.invoke(Native Method)",
                "com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:492)",
                "com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1040)"
            ],
            "state": "RUNNABLE",
            "wallTime": 3925
        }
    ],
    "type": "HUGE",
    "wallDuration": 4222
}

Interpretation

Two consecutive HUGE tasks each exceed 4 seconds. Stack traces contain ChangeWatcher.onTextChanged and SpannableStringBuilder.removeSpan, indicating heavy emoji processing.

Conclusion

When the input method provides more than 200 emojis, the conversion to DingTalk emojis becomes costly, leading to ANR. Define input limits for user‑generated content to avoid such extreme‑scenario performance issues.
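A minimal sketch of such an input limit follows. It makes two loud assumptions: the emoji check is a crude code-point range test chosen for the example (real emoji detection is more involved), and dropping the excess emojis is only one possible policy (rejecting the input is another). `EmojiLimiter` is a hypothetical name.

```java
final class EmojiLimiter {
    // Assumption: treat code points in this range as emojis. This is a rough
    // approximation for illustration, not a complete emoji definition.
    static boolean isEmoji(int codePoint) {
        return codePoint >= 0x1F300 && codePoint <= 0x1FAFF;
    }

    /** Returns the text with every emoji beyond the first maxEmojis removed. */
    static String limitEmojis(String text, int maxEmojis) {
        StringBuilder out = new StringBuilder(text.length());
        int seen = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp); // emojis usually occupy two chars
            if (isEmoji(cp)) {
                if (seen >= maxEmojis) {
                    continue; // over the limit: drop this emoji
                }
                seen++;
            }
            out.appendCodePoint(cp);
        }
        return out.toString();
    }
}
```

Applying the cap before the text reaches the span-conversion code keeps the cost of each text change bounded.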

1.5 Cross‑process scenario: Reading SharedPreferences across processes is slow

Captured information

{
    "messageStr": ">>>>> Dispatching to Handler (android.app.ActivityThread$H) {2c3df4} android.app.-$Lambda$LoadedApk$ReceiverDispatcher$Args$_BumDX2UKsnxLVrE6UJsJZkotuA@abe477f: 0",
    "threadStackList": [
        {
            "stackTrace": [
                "android.os.BinderProxy.transactNative(Native Method)",
                "android.os.BinderProxy.transact(BinderProxy.java:532)",
                "android.content.ContentProviderProxy.call(ContentProviderNative.java:656)",
                "android.content.ContentResolver.call(ContentResolver.java:2080)",
                "android.content.ContentResolver.call(ContentResolver.java:2060)",
                "ecx.getString(SourceFile:???)",
                "com.alibaba.dingtalk.util.a.a(SourceFile:???)",
                "com.alibaba.dingtalk.android.d$d.onReceive(SourceFile:???)",
                "android.app.LoadedApk$ReceiverDispatcher$Args.lambda$getRunnable$0$LoadedApk$ReceiverDispatcher$Args(LoadedApk.java:1660)",
                "android.app.-$Lambda$LoadedApk$ReceiverDispatcher$Args$_BumDX2UKsnxLVrE6UJsJZkotuA.run(Unknown Source:2)",
                "android.os.Handler.handleCallback(Handler.java:900)",
                "android.os.Handler.dispatchMessage(Handler.java:103)",
                "android.os.Looper.loop(Looper.java:219)",
                "android.app.ActivityThread.main(ActivityThread.java:8349)",
                "java.lang.reflect.Method.invoke(Native Method)",
                "com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:513)",
                "com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1055)"
            ],
            "state": "RUNNABLE",
            "wallTime": 11331
        }
    ],
    "type": "HUGE",
    "wallDuration": 12434
}

Interpretation

The task is HUGE and lasted about 12 434 ms. Multiple stacks show the same cross‑process call chain, indicating the child process had to start the main process to read SharedPreferences via a ContentProvider.

Conclusion

The cross‑process SharedPreferences implementation runs in the main process; when the main process is not alive, the child process must launch it, causing a long delay and ANR. Reduce unnecessary cross‑process operations and keep such calls off the UI thread.
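One common way to keep such a read off the UI thread is to serve the last known value immediately and refresh it in the background. The sketch below shows that pattern in plain Java under stated assumptions: `CachedRemoteValue` is hypothetical, the `Supplier` stands in for the slow cross-process read, and returning a possibly stale value must be acceptable to the caller.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

final class CachedRemoteValue {
    private final Supplier<String> slowRemoteRead; // stand-in for the cross-process call
    private final Executor background;
    private final AtomicBoolean refreshing = new AtomicBoolean(false);
    private volatile String cached;

    CachedRemoteValue(String initial, Supplier<String> slowRemoteRead, Executor background) {
        this.cached = initial;
        this.slowRemoteRead = slowRemoteRead;
        this.background = background;
    }

    /** Returns the last known value without blocking and triggers one background refresh. */
    String get() {
        if (refreshing.compareAndSet(false, true)) {
            background.execute(() -> {
                try {
                    cached = slowRemoteRead.get();
                } finally {
                    refreshing.set(false); // allow the next refresh
                }
            });
        }
        return cached;
    }
}
```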

1.6 Summary

System ANR traces often lack precise information, while ANRCanary adds task duration, thread state, lock details, and historical stacks, enabling faster root‑cause identification.

2. Tool Construction

2.1 Deep‑dive root cause

Building ANRCanary required addressing the shortcomings of native ANR traces, such as poor analyzability and aggregation errors, by monitoring main‑thread historical tasks and filtering false‑positive SIGQUIT signals.
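To make "monitoring main-thread historical tasks" more concrete, here is a small sketch of a bounded task history built from the `>>>>> Dispatching` and `<<<<< Finished` lines that Android's Looper emits to a Printer installed via `Looper.setMessageLogging`. The article does not describe ANRCanary's internals, so this is only an illustration: `LooperTaskRecorder`, its capacity parameter, and the injected clock are assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;

final class LooperTaskRecorder {
    static final class TaskRecord {
        final String message;
        final long wallDurationMs;

        TaskRecord(String message, long wallDurationMs) {
            this.message = message;
            this.wallDurationMs = wallDurationMs;
        }
    }

    private final Deque<TaskRecord> history = new ArrayDeque<>();
    private final int capacity;
    private final LongSupplier clockMs; // injected so the sketch is testable
    private String current;
    private long startMs;

    LooperTaskRecorder(int capacity, LongSupplier clockMs) {
        this.capacity = capacity;
        this.clockMs = clockMs;
    }

    /** Feed every line the Looper prints; pairs start/finish lines into records. */
    void println(String line) {
        if (line.startsWith(">>>>> Dispatching")) {
            current = line;
            startMs = clockMs.getAsLong();
        } else if (line.startsWith("<<<<< Finished") && current != null) {
            if (history.size() == capacity) {
                history.removeFirst(); // keep only the most recent tasks
            }
            history.addLast(new TaskRecord(current, clockMs.getAsLong() - startMs));
            current = null;
        }
    }

    Deque<TaskRecord> history() {
        return history;
    }
}
```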

2.2 High availability

Reliability is critical; extensive unit testing (203 JUnit cases, 86 AndroidJUnit cases) and monitoring metrics (main‑thread message success rate, IdleHandler success rate, deadlock detection rate) ensure the tool’s trustworthiness.

2.3 Usability

Beyond detecting long tasks, ANRCanary captures stack traces, aligns sampling to avoid extra overhead, and provides an attribution summary (e.g., task index, duration, signature) to quickly locate problematic tasks.

// ANR attribution summary
"anrReasonInfo":{
    "extra":{
        "duration":832,
        "index":14,
        "message":">>>>> Dispatching to Handler (elz) {943d152} itl$3@3134c23: 0"
    },
    "signature":"F:huge:elz|itl$*|0",
    "type":"HUGE"
}
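The signature in this summary drops the per-instance hash codes and wildcards the anonymous-class index, which lets reports of the same task aggregate together. The article does not state the derivation rule, so the sketch below is a guess that merely reproduces the sample above: the `F:` prefix, the field order, and the class `AnrSignature` are all assumptions.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AnrSignature {
    // Matches the Looper log line: handler class, {hash}, callback[@hash], what.
    private static final Pattern DISPATCH = Pattern.compile(
            ">>>>> Dispatching to Handler \\((.+?)\\) \\{[0-9a-f]+\\} (.+?)(?:@[0-9a-f]+)?: (-?\\d+)");

    /** Builds an aggregation key that is stable across object instances. */
    static String of(String type, String message) {
        String prefix = "F:" + type.toLowerCase(Locale.ROOT) + ":";
        Matcher m = DISPATCH.matcher(message);
        if (!m.matches()) {
            return prefix + "unknown";
        }
        // Replace anonymous-class indexes such as "$3" with a wildcard "$*".
        String callback = m.group(2).replaceAll("\\$\\d+", "\\$*");
        return prefix + m.group(1) + "|" + callback + "|" + m.group(3);
    }
}
```

Two messages that differ only in their hash codes, such as the one in section 1.1 and the one in this summary, map to the same key.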

2.4 Real‑world validation

Each MVP undergoes gray‑release testing, uncovering issues such as Barrier message leaks, Freeze tasks caused by long SharedPreferences reads, and FakeIdle tasks hidden from LooperPrinter, leading to continuous improvements.

2.5 Summary

Tool development is endless; deep system understanding and user‑centric iteration are essential for valuable, usable tooling.

3. Best Practices

3.1 Observer pattern pitfalls

When observers run on the main thread, avoid heavy IO; choose the notification thread based on whether observers mainly update UI or perform IO.
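One way to make the notification thread an explicit choice is to register each observer together with the executor it should be notified on. The sketch below is a generic illustration of that idea, not code from DingTalk; `ThreadAwareSubject` is a hypothetical name.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

final class ThreadAwareSubject<T> {
    private static final class Entry<T> {
        final Consumer<T> observer;
        final Executor executor;

        Entry(Consumer<T> observer, Executor executor) {
            this.observer = observer;
            this.executor = executor;
        }
    }

    private final List<Entry<T>> entries = new CopyOnWriteArrayList<>();

    /** The caller decides where the observer runs: a UI executor or an IO executor. */
    void register(Consumer<T> observer, Executor executor) {
        entries.add(new Entry<>(observer, executor));
    }

    /** Dispatches the event to each observer on the executor it registered with. */
    void publish(T event) {
        for (Entry<T> entry : entries) {
            entry.executor.execute(() -> entry.observer.accept(event));
        }
    }
}
```

A UI observer can register with an executor that posts to the main thread, while an observer that performs IO registers with a background pool, so heavy work never rides along on a UI notification.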

3.2 Dense scenarios

Be aware of high‑frequency code paths (UI drawing, touch events, list scrolling, logging) and keep heavy CPU/IO work off the main thread.

3.3 Cross‑process scenarios

Minimize cross‑process dependencies, keep long‑lived processes as services, and never perform cross‑process calls on the UI thread.

3.4 Extreme scenarios

Set limits on user‑generated content (e.g., max 100 emojis, max 100 lines of chat) and avoid unbounded SharedPreferences growth.

3.5 StrictMode

Enable StrictMode to catch main‑thread disk and network access; use StrictMode.noteSlowCall for custom slow‑function warnings.

3.6 Summary

Performance issues often stem from overlooked code paths; adhering to these practices helps developers avoid many ANR‑inducing bugs.

4. Conclusion

ANRCanary enriches ANR reports with detailed task timing, stack traces, thread states, and lock information, turning previously opaque ANRs into traceable problems. Ongoing effort is needed to handle the myriad ways the main thread can be blocked.

References

MessageQueue.java: https://android.googlesource.com/platform/frameworks/base/+/master/core/java/android/os/MessageQueue.java
