
How to Accurately Detect UI Lag and ANR on Android: Advanced Monitoring Techniques

This article explains the relationship between UI stutter and ANR, critiques common monitoring tools, and presents three robust Android lag‑detection methods—WatchDog polling, Looper Printer replacement, and specialized handlers for IdleHandler, TouchEvent, and SyncBarrier leaks—complete with probability analysis and sample code.

WeChat Client Technology Team

1. WatchDog

This scheme starts a child thread that polls the UI thread: at a fixed interval it posts a Message to the main thread and, one interval later, checks whether that Message has been processed. If it has not, the main thread is considered blocked. The approach is simple and universal, and it applies equally to iOS and other client platforms.
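The polling loop described above can be sketched on a plain JVM. This is a minimal model, not the article's implementation: a single-threaded executor stands in for the UI thread, and the class and method names (`WatchdogSketch`, `pollOnce`, `demo`) are hypothetical. On Android the probe would instead be posted through a Handler bound to `Looper.getMainLooper()`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

/** Plain-JVM model of WatchDog polling; the executor stands in for the UI thread. */
final class WatchdogSketch {
    private final ExecutorService uiThread; // assumption: stand-in for the main Looper
    private final long intervalMs;

    WatchdogSketch(ExecutorService uiThread, long intervalMs) {
        this.uiThread = uiThread;
        this.intervalMs = intervalMs;
    }

    /** Posts one probe, waits one interval, and returns true if the probe was processed. */
    boolean pollOnce() throws InterruptedException {
        AtomicBoolean handled = new AtomicBoolean(false);
        uiThread.execute(() -> handled.set(true)); // the "Message"
        Thread.sleep(intervalMs);
        return handled.get();
    }

    /** Demo helper: returns {processed during a stall, processed after the stall}. */
    static boolean[] demo() {
        ExecutorService ui = Executors.newSingleThreadExecutor();
        WatchdogSketch watchdog = new WatchdogSketch(ui, 100);
        try {
            ui.execute(() -> sleepQuietly(600));       // simulated 600 ms stall
            boolean duringStall = watchdog.pollOnce(); // probe is queued behind the stall
            Thread.sleep(800);                         // let the stall finish
            boolean afterStall = watchdog.pollOnce();
            return new boolean[] {duringStall, afterStall};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            ui.shutdownNow();
        }
    }

    private static void sleepQuietly(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A production watchdog would run `pollOnce()` in a loop on its own thread and capture the main thread's stack trace when a probe goes unprocessed.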

Advantages: Simple, stable, result‑oriented; can detect various types of stutter.

Disadvantages: Polling is inelegant, can produce false positives and random missed detections; the polling interval trade‑off is critical—short intervals hurt performance, long intervals increase miss rates.

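For example, with a 4.5-second interval, the probability of detecting a 5-second stall is only about 11 %: a probe is reported as missed only if it is both posted and checked while the stall is in progress, which leaves a 0.5-second window in each 4.5-second cycle (0.5 / 4.5 ≈ 11 %). Detection is guaranteed only when the stall lasts at least twice the interval, so the interval must be no more than half the shortest stall you need to catch.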

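When the interval is shortened and several consecutive missed Messages are required before a stall is reported (for example, three consecutive 2-second intervals without processing), the detection probability follows from the same reasoning: every one of the required probes must be posted and checked while the stall is still in progress.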
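The probability reasoning above can be written as a small formula. The sketch below rests on two assumptions that are not spelled out in the article: the stall starts at a uniformly random point in the polling cycle, and a probe counts as missed only if the stall covers the whole span from posting to checking. Under that model, a stall of duration D with interval T and n required consecutive misses is detected with probability clamp((D - n·T) / T, 0, 1). The class name `WatchdogMath` is hypothetical.

```java
/** Detection probability for interval polling, under the model stated in the lead-in. */
final class WatchdogMath {
    private WatchdogMath() {}

    /**
     * @param stallSec       stall duration D, in seconds
     * @param intervalSec    polling interval T, in seconds
     * @param requiredMisses consecutive unprocessed probes n needed before reporting
     * @return clamp((D - n * T) / T, 0, 1)
     */
    static double detectionProbability(double stallSec, double intervalSec, int requiredMisses) {
        double window = stallSec - requiredMisses * intervalSec;
        return Math.max(0.0, Math.min(1.0, window / intervalSec));
    }
}
```

With these assumptions, `detectionProbability(5.0, 4.5, 1)` gives roughly 0.11, matching the 11 % figure. For a 2-second interval with three required misses, a stall shorter than 6 seconds is never reported, a 7-second stall is reported half the time, and a stall of 8 seconds or more is always reported.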


2. Looper Printer

This method replaces the main thread Looper’s Printer to monitor the execution time of dispatchMessage. It is widely used in production (e.g., WeChat’s Matrix) and has been stable for a long time.

Advantages: No random missed detections, no polling required, one‑time setup.

Disadvantages: Certain types of stutter cannot be captured.

The core idea is to record a timestamp in a custom Printer when the "Dispatching" line is logged, record another when the "Finished" line is logged, and treat the difference as the duration of dispatchMessage. However, this method cannot monitor stalls that occur inside MessageQueue.next() (which may block) or in TouchEvent handling, because those run outside the dispatchMessage call that the two log lines bracket.

```java
// Simplified excerpt of Looper.loop()
for (;;) {
    Message msg = queue.next(); // might block
    final Printer logging = me.mLogging;
    if (logging != null) {
        // Logged immediately before the message is dispatched
        logging.println(">>>>> Dispatching to " + msg.target + " " + msg.callback + ": " + msg.what);
    }
    msg.target.dispatchMessage(msg);
    if (logging != null) {
        // Logged immediately after dispatch returns
        logging.println("<<<<< Finished to " + msg.target + " " + msg.callback);
    }
}
```
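A custom printer that times each dispatch from those two log lines might look like the sketch below. It is kept framework-free so it runs off-device: on Android the class would implement `android.util.Printer` and be installed with `Looper.getMainLooper().setMessageLogging(...)`. The names `DispatchTimer` and `SlowDispatchListener`, the injected clock, and the threshold are all assumptions made for illustration.

```java
import java.util.function.LongSupplier;

/** Times each dispatch from the two lines that Looper.loop() prints. */
final class DispatchTimer {
    /** Hypothetical callback for reporting slow dispatches. */
    interface SlowDispatchListener {
        void onSlowDispatch(String startLine, long durationMs);
    }

    private final LongSupplier clockMs;   // e.g. SystemClock::uptimeMillis on Android
    private final long thresholdMs;
    private final SlowDispatchListener listener;
    private String startLine;             // non-null while a dispatch is in progress
    private long startMs;

    DispatchTimer(LongSupplier clockMs, long thresholdMs, SlowDispatchListener listener) {
        this.clockMs = clockMs;
        this.thresholdMs = thresholdMs;
        this.listener = listener;
    }

    /** Same signature as android.util.Printer#println. */
    public void println(String line) {
        if (line.startsWith(">")) {                             // ">>>>> Dispatching to ..."
            startLine = line;
            startMs = clockMs.getAsLong();
        } else if (line.startsWith("<") && startLine != null) { // "<<<<< Finished to ..."
            long durationMs = clockMs.getAsLong() - startMs;
            if (durationMs >= thresholdMs) {
                listener.onSlowDispatch(startLine, durationMs);
            }
            startLine = null;
        }
    }
}
```

Because the start line names the target Handler and callback, passing it to the listener gives a first hint about which message was slow.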

3. Complete Lag Monitoring Scheme

3.1 Monitoring IdleHandler Stutter

By using reflection to replace the private MessageQueue.mIdleHandlers list with a custom MyArrayList that wraps each handler as it is added, we can intercept every IdleHandler.queueIdle() call and measure its execution time.

```java
import android.os.Looper;
import android.os.MessageQueue;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

private static void detectIdleHandler() {
    try {
        MessageQueue mainQueue = Looper.getMainLooper().getQueue();
        Field field = MessageQueue.class.getDeclaredField("mIdleHandlers");
        field.setAccessible(true);
        // Swap in a list that wraps every handler added from now on.
        MyArrayList myIdleHandlerArrayList = new MyArrayList();
        field.set(mainQueue, myIdleHandlerArrayList);
    } catch (Throwable t) {
        t.printStackTrace();
    }
}

// MyIdleHandler (not shown) implements MessageQueue.IdleHandler, keeps the original
// handler in its idleHandler field, and times the delegated queueIdle() call.
static class MyArrayList extends ArrayList<MessageQueue.IdleHandler> {
    // Original handler -> wrapper, so removing by the original handler still works.
    private final Map<MessageQueue.IdleHandler, MyIdleHandler> map = new HashMap<>();

    @Override
    public boolean add(MessageQueue.IdleHandler handler) {
        MyIdleHandler myIdleHandler = new MyIdleHandler(handler);
        map.put(handler, myIdleHandler);
        return super.add(myIdleHandler);
    }

    @Override
    public boolean remove(Object o) {
        if (o instanceof MyIdleHandler) {
            // Called with the wrapper, e.g. by MessageQueue after queueIdle() returned false.
            MessageQueue.IdleHandler idleHandler = ((MyIdleHandler) o).idleHandler;
            map.remove(idleHandler);
            return super.remove(o);
        }
        // Called with the original handler: remove its wrapper instead.
        MyIdleHandler myIdleHandler = map.remove(o);
        if (myIdleHandler != null) {
            return super.remove(myIdleHandler);
        }
        return super.remove(o);
    }
}
```
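The article does not show the MyIdleHandler wrapper itself. The sketch below is one way such a wrapper could delegate to the original handler and time the call. It is an assumption-laden stand-in, not the authors' class: `IdleHandler` here is a local interface mirroring `android.os.MessageQueue.IdleHandler` so the code compiles off-device, and `TimedIdleHandler`, the injected clock, and the threshold are hypothetical.

```java
import java.util.function.LongSupplier;

/** Local stand-in for android.os.MessageQueue.IdleHandler (assumption, for off-device use). */
interface IdleHandler {
    boolean queueIdle();
}

/** Hypothetical wrapper: delegates to the original handler and times queueIdle(). */
final class TimedIdleHandler implements IdleHandler {
    final IdleHandler idleHandler;        // the wrapped original
    private final LongSupplier clockMs;   // e.g. SystemClock::uptimeMillis on Android
    private final long thresholdMs;
    private long lastSlowDurationMs = -1; // -1 means no slow run has been seen

    TimedIdleHandler(IdleHandler idleHandler, LongSupplier clockMs, long thresholdMs) {
        this.idleHandler = idleHandler;
        this.clockMs = clockMs;
        this.thresholdMs = thresholdMs;
    }

    @Override
    public boolean queueIdle() {
        long startMs = clockMs.getAsLong();
        try {
            // Preserve the original return value so keep/remove semantics are unchanged.
            return idleHandler.queueIdle();
        } finally {
            long durationMs = clockMs.getAsLong() - startMs;
            if (durationMs >= thresholdMs) {
                lastSlowDurationMs = durationMs;
            }
        }
    }

    /** Duration of the most recent slow run in ms, or -1 if none was recorded. */
    long lastSlowDurationMs() {
        return lastSlowDurationMs;
    }
}
```

Measuring in a `finally` block means a handler that throws is still timed. A real monitor would report the slow run instead of only storing the duration.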

3.2 Monitoring TouchEvent Stutter

Touch events travel from the InputDispatcher (server) to the client UI thread via a socket. By PLT‑hooking the native recvfrom and sendto functions in libinput.so, we can measure the time between receiving a Touch event and its consumption; a large gap indicates a Touch‑event‑related stall.
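The native hooks themselves are outside the scope of a short example, but the Java-side bookkeeping they feed can be sketched. Everything here is hypothetical: the class `TouchLagTracker`, its method names, and the assumption that the hooked recvfrom marks the arrival of an event while the hooked sendto marks its consumption.

```java
/**
 * Hypothetical bookkeeping fed by the native hooks (all names are illustrative):
 * onEventReceived() when the hooked recvfrom returns input data,
 * onEventFinished() when the hooked sendto reports the event as consumed.
 */
final class TouchLagTracker {
    private final long thresholdMs;
    private long receivedAtMs = -1; // -1 means no event is pending

    TouchLagTracker(long thresholdMs) {
        this.thresholdMs = thresholdMs;
    }

    synchronized void onEventReceived(long nowMs) {
        if (receivedAtMs < 0) {
            receivedAtMs = nowMs; // keep the oldest pending event
        }
    }

    /** Returns the handling time in ms, or -1 if no event was pending. */
    synchronized long onEventFinished(long nowMs) {
        if (receivedAtMs < 0) {
            return -1;
        }
        long gapMs = nowMs - receivedAtMs;
        receivedAtMs = -1;
        return gapMs;
    }

    /** Lets a background checker spot a stall while the event is still unconsumed. */
    synchronized boolean isStalled(long nowMs) {
        return receivedAtMs >= 0 && nowMs - receivedAtMs >= thresholdMs;
    }
}
```

Checking `isStalled()` from a background thread matters because a blocked UI thread may never reach the point where the event is reported as consumed.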

3.3 Monitoring SyncBarrier Leaks

SyncBarrier leaks can be detected by periodically inspecting the head of the main Looper's queue, MessageQueue.mMessages. A SyncBarrier appears as a Message with a null target, and one that has sat at the head long past its due time is suspicious. To confirm, post one synchronous and one asynchronous message: if only the asynchronous one gets processed, the barrier has leaked, and it can optionally be removed via reflection.

```java
import android.os.Looper;
import android.os.Message;
import android.os.MessageQueue;
import android.os.SystemClock;
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Periodic check: inspect the head of the main MessageQueue.
// (Run inside a method that handles the reflection exceptions.)
MessageQueue mainQueue = Looper.getMainLooper().getQueue();
Field field = mainQueue.getClass().getDeclaredField("mMessages");
field.setAccessible(true);
Message mMessage = (Message) field.get(mainQueue);
if (mMessage != null) {
    long when = mMessage.getWhen() - SystemClock.uptimeMillis();
    // A null target marks a SyncBarrier; this one has been due for more than 3 seconds.
    if (when < -3000 && mMessage.getTarget() == null) {
        int token = mMessage.arg1; // the barrier token is stored in arg1
        startCheckLeaking(token);
    }
}

// getSyncBarrierToken(), DetectSyncBarrierOnce() and CHECK_STRICTLY_MAX_COUNT are
// defined elsewhere and not shown here.
private static void startCheckLeaking(int token) {
    int checkCount = 0;
    while (checkCount < CHECK_STRICTLY_MAX_COUNT) {
        checkCount++;
        int latestToken = getSyncBarrierToken();
        if (token != latestToken) break; // the barrier at the head has changed
        if (DetectSyncBarrierOnce()) {
            try {
                removeSyncBarrier(token);
            } catch (Exception e) {
                e.printStackTrace();
            }
            break;
        }
        try { Thread.sleep(1000); } catch (InterruptedException e) { e.printStackTrace(); }
    }
}

private static void removeSyncBarrier(int token) throws Exception {
    MessageQueue mainQueue = Looper.getMainLooper().getQueue();
    Method method = mainQueue.getClass().getDeclaredMethod("removeSyncBarrier", int.class);
    method.setAccessible(true);
    method.invoke(mainQueue, token);
}
```
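The sync/async probe is not shown in the code above, so the toy model below illustrates the inference it relies on. `MiniQueue` is a hypothetical, heavily simplified stand-in for MessageQueue barrier semantics, not the Android class. On a device the probe would post two real messages to a main-thread Handler, mark one with `Message.setAsynchronous(true)`, and check after a delay which of them ran.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of MessageQueue barrier semantics (assumption: greatly simplified). */
final class MiniQueue {
    static final class Msg {
        final Runnable target; // null marks a sync barrier, as in MessageQueue
        final boolean async;

        Msg(Runnable target, boolean async) {
            this.target = target;
            this.async = async;
        }
    }

    private final List<Msg> messages = new ArrayList<>();

    void post(Runnable r, boolean async) {
        messages.add(new Msg(r, async));
    }

    void postSyncBarrier() {
        messages.add(0, new Msg(null, false));
    }

    /** Runs every message that is currently allowed to run. */
    void drain() {
        for (;;) {
            boolean blocked = !messages.isEmpty() && messages.get(0).target == null;
            Msg next = null;
            for (Msg m : messages) {
                if (m.target == null) continue;    // skip the barrier itself
                if (blocked && !m.async) continue; // a barrier holds back sync messages
                next = m;
                break;
            }
            if (next == null) return;
            messages.remove(next);
            next.target.run();
        }
    }

    /** The probe: post one sync and one async message, then see which one ran. */
    boolean probeDetectsLeak() {
        boolean[] ran = new boolean[2];
        post(() -> ran[0] = true, false); // synchronous probe
        post(() -> ran[1] = true, true);  // asynchronous probe
        drain();
        return ran[1] && !ran[0];         // only async got through: barrier is stuck
    }
}
```

Requiring that the asynchronous probe did run is what separates a leaked barrier from a main thread that is merely busy, in which case neither probe would be processed.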

These three extensions—IdleHandler timing, TouchEvent socket hooking, and SyncBarrier leak detection—complement the Looper Printer approach, forming a comprehensive Android UI lag monitoring solution.
