Building Offline Mobile Performance Monitoring with AWACS and APM

This article explains how Youzan extended its APM framework with offline monitoring, built the AWACS visual tool, integrated Appium‑driven regression, instrumented method timing and network traffic via Gradle plugins, captured page rendering time, processed data in the backend, and created an issue‑management platform, outlining future enhancements.

Youzan Coder

Feb 24, 2021

Introduction

As mobile business logic grows more complex, developers can easily overlook performance problems. The existing Youzan APM only collects online data, so offline monitoring is required during QA and development to catch regressions before release.

Architecture Design

The offline monitoring capability extends the APM framework. A visual tool named AWACS adds a global floating window that shows real‑time alerts (pop‑ups and Toast). QA integrates Appium UI‑flow test cases, and an automated regression suite runs the same flow on a fixed device for each app version, enabling stage‑level performance comparison. Detected issues are pushed to a WeChat robot and recorded in an mPaaS‑based issue‑management board.

Monitoring Metrics Analysis

Stage Data

Each business flow (e.g., app launch, add‑to‑cart) is treated as a "stage". Two dimensions are collected: method‑duration and network‑status.

Method‑duration analysis

A custom Gradle plugin instruments bytecode during the Transform stage. Using ASM, it inserts calls to MethodBeat.i at method entry and MethodBeat.o at exit, automatically measuring execution time for every method.
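
To make the effect of the instrumentation concrete, here is a minimal sketch of what an instrumented method and a timing recorder could look like at source level. `MethodBeatSketch`, `CartService`, and the integer method id are hypothetical stand-ins, not Youzan's actual classes; a real plugin inserts the exit call in bytecode before each return or throw instruction, which the `try`/`finally` below only approximates.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for MethodBeat: records how long each instrumented method runs. */
final class MethodBeatSketch {
    // Per-thread stack of entry timestamps, so nested and recursive calls pair up correctly.
    private static final ThreadLocal<Deque<Long>> STARTS =
            ThreadLocal.withInitial(ArrayDeque::new);
    // methodId -> {call count, total nanoseconds}
    private static final Map<Integer, long[]> STATS = new ConcurrentHashMap<>();

    /** Called at method entry. */
    static void i(int methodId) {
        STARTS.get().push(System.nanoTime());
    }

    /** Called at method exit. */
    static void o(int methodId) {
        Deque<Long> stack = STARTS.get();
        if (stack.isEmpty()) {
            return; // unmatched exit: ignore rather than crash the app
        }
        long elapsed = System.nanoTime() - stack.pop();
        STATS.compute(methodId, (id, stats) -> {
            long[] s = (stats == null) ? new long[2] : stats;
            s[0] += 1;
            s[1] += elapsed;
            return s;
        });
    }

    static long callCount(int methodId) {
        long[] s = STATS.get(methodId);
        return s == null ? 0 : s[0];
    }

    static long totalNanos(int methodId) {
        long[] s = STATS.get(methodId);
        return s == null ? 0 : s[1];
    }
}

/** What a method might look like after instrumentation, written out by hand. */
final class CartService {
    static final int ADD_TO_CART_ID = 1; // assumed: the plugin assigns one id per method

    static int addToCart(int quantity) {
        MethodBeatSketch.i(ADD_TO_CART_ID);
        try {
            return quantity * 2; // placeholder business logic
        } finally {
            MethodBeatSketch.o(ADD_TO_CART_ID);
        }
    }
}
```

Keeping the entry timestamps on a per-thread stack lets the exit call find its matching entry without passing any state through the instrumented method itself.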

All stage data from every app version is uploaded to a backend service. A scheduled task classifies changes into four categories (new, reduced, sharp increase, sharp decrease) to highlight abnormal regressions, for example in the launch stage.

Network‑status analysis

A custom OkHttp interceptor records request count and latency per stage. It captures start and end timestamps around each request and computes the request duration.

public Response intercept(Chain chain) throws IOException {
    Request request = chain.request();
    String url = getRequestUrl(request.url());
    if (TextUtils.isEmpty(url)) {
        // Nothing to attribute the timing to; pass the request through untouched.
        return chain.proceed(request);
    }
    AppSegmentCache.INSTANCE.setRequestStart(url);
    long startNs = System.nanoTime();
    // Proceed exactly once. A failed request propagates its exception and is not timed.
    Response response = chain.proceed(request);
    long tookMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNs);
    AppSegmentCache.INSTANCE.setRequestEnd(url, tookMs);
    return response;
}

Two statistical dimensions are produced: "total calls" (compared with the previous version to compute trend percentages) and "repeated calls" (listing duplicate URLs for optimization).
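
As a rough illustration of these two dimensions, the sketch below computes a trend percentage from two totals and lists URLs requested more than once within a stage. The class name `RequestStats`, its method names, and the sample URLs are assumptions made for this example, not part of the original tooling.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RequestStats {
    /** Percentage change in total calls versus the previous version (positive = more calls). */
    static double trendPercent(int previousTotal, int currentTotal) {
        if (previousTotal <= 0) {
            throw new IllegalArgumentException("previous total must be positive");
        }
        return (currentTotal - previousTotal) * 100.0 / previousTotal;
    }

    /** URLs requested more than once in a stage, with their call counts, in first-seen order. */
    static Map<String, Integer> repeatedCalls(List<String> urls) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String url : urls) {
            counts.merge(url, 1, Integer::sum);
        }
        // Keep only the duplicates; single calls are not optimization candidates.
        counts.values().removeIf(count -> count < 2);
        return counts;
    }
}
```

For instance, a stage that issued four requests in the previous version and five in the current one has a trend of +25%, and a stage that requests `/cart`, `/goods`, `/cart` reports `/cart` as called twice.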

Traffic

Both HttpURLConnection and OkHttp are hooked during the .class‑to‑.dex transform. Every request passes through the hook, which measures the request size and the response length (decompressing gzip bodies first) and writes the raw traffic to a file for offline analysis.

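internal object OkHttpHook {
    @JvmField
    public val globalNetworkInterceptor = Interceptor { chain ->
        val response = chain.proceed(chain.request())
        // calculate response length
        // read response content (decompress if gzip)
        // write traffic info to file; `file` (output directory) and `netPackInfo` come from code omitted here
        // The file name is "<timestamp>-<url>", URL-encoded so it is safe to use on the file system.
        val fileUrl = File(file, URLEncoder.encode(
            SimpleDateFormat("yyyy-MM-dd-HH:mm:ss-SSS").format(Date()) + "-" + netPackInfo.url, "UTF-8"))
        fileUrl.writeText(netPackInfo.toString())
        // ...
        response
    }
}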

public object HttpUrlConnectHook {
    // Returns an OkHttp-backed connection, or the original connection if the hook fails.
    @JvmStatic
    fun proxy(httpUrlConnection: URLConnection): URLConnection {
        return try {
            hookOkHttpURLConnection(httpUrlConnection)
        } catch (e: Exception) {
            e.printStackTrace()
            httpUrlConnection
        }
    }

    @Throws(Exception::class)
    private fun hookOkHttpURLConnection(httpUrlConnection: URLConnection): URLConnection {
        val client = OkHttpClient.Builder().retryOnConnectionFailure(true).build()
        val url = httpUrlConnection.url
        // Only http/https connections are rerouted through OkHttp.
        return if (url.protocol.startsWith("http", ignoreCase = true)) {
            HttpUrlFactory.OkHttpURLConnection(url, client)
        } else {
            httpUrlConnection
        }
    }
}
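
The gzip-aware length measurement mentioned in this section can be sketched with the JDK alone. This is an illustrative example rather than the hook's actual code: `TrafficMeter` and its methods are hypothetical names, and the body is assumed to be fully available as a byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class TrafficMeter {
    /** Length of the body after decoding; gzip bodies are decompressed before counting. */
    static long decodedLength(byte[] body, String contentEncoding) {
        try {
            InputStream in = new ByteArrayInputStream(body);
            if ("gzip".equalsIgnoreCase(contentEncoding)) {
                in = new GZIPInputStream(in);
            }
            long total = 0;
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                total += read;
            }
            in.close();
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Helper used only to produce sample input for the example. */
    static byte[] gzip(byte[] raw) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
                gz.write(raw);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Reporting both the on-the-wire size (`body.length`) and the decoded size makes it possible to tell large payloads apart from poorly compressed ones.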

Page‑time

For B‑end cash‑register devices, UI smoothness is critical. The system takes the Activity or Fragment onCreate callback as the start point and the first onDraw callback as the end point, and computes the rendering time between the two.

public void watchActivity(Activity activity) {
    // Watch the Activity's root view for its first draw.
    watchWithMonitorView(activity.getClass().getName(), activity.getWindow().getDecorView());
    if (activity instanceof FragmentActivity) {
        // Also watch every Fragment view; `true` registers the callback recursively for child FragmentManagers.
        ((FragmentActivity) activity).getSupportFragmentManager()
            .registerFragmentLifecycleCallbacks(new FragmentManager.FragmentLifecycleCallbacks() {
                @Override
                public void onFragmentViewCreated(FragmentManager fm, Fragment f, View v, Bundle savedInstanceState) {
                    watchWithMonitorView(f.getClass().getName(), v);
                }
            }, true);
    }
}
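
The start and end bookkeeping behind this measurement can be sketched independently of Android. `PageRenderTimer` is a hypothetical helper, and timestamps are passed in explicitly so the example stays deterministic; in the app they would come from the lifecycle and draw callbacks.

```java
import java.util.HashMap;
import java.util.Map;

final class PageRenderTimer {
    // Pages that have been created but not yet drawn, keyed by class name.
    private final Map<String, Long> createdAtMs = new HashMap<>();

    /** Start point: the page's onCreate. */
    void onCreate(String page, long nowMs) {
        createdAtMs.put(page, nowMs);
    }

    /**
     * End point: the first draw after onCreate.
     * Returns the rendering time in milliseconds, or -1 for any later draw.
     */
    long onDraw(String page, long nowMs) {
        Long start = createdAtMs.remove(page);
        if (start == null) {
            return -1; // already measured, or onCreate was never seen
        }
        return nowMs - start;
    }
}
```

Removing the entry on the first draw is what restricts the measurement to the initial render; subsequent frames are ignored.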

During automated regression, a page is flagged as a valid slowdown only if its rendering time exceeds 200 ms in at least three runs, reducing false positives caused by hardware variance.
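
The validation rule above is simple enough to state in a few lines. The following sketch uses the 200 ms threshold and three-run minimum from the text; the class and method names are made up for illustration.

```java
final class SlowPageRule {
    static final long THRESHOLD_MS = 200; // rendering time considered slow, from the article
    static final int MIN_SLOW_RUNS = 3;   // slow runs required before flagging, from the article

    /** True if the page rendered slower than the threshold in at least MIN_SLOW_RUNS runs. */
    static boolean isValidSlowdown(long[] renderTimesMs) {
        int slowRuns = 0;
        for (long timeMs : renderTimesMs) {
            if (timeMs > THRESHOLD_MS) {
                slowRuns++;
            }
        }
        return slowRuns >= MIN_SLOW_RUNS;
    }
}
```

With this rule, runs of 250, 180, 300, and 210 ms flag the page (three slow runs), while 250, 180, and 190 ms do not.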

Backend Issue Analysis

Performance data collected from each regression run is aggregated nightly. The backend computes average stage duration for the current and previous versions, compares them, and flags deviations that exceed a configurable threshold as actionable issues. These issues are stored in a database and exposed via the mPaaS UI.
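
A minimal sketch of this comparison, under stated assumptions: durations are in milliseconds, the threshold is a relative change (0.2 means 20%), and deviations in either direction are flagged. `StageRegression` and its methods are hypothetical names, and the real backend may use different statistics.

```java
final class StageRegression {
    /** Average stage duration over all regression runs of one app version. */
    static double average(long[] durationsMs) {
        if (durationsMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long sum = 0;
        for (long d : durationsMs) {
            sum += d;
        }
        return (double) sum / durationsMs.length;
    }

    /** Relative change of the current average versus the previous one, e.g. 0.25 = 25% slower. */
    static double relativeChange(long[] previousMs, long[] currentMs) {
        double previous = average(previousMs);
        if (previous == 0) {
            throw new IllegalArgumentException("previous average is zero");
        }
        return (average(currentMs) - previous) / previous;
    }

    /** Assumption: a deviation in either direction beyond the threshold is worth a look. */
    static boolean isActionable(long[] previousMs, long[] currentMs, double threshold) {
        return Math.abs(relativeChange(previousMs, currentMs)) > threshold;
    }
}
```

For example, if the launch stage averaged 400 ms in the previous version and 500 ms in the current one, the relative change is 0.25, which is flagged at a threshold of 0.2 but not at 0.3.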

Offline AWACS Tool

In QA and development builds, a floating alert icon appears on the app. Tapping the icon opens the performance monitoring center, which displays stage metrics, ANR events, slow‑method traces, traffic statistics, and FPS data for rapid diagnosis.

Issue Management and Assignment Platform

Validated issues are listed on a board where users can filter by metric, app, status, and environment. Each entry shows occurrence count, device information, and allows status changes or assignment to owners.

Future Plans

Monitor additional dimensions such as CPU usage, thread activity, and UI updates from background threads.

Expand automated test cases to cover more performance scenarios beyond the main UI flow.

Increase the variety of test devices to enable multi‑model performance analysis.

Roll out the solution to other Youzan applications.
