Building Offline Mobile Performance Monitoring with AWACS and APM
This article explains how Youzan extended its APM framework with offline monitoring, built the AWACS visual tool, integrated Appium‑driven regression, instrumented method timing and network traffic via Gradle plugins, captured page rendering time, processed data in the backend, and created an issue‑management platform, outlining future enhancements.
Introduction
As mobile business logic grows more complex, performance problems become easy to overlook. The existing Youzan APM collects only online data, so offline monitoring is needed during development and QA to catch regressions before release.
Architecture Design
The offline monitoring capability extends the APM framework. A visual tool named AWACS adds a global floating window that shows real‑time alerts (pop‑ups and Toast). QA integrates Appium UI‑flow test cases, and an automated regression suite runs the same flow on a fixed device for each app version, enabling stage‑level performance comparison. Detected issues are pushed to a WeChat robot and recorded in an mPaaS‑based issue‑management board.
Monitoring Metrics Analysis
Stage Data
Each business flow (e.g., app launch, add‑to‑cart) is treated as a "stage". Two dimensions are collected: method‑duration and network‑status.
Method‑duration analysis
A custom Gradle plugin instruments bytecode during the Transform stage. Using ASM, it inserts calls to MethodBeat.i at method entry and MethodBeat.o at exit, automatically measuring execution time for every method.
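To make the effect of the instrumentation concrete, here is a minimal plain-Java sketch of what a MethodBeat-style recorder and an instrumented method could look like. Only the names MethodBeat.i and MethodBeat.o come from the article; the integer method id, the Sample type, snapshot(), and CartService are hypothetical, and the try/finally is a readability shortcut rather than the exact bytecode an ASM transform would emit.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for the MethodBeat class named in the article.
final class MethodBeat {
    // One completed measurement: which method ran and for how long.
    static final class Sample {
        final int methodId;
        final long durationNs;

        Sample(int methodId, long durationNs) {
            this.methodId = methodId;
            this.durationNs = durationNs;
        }
    }

    // Per-thread stack of start timestamps so nested calls pair up correctly.
    private static final ThreadLocal<Deque<Long>> STARTS =
            ThreadLocal.withInitial(ArrayDeque::new);
    private static final List<Sample> SAMPLES = new ArrayList<>();

    // Called at method entry: remember when the method started.
    static void i(int methodId) {
        STARTS.get().push(System.nanoTime());
    }

    // Called at method exit: pair with the latest entry and record the duration.
    static void o(int methodId) {
        Long start = STARTS.get().poll();
        if (start == null) {
            return; // unmatched exit, nothing to measure
        }
        synchronized (SAMPLES) {
            SAMPLES.add(new Sample(methodId, System.nanoTime() - start));
        }
    }

    // Copy of everything recorded so far.
    static List<Sample> snapshot() {
        synchronized (SAMPLES) {
            return new ArrayList<>(SAMPLES);
        }
    }
}

// Roughly what a business method behaves like after instrumentation.
final class CartService {
    static final int ADD_TO_CART_ID = 1; // assumed: ids are assigned at build time

    static int addToCart(int quantity) {
        MethodBeat.i(ADD_TO_CART_ID);
        try {
            return quantity + 1; // stands in for the original method body
        } finally {
            MethodBeat.o(ADD_TO_CART_ID);
        }
    }
}
```

Keeping the start timestamps on a per-thread stack lets nested and recursive calls be measured without the instrumented code passing any state between entry and exit.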
All stage data from every app version are uploaded to a backend service. A scheduled task classifies changes into four categories (new, reduced, sharp increase, sharp decrease) to highlight abnormal regressions, for example in the launch stage.
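The four-way classification can be sketched as a comparison of two per-method duration maps. This is an illustration under assumptions, not the backend's actual logic: "reduced" is read here as a method present in the previous version but absent from the current one, the 50% cut-off for "sharp" is invented, and the extra STABLE bucket exists only so that every method receives a label.

```java
import java.util.Map;
import java.util.TreeMap;

final class StageDiff {
    enum Change { NEW, REDUCED, SHARP_INCREASE, SHARP_DECREASE, STABLE }

    // Assumed threshold: a relative change of 50% or more counts as "sharp".
    static final double SHARP_RATIO = 0.5;

    // Labels every method seen in either version with its change category.
    static Map<String, Change> classify(Map<String, Long> previousMs,
                                        Map<String, Long> currentMs) {
        Map<String, Change> result = new TreeMap<>();
        for (Map.Entry<String, Long> entry : currentMs.entrySet()) {
            Long before = previousMs.get(entry.getKey());
            if (before == null) {
                result.put(entry.getKey(), Change.NEW);
            } else {
                result.put(entry.getKey(), compare(before, entry.getValue()));
            }
        }
        for (String name : previousMs.keySet()) {
            if (!currentMs.containsKey(name)) {
                result.put(name, Change.REDUCED);
            }
        }
        return result;
    }

    private static Change compare(long before, long after) {
        if (before <= 0) {
            return Change.STABLE; // no usable baseline to compare against
        }
        double ratio = (after - before) / (double) before;
        if (ratio >= SHARP_RATIO) {
            return Change.SHARP_INCREASE;
        }
        if (ratio <= -SHARP_RATIO) {
            return Change.SHARP_DECREASE;
        }
        return Change.STABLE;
    }
}
```

A relative threshold is used rather than an absolute one so that a 20 ms method doubling in cost is treated the same way as a 200 ms method doubling.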
Network‑status analysis
A custom OkHttp interceptor records request count and latency per stage. It captures a timestamp before and after each request and reports the elapsed time for the request URL.
@Override
public Response intercept(Chain chain) throws IOException {
    Request request = chain.request();
    String url = getRequestUrl(request.url());
    if (TextUtils.isEmpty(url)) {
        // Nothing to attribute the request to, so pass it through untimed.
        return chain.proceed(request);
    }
    AppSegmentCache.INSTANCE.setRequestStart(url);
    long startNs = System.nanoTime();
    // Proceed exactly once; an IOException propagates to the caller unchanged.
    Response response = chain.proceed(request);
    long tookMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNs);
    AppSegmentCache.INSTANCE.setRequestEnd(url, tookMs);
    // Return the response already obtained instead of sending the request again.
    return response;
}
Two statistical dimensions are produced: "total calls" (compared with the previous version to compute trend percentages) and "repeated calls" (listing duplicate URLs for optimization).
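Both dimensions reduce to simple counting once the URLs requested during a stage are available. The sketch below is one possible way to compute them; the class and method names are hypothetical, and the article does not say how the backend actually stores or groups the URLs.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NetworkStageStats {
    // Trend of total calls versus the previous version, in percent.
    static double trendPercent(int previousTotal, int currentTotal) {
        if (previousTotal == 0) {
            throw new IllegalArgumentException("previous version has no calls to compare against");
        }
        return (currentTotal - previousTotal) * 100.0 / previousTotal;
    }

    // URLs requested more than once within the stage, with their call counts.
    static Map<String, Integer> repeatedCalls(List<String> urls) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String url : urls) {
            counts.merge(url, 1, Integer::sum);
        }
        Map<String, Integer> repeated = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                repeated.put(entry.getKey(), entry.getValue());
            }
        }
        return repeated;
    }
}
```

In practice the URLs would probably need normalizing first (for example, stripping query parameters) so that logically identical requests are counted together; that step is left out here.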
Traffic
Both HttpURLConnection and OkHttp are hooked during the .class‑to‑.dex transform. Every request passes through the hook, which measures the request size and the response length (after gzip decompression where applicable) and writes the raw traffic to a file for offline analysis.
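The size accounting can be illustrated in isolation with the JDK's own gzip streams. This is a sketch under assumptions: TrafficMeter and its methods are hypothetical, and the real hook reads the body from the HTTP client's response rather than from a byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class TrafficMeter {
    // Returns the decoded body; gzip-encoded bytes are decompressed first.
    static byte[] decodeBody(byte[] wireBytes, String contentEncoding) {
        if (!"gzip".equalsIgnoreCase(contentEncoding)) {
            return wireBytes; // not compressed: wire size equals body size
        }
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(wireBytes))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for the example: compresses text the way a server might.
    static byte[] gzip(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

Recording both wireBytes.length and the decoded length separates what the network actually carried from what the app had to parse.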
internal object OkHttpHook {
    @JvmField
    public val globalNetworkInterceptor = Interceptor { chain ->
        // calculate response length
        // read response content (decompress if gzip)
        // write traffic info to file
        val fileUrl = File(file, URLEncoder.encode(
            SimpleDateFormat("yyyy-MM-dd-HH:mm:ss-SSS").format(Date()) + "-" + netPackInfo.url))
        fileUrl.writeText(netPackInfo.toString())
        // ...
    }
}
public object HttpUrlConnectHook {
    @JvmStatic
    fun proxy(httpUrlConnection: URLConnection): URLConnection {
        try {
            return hookOkHttpURLConnection(httpUrlConnection)
        } catch (e: Exception) {
            e.printStackTrace()
        }
        // Fall back to the original connection if the hook fails.
        return httpUrlConnection
    }
    @Throws(Exception::class)
    private fun hookOkHttpURLConnection(httpUrlConnection: URLConnection): URLConnection {
        val builder = OkHttpClient.Builder()
        val mClient = builder.retryOnConnectionFailure(true).build()
        val strUrl = httpUrlConnection.url.toString()
        val url = URL(strUrl)
        val protocol = url.protocol.lowercase(Locale.ROOT)
        // Only http and https connections are rerouted through OkHttp.
        return if (protocol.startsWith("http", ignoreCase = true)) {
            HttpUrlFactory.OkHttpURLConnection(url, mClient)
        } else httpUrlConnection
    }
}
Page‑time
For B‑end cash‑register devices, UI smoothness is critical. The system treats Activity and Fragment onCreate as the start point and the first onDraw callback as the end point, and computes the rendering time as the difference between them.
public void watchActivity(Activity activity) {
    // Watch the Activity's root view.
    watchWithMonitorView(activity.getClass().getName(), activity.getWindow().getDecorView());
    if (activity instanceof FragmentActivity) {
        // Also watch every Fragment view, including nested fragments (recursive = true).
        ((FragmentActivity) activity).getSupportFragmentManager()
            .registerFragmentLifecycleCallbacks(new FragmentManager.FragmentLifecycleCallbacks() {
                @Override
                public void onFragmentViewCreated(FragmentManager fm, Fragment f, View v, Bundle savedInstanceState) {
                    watchWithMonitorView(f.getClass().getName(), v);
                }
            }, true);
    }
}
During automated regression, a page is flagged as a valid slowdown only if its rendering time exceeds 200 ms in at least three runs, reducing false positives caused by hardware variance.
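The flagging rule is small enough to state directly in code. The 200 ms threshold and the three-run minimum come from the article; the class name and the list-of-timings input are assumptions made for the sketch.

```java
import java.util.List;

final class SlowPageRule {
    static final long THRESHOLD_MS = 200; // from the article
    static final int MIN_SLOW_RUNS = 3;   // from the article

    // True if the page rendered slower than the threshold in enough runs.
    static boolean isValidSlowdown(List<Long> renderTimesMs) {
        int slowRuns = 0;
        for (long renderTimeMs : renderTimesMs) {
            if (renderTimeMs > THRESHOLD_MS) {
                slowRuns++;
            }
        }
        return slowRuns >= MIN_SLOW_RUNS;
    }
}
```

For example, timings of 250, 180, 210, and 305 ms contain three runs above 200 ms and would be flagged, while 250, 180, 190, and 205 ms contain only two and would not.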
Backend Issue Analysis
Performance data collected from each regression run is aggregated nightly. The backend computes average stage duration for the current and previous versions, compares them, and flags deviations that exceed a configurable threshold as actionable issues. These issues are stored in a database and exposed via the mPaaS UI.
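The version-to-version comparison might look like the following sketch. It assumes the threshold is expressed as a relative ratio and that deviations in either direction count; the article specifies neither, and the class and method names are hypothetical.

```java
import java.util.List;

final class StageRegressionCheck {
    // Arithmetic mean of the stage durations from one version's runs.
    static double average(List<Long> durationsMs) {
        if (durationsMs.isEmpty()) {
            throw new IllegalArgumentException("no samples");
        }
        long sum = 0;
        for (long durationMs : durationsMs) {
            sum += durationMs;
        }
        return (double) sum / durationsMs.size();
    }

    // True if the averages differ by more than thresholdRatio (0.2 means 20%).
    static boolean isActionable(List<Long> previousMs, List<Long> currentMs,
                                double thresholdRatio) {
        double before = average(previousMs);
        double after = average(currentMs);
        if (before == 0) {
            return after > 0; // any cost where there was none is a deviation
        }
        return Math.abs(after - before) / before > thresholdRatio;
    }
}
```

With a 20% threshold, a launch stage that averaged 800 ms in the previous version and 1000 ms in the current one (a 25% increase) would be flagged, while a move to 810 ms would not.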
Offline AWACS Tool
In QA and development builds, a floating alert icon appears on the app. Tapping the icon opens the performance monitoring center, which displays stage metrics, ANR events, slow‑method traces, traffic statistics, and FPS data for rapid diagnosis.
Issue Management and Assignment Platform
Validated issues are listed on a board where users can filter by metric, app, status, and environment. Each entry shows occurrence count, device information, and allows status changes or assignment to owners.
Future Plans
Monitor additional dimensions such as CPU usage, thread activity, and UI updates from background threads.
Expand automated test cases to cover more performance scenarios beyond the main UI flow.
Increase the variety of test devices to enable multi‑model performance analysis.
Roll out the solution to other Youzan applications.
Youzan Coder
Official Youzan tech channel, delivering technical insights and occasional daily updates from the Youzan tech team.
