Why Your Android App’s Network Requests Stall and How RUM Reveals the Fix
The article explains how mobile network performance issues—especially long connection‑pool wait times—cause slow page loads in Android apps, demonstrates how to interpret Alibaba Cloud RUM’s Resource event metrics and timing data, walks through a real‑world case study with detailed stage‑by‑stage analysis, and provides concrete diagnostic steps and optimization recommendations for OkHttp connection‑pool configuration and other common bottlenecks.
Background
In mobile applications, network latency is a frequent cause of user‑perceived slowness. Mobile networks involve multiple radio standards, frequent signal changes, and a wide range of device capabilities, which makes performance troubleshooting difficult. Traditional monitoring compounds the problem because it reports only the total request duration, not where the time was spent.
RUM SDK Data Model
Alibaba Cloud CMS 2.0 Real‑User Monitoring (RUM) Android SDK defines a standardized Resource event that aligns with HTTP and the W3C Performance Timing API. The event contains:
Attribute fields – request URL, method, status, etc.
Metric fields – fine‑grained timings such as DNS, TCP, SSL, TTFB, and response phases.
These metrics enable cross‑platform performance comparison (Web, iOS, Android, HarmonyOS).
Real‑World Case Study
A popular app reported occasional response times above 1 s for a core API, while backend logs showed only ~400 ms of server processing. With RUM enabled, the SDK captured the following raw timing data for a slow request (timestamps in nanoseconds):
{
"requestHeadersEnd":1560814315115219,
"responseBodyStart":1560814719308917,
"requestType":"OkHttp3",
"connectionAcquired":1560814312934751,
"connectionReleased":1560814721700948,
"requestBodyEnd":1560814315850323,
"responseHeadersEnd":1560814718722250,
"requestHeadersStart":1560814312975011,
"responseBodyEnd":1560814719441625,
"requestBodyStart":1560814315146573,
"callEnd":1560814721840948,
"duration":1232825780,
"callStart":1560813486615845,
"responseHeadersStart":1560814718314125
}
Key observations:
No DNS/TCP/SSL callbacks → the request reused an existing connection.
The interval callStart → connectionAcquired lasted 826 ms (≈67 % of total latency).
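The wait for the first response byte (requestBodyEnd → responseHeadersStart) took about 402 ms, which matches the ~400 ms of server processing seen in the backend logs.
All remaining stages (header/body send, response receive, connection release) took a few milliseconds or less.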
Conclusion: the bottleneck was the connection‑pool wait time, not network transmission or server processing.
Stage‑by‑Stage Timing
callStart → connectionAcquired
Elapsed: (1560814312934751 - 1560813486615845) / 1,000,000 = 826.32 ms ⚠️
requestHeadersStart → requestHeadersEnd
Elapsed: (1560814315115219 - 1560814312975011) / 1,000,000 = 2.14 ms ✅
requestBodyStart → requestBodyEnd
Elapsed: (1560814315850323 - 1560814315146573) / 1,000,000 = 0.70 ms ✅
requestBodyEnd → responseHeadersStart
Elapsed: (1560814718314125 - 1560814315850323) / 1,000,000 = 402.46 ms
responseHeadersStart → responseHeadersEnd
Elapsed: (1560814718722250 - 1560814718314125) / 1,000,000 = 0.41 ms ✅
responseBodyStart → responseBodyEnd
Elapsed: (1560814719441625 - 1560814719308917) / 1,000,000 = 0.13 ms ✅
responseBodyEnd → connectionReleased
Elapsed: (1560814721700948 - 1560814719441625) / 1,000,000 = 2.26 ms ✅
The analysis confirms that the connection‑pool wait dominates the total 1.23 s request duration.
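The arithmetic above can be reproduced with a small helper. The following is a minimal sketch, not part of the RUM SDK: the class name StageBreakdown and its methods are hypothetical, and it assumes the timestamps are available as a map keyed by the field names shown in the raw data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of the RUM SDK): converts raw nanosecond
// timestamps from a Resource event into per-stage durations in milliseconds.
final class StageBreakdown {

    // Stage boundaries as {startField, endField}, in request order.
    private static final String[][] STAGES = {
        {"callStart", "connectionAcquired"},
        {"requestHeadersStart", "requestHeadersEnd"},
        {"requestBodyStart", "requestBodyEnd"},
        {"requestBodyEnd", "responseHeadersStart"},
        {"responseHeadersStart", "responseHeadersEnd"},
        {"responseBodyStart", "responseBodyEnd"},
        {"responseBodyEnd", "connectionReleased"}
    };

    /** Elapsed milliseconds between two nanosecond timestamps. */
    static double elapsedMs(long startNanos, long endNanos) {
        return (endNanos - startNanos) / 1_000_000.0;
    }

    /** Maps "start -> end" to its duration in ms; stages with a missing field are skipped. */
    static Map<String, Double> breakdown(Map<String, Long> timestamps) {
        Map<String, Double> out = new LinkedHashMap<>();
        for (String[] stage : STAGES) {
            Long start = timestamps.get(stage[0]);
            Long end = timestamps.get(stage[1]);
            if (start != null && end != null) {
                out.put(stage[0] + " -> " + stage[1], elapsedMs(start, end));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Timestamps copied from the case-study event.
        Map<String, Long> t = new LinkedHashMap<>();
        t.put("callStart", 1560813486615845L);
        t.put("connectionAcquired", 1560814312934751L);
        t.put("requestHeadersStart", 1560814312975011L);
        t.put("requestHeadersEnd", 1560814315115219L);
        t.put("requestBodyStart", 1560814315146573L);
        t.put("requestBodyEnd", 1560814315850323L);
        t.put("responseHeadersStart", 1560814718314125L);
        t.put("responseHeadersEnd", 1560814718722250L);
        t.put("responseBodyStart", 1560814719308917L);
        t.put("responseBodyEnd", 1560814719441625L);
        t.put("connectionReleased", 1560814721700948L);

        breakdown(t).forEach((stage, ms) ->
            System.out.printf("%-45s %8.2f ms%n", stage, ms));
    }
}
```

Running main prints one line per stage. The first line should show the 826.32 ms pool wait, and the requestBodyEnd -> responseHeadersStart line the 402.46 ms spent waiting for the server.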
Diagnostic Steps
Inspect the OkHttpClient connection‑pool configuration (default: max 5 idle connections, 5‑minute keep‑alive).
Check the number of concurrent requests to the same host via the RUM console.
Ensure every Response body is closed to avoid connection leaks.
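For the second step, peak concurrency per host can also be estimated offline from exported request intervals. The sketch below is an illustration under assumptions: the PeakConcurrency class and its Request type are hypothetical, and it assumes each request's host, callStart, and callEnd have already been exported from RUM (how to export them is not shown here).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: finds the peak number of overlapping requests per host.
final class PeakConcurrency {

    /** One request: host plus start/end timestamps (e.g. callStart/callEnd in ns). */
    static final class Request {
        final String host;
        final long startNanos;
        final long endNanos;

        Request(String host, long startNanos, long endNanos) {
            this.host = host;
            this.startNanos = startNanos;
            this.endNanos = endNanos;
        }
    }

    static Map<String, Integer> peakPerHost(List<Request> requests) {
        // Per host, record {time, +1} at each start and {time, -1} at each end.
        Map<String, List<long[]>> eventsByHost = new HashMap<>();
        for (Request r : requests) {
            List<long[]> events =
                eventsByHost.computeIfAbsent(r.host, h -> new ArrayList<>());
            events.add(new long[] {r.startNanos, +1});
            events.add(new long[] {r.endNanos, -1});
        }

        Map<String, Integer> peaks = new HashMap<>();
        for (Map.Entry<String, List<long[]>> entry : eventsByHost.entrySet()) {
            List<long[]> events = entry.getValue();
            // Sort by time; at equal times, ends (-1) come before starts (+1),
            // so back-to-back requests are not counted as overlapping.
            events.sort((a, b) -> a[0] != b[0]
                ? Long.compare(a[0], b[0])
                : Long.compare(a[1], b[1]));
            int current = 0;
            int peak = 0;
            for (long[] event : events) {
                current += (int) event[1];
                peak = Math.max(peak, current);
            }
            peaks.put(entry.getKey(), peak);
        }
        return peaks;
    }
}
```

A peak that regularly reaches or exceeds the client's per-host limit suggests that requests are queuing before they obtain a connection.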
// View the current pool configuration
// (OkHttp defaults: 5 idle connections, 5-minute keep-alive)
ConnectionPool pool = okHttpClient.connectionPool();

// Ensure the response is always closed so its connection returns to the pool.
// try-with-resources calls response.close() even if processing throws.
try (Response response = okHttpClient.newCall(request).execute()) {
    String body = response.body().string();
    // process body
}
Optimization Recommendations
Increase maxIdleConnections for high‑concurrency apps (e.g., 30‑50).
Raise the Dispatcher's maxRequestsPerHost (default 5; it limits asynchronous calls made with enqueue) if many parallel requests target the same host.
Use custom DNS or HTTPDNS to reduce DNS latency.
Enable SSL session reuse or extend connection‑pool keep‑alive time.
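// Example: enlarge the connection pool and raise the per-host request limit.
// Dispatcher is a final class, so configure an instance instead of subclassing it.
Dispatcher dispatcher = new Dispatcher();
dispatcher.setMaxRequestsPerHost(10);

OkHttpClient client = new OkHttpClient.Builder()
    .connectionPool(new ConnectionPool(30, 5, TimeUnit.MINUTES))
    .dispatcher(dispatcher)
    .build();
Common Network‑Performance Problems & Diagnosis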
Connection‑pool wait too long – callStart → connectionAcquired > 500 ms.
Slow DNS resolution – resource.dns_duration > 500 ms.
High SSL handshake time – resource.ssl_duration > 1 s.
Excessive TTFB – resource.first_byte_duration > 2 s.
Sample RUM console queries (SQL‑style) can be used to isolate each case, and the corresponding code snippets illustrate how to log or mitigate the issue.
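The thresholds listed above can be expressed as a simple check. This is a minimal sketch, assuming the four durations have already been extracted in milliseconds; the BottleneckClassifier class, its method, and the poolWaitMs parameter name are hypothetical and not part of the RUM SDK or console.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: applies the article's thresholds to one request's timings.
final class BottleneckClassifier {

    // Thresholds in milliseconds, taken from the list above.
    static final double POOL_WAIT_LIMIT_MS = 500;   // callStart -> connectionAcquired
    static final double DNS_LIMIT_MS = 500;         // resource.dns_duration
    static final double SSL_LIMIT_MS = 1_000;       // resource.ssl_duration
    static final double FIRST_BYTE_LIMIT_MS = 2_000; // resource.first_byte_duration

    /** Returns the problems whose threshold is exceeded, in the order listed above. */
    static List<String> classify(double poolWaitMs, double dnsMs,
                                 double sslMs, double firstByteMs) {
        List<String> findings = new ArrayList<>();
        if (poolWaitMs > POOL_WAIT_LIMIT_MS) {
            findings.add("connection-pool wait too long");
        }
        if (dnsMs > DNS_LIMIT_MS) {
            findings.add("slow DNS resolution");
        }
        if (sslMs > SSL_LIMIT_MS) {
            findings.add("high SSL handshake time");
        }
        if (firstByteMs > FIRST_BYTE_LIMIT_MS) {
            findings.add("excessive TTFB");
        }
        return findings;
    }
}
```

For the case-study request (826.32 ms pool wait, no DNS or SSL phase, 402.46 ms to first byte), only the connection-pool finding is reported.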
Final Takeaways
By leveraging RUM’s fine‑grained metrics, developers can transform vague “slow request” complaints into precise diagnoses such as “connection‑pool wait of 826 ms”. Adjusting OkHttp connection‑pool parameters based on traffic patterns (high‑concurrency, normal, or low‑frequency) resolves the issue. RUM also provides continuous real‑user monitoring, alerting, and data‑driven performance optimization across Android, Web, iOS, and HarmonyOS platforms.
Key resources:
RUM Android SDK documentation: https://help.aliyun.com/zh/arms/user-experience-monitoring/monitor-android-applications
Alibaba Cloud Observability
