Layered Performance Testing for Kun (Flutter‑Based) Mobile Container

To verify that moving to Kun, a Flutter‑based container, actually improves the Xianyu user experience, we introduced a layered performance testing framework. It combines industry‑standard metrics, component‑ and business‑level benchmarks, multi‑dimensional stress and regression checks, a shared script library, a larger device pool, and templated task creation. Together these give faster feedback, higher stability, and actionable optimization data.

Xianyu Technology

Many Xianyu services are currently migrating from H5/Weex to Kun, a high‑performance hybrid container built on W3C standards and Flutter. From a testing perspective, we need to confirm that this migration actually improves the user experience, which makes robust performance testing essential.

In the past, performance was investigated only after a problem appeared in production, by which point the root cause was hard to locate. This year we introduced a layered performance testing approach for the container to surface risks early, covering the impact of both the Kun container and the underlying Flutter engine.

What to Measure

We aligned our metrics with industry standards and competitor apps, choosing widely accepted performance indicators rather than custom, developer‑defined instrumentation points. The key metrics include page load time, frame rate, memory usage, and crash rate, as illustrated in the diagram below.

We identified three main tasks: (1) Align Flutter engine and Kun page measurement standards; (2) Build benchmark scenarios at component and core‑business levels; (3) Conduct regression testing to evaluate optimization results.
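As an illustration of what aligning a measurement standard can mean in practice, the sketch below derives an average frame rate and a jank ratio from a list of frame presentation timestamps. The function name, the input format, and the 33.4 ms jank threshold (roughly two frame periods at 60 Hz) are assumptions made for this example, not the definitions used by the Xianyu team.

```python
from statistics import mean


def frame_stats(timestamps_ms, jank_threshold_ms=33.4):
    """Summarise frame timing from presentation timestamps in milliseconds.

    The threshold is an illustrative assumption: a frame interval longer
    than about two 60 Hz frame periods is counted as janky.
    """
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two frame timestamps")
    # Interval between each pair of consecutive frames.
    intervals = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    avg_fps = 1000.0 / mean(intervals)
    janky = sum(1 for dt in intervals if dt > jank_threshold_ms)
    return {
        "avg_fps": round(avg_fps, 1),
        "jank_ratio": janky / len(intervals),
    }


if __name__ == "__main__":
    print(frame_stats([0, 10, 20, 70, 80]))
```

Applying one such function to both Flutter engine data and Kun page data is one way to keep the two sets of numbers comparable.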

Layered Performance Scenarios

Based on the layered strategy, we defined benchmark scenarios at the component level and the core‑business level, then validated the overall impact on the Xianyu app.

Multi‑Dimensional Metrics

We measure along four dimensions to keep performance stable: extreme stress testing on low‑end devices, horizontal comparison of the pre‑ and post‑migration versions, regression across releases, and a fast‑feedback benchmark platform that isolates network variability.

Extreme stress: detect aborts or crashes on mid/low‑end phones.

Horizontal comparison: compare old and new tech stacks.

Version comparison: track metric fluctuations across releases.

Benchmark tests: mock data to quickly verify component‑level changes.
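The version comparison dimension can be sketched as a simple regression check between two sets of metrics. The metric names, the 5 % tolerance, and the `find_regressions` helper are hypothetical and chosen only for illustration; real thresholds would depend on how noisy each metric is.

```python
# Metric name -> True if a higher value is better (illustrative names).
HIGHER_IS_BETTER = {"fps": True, "load_time_ms": False, "memory_mb": False}


def find_regressions(baseline, candidate, tolerance=0.05):
    """Return metrics that got worse by more than `tolerance` (relative).

    The returned value for each metric is the signed relative change
    of the candidate against the baseline.
    """
    regressions = {}
    for name, base in baseline.items():
        change = (candidate[name] - base) / base
        # Normalise so that a positive `worse` always means a degradation.
        worse = -change if HIGHER_IS_BETTER[name] else change
        if worse > tolerance:
            regressions[name] = round(change, 3)
    return regressions


if __name__ == "__main__":
    old = {"fps": 60.0, "load_time_ms": 1000.0, "memory_mb": 200.0}
    new = {"fps": 54.0, "load_time_ms": 1020.0, "memory_mb": 230.0}
    print(find_regressions(old, new))
```

The same function works for the horizontal comparison if the baseline is the H5/Weex page and the candidate is the Kun page.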

Testing Scripts

Swipe speed significantly affects the collected performance data, so we refined the scripts to simulate fast and slow swipe gestures separately. The recordings below show the two scenarios.

Fast swipe

Slow swipe
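One common way to script the two gestures on Android is the `adb shell input swipe` command, whose last argument is the gesture duration in milliseconds. The sketch below only builds the command lines. The coordinates and the 100 ms and 800 ms durations are assumptions for illustration, and the article does not say which tool the team's scripts actually use.

```python
# Illustrative durations; the real values used by the team are not stated.
FAST_SWIPE_MS = 100
SLOW_SWIPE_MS = 800


def swipe_command(start, end, duration_ms, serial=None):
    """Build an `adb shell input swipe` command as an argument list."""
    cmd = ["adb"]
    if serial:
        # Target a specific device when several are connected.
        cmd += ["-s", serial]
    cmd += [
        "shell", "input", "swipe",
        str(start[0]), str(start[1]),
        str(end[0]), str(end[1]),
        str(duration_ms),
    ]
    return cmd


if __name__ == "__main__":
    # Swipe up from near the bottom of the screen to near the top.
    print(" ".join(swipe_command((540, 1600), (540, 400), FAST_SWIPE_MS)))
    print(" ".join(swipe_command((540, 1600), (540, 400), SLOW_SWIPE_MS)))
```

The returned list can be passed directly to `subprocess.run` on a machine with a device attached.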

We also provide a common script library that abstracts environment variables, performance collection, screen recording, and frame extraction, allowing users to run tests with minimal configuration.
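A minimal sketch of how such a library might abstract environment variables is shown below. The variable names (`PERF_PACKAGE_NAME` and so on), the defaults, and the `PerfConfig` type are all hypothetical; they only illustrate the idea that a test script reads one typed configuration object instead of parsing the environment itself.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class PerfConfig:
    device_serial: str
    package_name: str
    duration_s: int
    record_screen: bool


def load_config(env=None):
    """Read test settings from environment variables (hypothetical names)."""
    env = os.environ if env is None else env
    return PerfConfig(
        device_serial=env.get("PERF_DEVICE_SERIAL", ""),
        package_name=env["PERF_PACKAGE_NAME"],  # required, no default
        duration_s=int(env.get("PERF_DURATION_S", "60")),
        record_screen=env.get("PERF_RECORD_SCREEN", "1") == "1",
    )


if __name__ == "__main__":
    print(load_config({"PERF_PACKAGE_NAME": "com.example.app"}))
```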

Infrastructure Improvements

To address device queueing, we expanded the device pool from 8 to 16 devices covering low, medium, and high tiers, and introduced device grouping with automatic fallback to similar models.
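The fallback behaviour can be sketched as a two-step lookup: prefer an idle device of the requested model, otherwise take any idle device in the same group. Treating "similar models" as "same performance tier" is an assumption of this example, as are the `Device` fields and the `pick_device` helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    serial: str
    model: str
    tier: str  # "low", "mid", or "high"
    busy: bool = False


def pick_device(devices, preferred_model, tier):
    """Return an idle device, falling back to another model in the same tier."""
    # First choice: an idle device of exactly the requested model.
    for device in devices:
        if not device.busy and device.model == preferred_model:
            return device
    # Fallback: any idle device in the same tier (assumed to be "similar").
    for device in devices:
        if not device.busy and device.tier == tier:
            return device
    return None  # Caller has to queue or retry.


if __name__ == "__main__":
    pool = [
        Device("A1", "ModelA", "low", busy=True),
        Device("B1", "ModelB", "low"),
        Device("C1", "ModelC", "high"),
    ]
    print(pick_device(pool, "ModelA", "low"))
```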

We introduced task templates to simplify task creation; users only need to specify the app package URL while the underlying script handles all parameters.
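The template idea can be illustrated as a set of preset parameters that are merged with the single value the user supplies. The field names and default values below are hypothetical and do not reflect the actual template schema of the testing platform.

```python
# Hypothetical template: every parameter except the package URL is preset.
SCROLL_BENCHMARK_TEMPLATE = {
    "script": "scroll_benchmark",
    "device_tier": "low",
    "repeat": 3,
    "collect": ["fps", "memory", "load_time"],
}


def build_task(template, package_url):
    """Create a task definition from a template and the app package URL."""
    if not package_url:
        raise ValueError("package_url is required")
    task = dict(template)  # copy so the shared template is never mutated
    task["package_url"] = package_url
    return task


if __name__ == "__main__":
    print(build_task(SCROLL_BENCHMARK_TEMPLATE, "https://example.com/app.apk"))
```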

Result comparison was enhanced to allow side‑by‑side analysis of multiple reports, making it easy for developers to verify the effectiveness of optimizations.
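A side-by-side view of several reports amounts to pivoting them into one table with a row per metric and a column per report. The sketch below assumes each report is a plain mapping from metric name to value; the report labels and metric names are illustrative only.

```python
def compare_reports(reports):
    """Pivot {label: {metric: value}} into rows for side-by-side display.

    The first row is the header. A metric missing from a report is
    shown as None rather than being dropped.
    """
    labels = list(reports)
    metrics = sorted({name for report in reports.values() for name in report})
    rows = [["metric"] + labels]
    for name in metrics:
        rows.append([name] + [reports[label].get(name) for label in labels])
    return rows


if __name__ == "__main__":
    table = compare_reports({
        "before": {"fps": 52.0, "memory_mb": 310.0},
        "after": {"fps": 58.0},
    })
    for row in table:
        print(row)
```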

Outcomes

Reduced manual effort: task creation time dropped from 10 minutes to 2‑3 minutes.

High stability: a 94% task success rate, with device queueing alleviated.

Convenient data analysis: developers can directly compare experiment data.

Effective layered validation: benchmark templates enable rapid verification of optimizations.

Deeper issue insight: identified and addressed gaps in Flutter performance data collection.

Future Work

Enrich benchmark suites for finer‑grained impact assessment.

Reduce test waiting time to improve efficiency.

Define publishing standards to ensure quality of new technologies.

Shift performance testing earlier in the release cycle.

Explore advanced problem‑location capabilities.
