Design and Implementation of an AVD IaaS Testing Platform for Mobile Automation at Ctrip
This article describes how Ctrip built a cloud‑native AVD IaaS platform on Kubernetes, Docker, and Android Virtual Devices (AVDs) to provide scalable, cost‑effective, highly available infrastructure for automated mobile testing. The platform addresses hardware cost, performance, and operational problems, and supports continuous integration pipelines.
Background
Under Ctrip's rapid agile release cycle, the quality of online production services depends on improving automated test coverage, execution frequency, and speed. Common pain points across front‑end teams include high procurement costs for test phones, maintenance difficulties, and limited scalability of test execution.
IaaS Layer Issues
Expensive test phone procurement and frequent hardware replacement.
Diverse device models cause complex management and battery safety risks.
Limited concurrency and stability of test execution lead to bottlenecks during large releases.
Impact on Development
Low automation adoption due to time‑consuming failure analysis and retries.
Long total test run times delay DevOps feedback loops.
Reliance on manual testing increases labor costs and error risk.
Project Goals
Re‑architect the IaaS foundation to reduce procurement and operational costs, improve test task reliability and performance, and provide a generic testing device service that seamlessly supports multiple test frameworks across the company.
System Selection
Three options were evaluated: public‑cloud real‑device farms, private‑cloud real‑device clusters, and private‑cloud virtual‑device clusters. The chosen solution is a private‑cloud virtual‑device (AVD) cluster built on Docker containers orchestrated by Kubernetes.
Architecture
The platform consists of three layers:
Container Instance Layer: AVD containers run on Kubernetes nodes with nested KVM support, privileged security context, and fixed IP allocation. Each container includes the Android emulator, drivers, and required system tools.
Scheduling Management Layer: REST APIs (implemented with Spring Boot) provide device list, allocation, scaling, and lifecycle operations.
Operation Layer: Users interact via a web GUI or CLI. Example CLI commands:
    avd devices
    adb connect device_ip_address:5555
    adb devices
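To make the Container Instance Layer more concrete, here is a minimal sketch of a Pod manifest for one emulator container, built as a plain Python dict. It is an illustration under assumptions, not Ctrip's actual configuration: the function name build_avd_pod, the label app=avd, and the image name are hypothetical. The privileged security context and access to /dev/kvm follow the description above. Fixed IP allocation is left out because it depends on the cluster's network plugin.

```python
import json


def build_avd_pod(name: str, image: str, adb_port: int = 5555) -> dict:
    """Return a Kubernetes Pod manifest (as a dict) for one emulator container.

    Hypothetical helper: the names and labels are illustrative only.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": "avd"}},
        "spec": {
            "containers": [
                {
                    "name": "emulator",
                    "image": image,
                    # Privileged mode so the emulator can use KVM on the node.
                    "securityContext": {"privileged": True},
                    # Port that `adb connect <ip>:5555` talks to.
                    "ports": [{"containerPort": adb_port}],
                    "volumeMounts": [{"name": "kvm", "mountPath": "/dev/kvm"}],
                }
            ],
            # Expose the node's KVM device inside the container.
            "volumes": [
                {"name": "kvm", "hostPath": {"path": "/dev/kvm", "type": "CharDevice"}}
            ],
        },
    }


if __name__ == "__main__":
    # The image name below is a placeholder, not a real registry path.
    manifest = build_avd_pod("avd-0", "registry.example.com/avd:android-11")
    print(json.dumps(manifest, indent=2))
```

The resulting JSON can be passed to kubectl apply -f, since Kubernetes accepts manifests in JSON as well as YAML.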
High‑Availability Measures
Multi‑node deployment to avoid single points of failure.
Cache layer to mitigate database outages.
Comprehensive monitoring of CPU, memory, API latency, and device pool metrics.
Automatic scaling of device resources based on queue length.
Idle device reclamation and automated health‑check‑driven restarts.
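The queue‑based scaling and idle reclamation measures above can be sketched as two small functions. This is a simplified model under assumptions: the article does not give the real policy, so the thresholds (min_size, max_size, idle_timeout), the Device record, and both function names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Device:
    device_id: str
    busy: bool
    last_used: float  # timestamp in seconds when the device last ran a task


def desired_pool_size(queue_length: int, busy: int,
                      min_size: int = 2, max_size: int = 50,
                      tasks_per_device: int = 1) -> int:
    """Target device count: busy devices plus enough for the queue, clamped."""
    needed = busy + math.ceil(queue_length / tasks_per_device)
    return max(min_size, min(max_size, needed))


def idle_devices(devices: List[Device], now: float,
                 idle_timeout: float = 600.0) -> List[str]:
    """IDs of devices that are not busy and have been unused for idle_timeout seconds."""
    return [d.device_id for d in devices
            if not d.busy and now - d.last_used >= idle_timeout]


if __name__ == "__main__":
    pool = [Device("avd-0", False, 0.0), Device("avd-1", True, 0.0)]
    print(desired_pool_size(queue_length=10, busy=1))
    print(idle_devices(pool, now=1000.0))
```

Clamping to a minimum keeps a few warm devices available for the first queued task, while the maximum bounds the load on the cluster during large releases.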
Challenges and Solutions
Apps compiled for ARM ran with a performance penalty on x86 hosts. Moving to Android 11 system images, which can run ARM binaries on x86, removed the need to recompile apps for x86 and improved performance.
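One way to check whether an emulator image advertises ARM support is to read the Android system property ro.product.cpu.abilist (for example with adb shell getprop ro.product.cpu.abilist) and look for ARM ABIs in the result. The helper below is a sketch of that check; the function name is hypothetical, and the sample strings are illustrative inputs rather than output captured from Ctrip's images.

```python
from typing import List

# ABI names defined by the Android NDK for ARM targets.
ARM_ABIS = {"armeabi", "armeabi-v7a", "arm64-v8a"}


def supported_arm_abis(abilist: str) -> List[str]:
    """Return the ARM ABIs found in a comma-separated ABI list, in order."""
    return [abi.strip() for abi in abilist.split(",") if abi.strip() in ARM_ABIS]


if __name__ == "__main__":
    # Illustrative property values, not real device output.
    for sample in ("x86_64,arm64-v8a", "x86"):
        abis = supported_arm_abis(sample)
        print(sample, "->", abis if abis else "no ARM ABIs listed")
```

An empty result suggests that ARM‑only builds will not install on that image, so such a check could run as part of a device health check before tests are scheduled.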
Adoption and Impact
Since 2020, the AVD IaaS platform has been integrated into Ctrip's CTest middle‑platform, serving more than 15 business units and running more than 10,000 test executions. Dynamic scaling reduced average test execution time by 74% and enabled two weekly releases for the flight‑ticket front‑end team.
Conclusion
The AVD IaaS system now supports most of Ctrip's mobile testing needs, offering a stable, scalable, and cost‑effective solution while continuing to evolve with new features such as BDD frameworks and traffic replay.
Ctrip Technology
Official Ctrip Technology account, sharing and discussing growth.