Automated Test Generation and CI Configuration for Airbnb Android
The final article in Airbnb's Android testing series explains how a Kotlin script automatically generates JUnit tests for every MvRx Fragment and how those tests run in a Buildkite CI pipeline. The pipeline runs tests only for changed Fragments, shards execution with Flank, reports visual regressions via Happo, aggregates JaCoCo coverage, and posts detailed PR comments. The article also outlines future enhancements such as deep-link, end-to-end, performance, and R8 testing.
In the seventh and final article of the Airbnb automated testing framework series, the author dives into CI configuration and the direction of future development.
The article explains how test files are generated automatically using a Kotlin script that parses the project to find all MvRx Fragments, builds an AST with the Kotlin compiler, and writes JUnit test files into the androidTest source directory via KotlinPoet.
@LargeTest
class MvrxIntegrationTests : MvRxIntegrationTestBase() {
    @Test
    fun booking_fragments_BookingFragment_Screenshots() {
        runScreenshots("com.airbnb.android.booking.fragments.BookingFragment")
    }
    @Test
    fun booking_fragments_BookingFragment_Interactions() {
        runInteractionTest("com.airbnb.android.booking.fragments.BookingFragment")
    }
    // More Fragments declared below...
}
Generated tests are named after the fully-qualified Fragment name, and the base class supplies the runtime logic, keeping the generated files lightweight.
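The naming and rendering step can be pictured with a minimal sketch. The article's script relies on the Kotlin compiler's AST and KotlinPoet; the version here uses plain string building so it stays dependency-free. The helper names `testMethodName` and `renderTestClass`, and the stripped package prefix, are assumptions inferred from the example above, not the actual script.

```kotlin
// Hypothetical helpers for illustration; not the script described in the article.

/** Package prefix dropped from test names (assumption based on the example above). */
const val APP_PACKAGE_PREFIX = "com.airbnb.android."

/** Turns a fully-qualified Fragment name into a legal JUnit method name. */
fun testMethodName(fragmentFqName: String, suffix: String): String =
    fragmentFqName.removePrefix(APP_PACKAGE_PREFIX).replace('.', '_') + "_" + suffix

/** Renders the source of one generated test class for the given Fragments. */
fun renderTestClass(fragmentFqNames: List<String>): String = buildString {
    appendLine("@LargeTest")
    appendLine("class MvrxIntegrationTests : MvRxIntegrationTestBase() {")
    // Sort so the generated file is stable between runs.
    for (name in fragmentFqNames.sorted()) {
        appendLine("    @Test")
        appendLine("    fun ${testMethodName(name, "Screenshots")}() {")
        appendLine("        runScreenshots(\"$name\")")
        appendLine("    }")
        appendLine("    @Test")
        appendLine("    fun ${testMethodName(name, "Interactions")}() {")
        appendLine("        runInteractionTest(\"$name\")")
        appendLine("    }")
    }
    appendLine("}")
}
```

Because each generated method only passes a class name to the base class, adding a new test type in this sketch would mean adding one more method template rather than touching any Fragment.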
The code‑generation approach offers several benefits: eliminating manual test maintenance, enabling lightweight test sharding, and making it easy to add new test types.
Design decisions include collecting only Fragment names (not each mock variant) and implementing a mock‑grouping system to keep test execution time reasonable.
Another snippet shows how mock providers are combined:
override fun provideMocks() = combineMocks(
    "Marketplace" to marketplaceMocks(),
    "Plus" to plusMocks(),
    "Plus ProHost" to plusProHostMocks()
)
The CI pipeline runs on every PR in the GitHub repository using Buildkite. It first runs the test-generation script, then determines which Fragments have changed and runs tests only for those, reducing CI time and Firebase costs.
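The article does not spell out how changed Fragments are detected. One plausible approach, sketched here under stated assumptions, is to treat a Fragment as affected when any changed file lives in the same module. The `FragmentSource` type, the `moduleOf` rule (everything before `/src/`), and the input list of changed paths (for example from `git diff --name-only`) are all hypothetical.

```kotlin
/** A Fragment found by the generation step, with the repo-relative file that declares it. */
data class FragmentSource(val fqName: String, val sourcePath: String)

/** Assumption: a module is identified by the path segment(s) before "/src/". */
fun moduleOf(path: String): String = path.substringBefore("/src/")

/** Returns the names of Fragments whose module contains at least one changed file. */
fun affectedFragments(all: List<FragmentSource>, changedPaths: List<String>): List<String> {
    val changedModules = changedPaths.map(::moduleOf).toSet()
    return all
        .filter { moduleOf(it.sourcePath) in changedModules }
        .map { it.fqName }
}
```

This sketch ignores dependencies between modules, so a change in a shared library module would not trigger tests in the modules that use it; a real pipeline would likely need to follow the dependency graph as well.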
To handle occasional Firebase Test Lab outages, the pipeline queries the Firebase status API and posts a comment on the PR with a link to the incident and guidance for developers.
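As a rough illustration of the outage check, the sketch below turns a list of already-fetched incidents into a PR comment, or into nothing when Test Lab is healthy. The `Incident` fields are assumptions rather than the real status feed schema, and the network call and JSON parsing are deliberately left out.

```kotlin
/** Minimal view of a status-page incident; field names are assumptions, not the real schema. */
data class Incident(val service: String, val url: String, val resolved: Boolean)

/** Builds the PR comment for ongoing incidents, or returns null when there is nothing to report. */
fun outageComment(incidents: List<Incident>, service: String = "Test Lab"): String? {
    val active = incidents.filter {
        !it.resolved && it.service.contains(service, ignoreCase = true)
    }
    if (active.isEmpty()) return null
    return buildString {
        appendLine("Firebase Test Lab is reporting an ongoing incident, so UI test failures on this PR may be unrelated to your change.")
        active.forEach { appendLine("- ${it.url}") }
        append("Retry the build once the incident is resolved.")
    }
}
```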
For large test suites, the open‑source tool Flank is used to shard tests across multiple Firebase test matrices, cutting total execution time from hours to minutes.
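To show why sharding shortens wall-clock time, here is a small greedy balancer that assigns the longest tests first to whichever shard is currently lightest. It is an illustration of the general idea only and is not Flank's actual algorithm; the function name and the duration map are hypothetical.

```kotlin
/**
 * Splits tests into `shardCount` groups with roughly equal total duration.
 * `durations` maps a test name to its expected run time in seconds.
 */
fun shardByDuration(durations: Map<String, Double>, shardCount: Int): List<List<String>> {
    require(shardCount > 0) { "shardCount must be positive" }
    val shards = List(shardCount) { mutableListOf<String>() }
    val totals = DoubleArray(shardCount)
    // Longest tests first, each placed on the shard with the smallest running total.
    for ((test, seconds) in durations.entries.sortedByDescending { it.value }) {
        val lightest = totals.indices.minByOrNull { totals[it] }!!
        shards[lightest].add(test)
        totals[lightest] += seconds
    }
    return shards
}
```

With balanced shards running in parallel, total time approaches the duration of the heaviest shard instead of the sum of all tests.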
After tests finish, the pipeline uploads Flank output as Buildkite artifacts, extracts failed test matrices from the JUnit report, and posts a PR comment with links to the failures.
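Pulling failures out of a JUnit XML report can be sketched with the JDK's built-in DOM parser. The example assumes the conventional `testcase` elements with `failure` or `error` children; the exact layout of Flank's report, including how a failure maps back to its test matrix link, is not covered here.

```kotlin
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Element

/** Returns "classname#name" for every test case that contains a failure or error element. */
fun failedTests(junitXml: String): List<String> {
    val doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder()
        .parse(ByteArrayInputStream(junitXml.toByteArray()))
    val cases = doc.getElementsByTagName("testcase")
    val failed = mutableListOf<String>()
    for (i in 0 until cases.length) {
        val case = cases.item(i) as Element
        val broken = case.getElementsByTagName("failure").length > 0 ||
            case.getElementsByTagName("error").length > 0
        if (broken) failed += "${case.getAttribute("classname")}#${case.getAttribute("name")}"
    }
    return failed
}
```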
Visual regression is performed with Happo; the pipeline generates a Happo report for the PR and compares it with the master branch, posting a comment if differences are found.
Code coverage is collected with JaCoCo, combined from integration and unit test runs, and posted back to the PR.
Finally, the pipeline posts a summary comment to the PR, updating or deleting previous comments as needed via a custom GitHub API wrapper.
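The update-or-delete behavior can be modeled as a pure decision function that a GitHub API wrapper would then execute. This is a sketch under assumptions: the hidden marker string used to recognize the pipeline's own comment, and the `PrComment` and `CommentAction` types, are hypothetical and not part of the wrapper described in the article.

```kotlin
/** A comment already on the PR, as a wrapper might return it. */
data class PrComment(val id: Long, val body: String)

/** What the wrapper should do next. */
sealed class CommentAction {
    data class Create(val body: String) : CommentAction()
    data class Update(val id: Long, val body: String) : CommentAction()
    data class Delete(val id: Long) : CommentAction()
    object None : CommentAction()
}

/**
 * Decides how to reconcile the PR with the desired summary.
 * `marker` is a hidden tag (e.g. an HTML comment) that identifies our own comment;
 * `newBody` is null when there is nothing left to report.
 */
fun planComment(existing: List<PrComment>, marker: String, newBody: String?): CommentAction {
    val previous = existing.firstOrNull { it.body.contains(marker) }
    return when {
        newBody == null && previous != null -> CommentAction.Delete(previous.id)
        newBody == null -> CommentAction.None
        previous == null -> CommentAction.Create("$marker\n$newBody")
        previous.body == "$marker\n$newBody" -> CommentAction.None
        else -> CommentAction.Update(previous.id, "$marker\n$newBody")
    }
}
```

Keeping the decision separate from the HTTP calls makes the comment logic easy to unit test and avoids stacking a new comment on the PR for every build.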
The article concludes with future improvement ideas such as automated deep‑link testing, end‑to‑end tests that hit production APIs, custom Espresso tests, performance benchmarks with Jetpack Benchmark, and testing of R8/ProGuard‑optimized builds.
The testing framework is being open‑sourced as an extension of the MvRx library starting from version 2.0.0‑alpha.
Airbnb Technology Team
Official account of the Airbnb Technology Team, sharing Airbnb's tech innovations and real-world implementations, building a world where home is everywhere through technology.