How Airbnb Automates Android Interaction Testing with JSON Diff
Airbnb’s Android team built an automated interaction testing system that traverses view hierarchies, clicks each interactive view, and records the resulting actions and state changes as JSON snapshots. Diffing those snapshots verifies UI behavior without hand-written Espresso tests, and the results are reported in CI through Happo.
Background and Motivation
In the second part of Airbnb’s Android automation series, screenshot testing was introduced to catch visual regressions. However, screenshot tests cannot verify code that handles user interactions such as click events, which constitute a large portion of the app’s logic and are frequent sources of bugs.
Limitations of Traditional Espresso Tests
Views must be located by ID or position, which can change across releases.
Scrolling is required for views inside scrollable containers like RecyclerView.
Asynchronous operations need extra handling and often lead to flaky tests.
Even when these issues are mitigated, manually writing tests for every possible interaction is tedious and may miss details such as passed parameters or network requests.
Core Idea: Interaction Approval Testing
All observable changes caused by a click can be expressed as text.
Every clickable View in an Activity can be programmatically clicked and its result observed, producing a report that maps views to actions.
Pages can be tested in isolation; side‑effects that affect other pages are expressed via interfaces (e.g., opening a new page or returning a result).
We do not need to test inter‑page navigation itself—only the handling of inputs and the verification of outputs.
Implementation Steps
Ensure the test page is built with simulated state and wait for the layout to stabilize.
Depth‑first traverse the view hierarchy, locating each View that can be clicked or long‑pressed.
Execute a click, record the generated Action, and prevent further logic from running.
Output the result of each click to a JSON file.
Compare JSON files between runs to detect changes, similar to screenshot diffing.
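The steps above can be pictured with a small, framework-free model. This is only a sketch: `ViewNode`, `collect_clickable`, and `run_interaction_test` are hypothetical names, the click handlers stand in for real Android listeners, and Python is used purely to keep the example runnable outside an Android project.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


# Hypothetical stand-in for an Android View; not a real framework class.
@dataclass
class ViewNode:
    view_id: str
    on_click: Optional[Callable[[List[str]], None]] = None
    children: List["ViewNode"] = field(default_factory=list)


def collect_clickable(root: ViewNode) -> List[ViewNode]:
    """Depth-first, pre-order walk returning every view with a click handler."""
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.on_click is not None:
            found.append(node)
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(node.children))
    return found


def run_interaction_test(root: ViewNode) -> Dict[str, List[str]]:
    """Click each clickable view and map its id to the actions it produced."""
    report = {}
    for view in collect_clickable(root):
        actions: List[str] = []
        view.on_click(actions)  # the handler appends a description of each action
        report[view.view_id] = actions
    return report


# A mocked screen: two clickable views and one plain label.
screen = ViewNode("root", children=[
    ViewNode("toolbar", children=[
        ViewNode("close_button", on_click=lambda a: a.append("finish_activity")),
    ]),
    ViewNode("book_button", on_click=lambda a: a.append("start_activity:Checkout")),
    ViewNode("title"),
])

report = run_interaction_test(screen)
report_json = json.dumps(report, indent=2, sort_keys=True)
```

The resulting `report_json` string is what would be written to disk and compared against the previous run.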
Handling RecyclerView and Dynamic View Changes
Before traversing, the test scrolls any RecyclerView to make all child views visible. After a click that may modify the view hierarchy (e.g., fragment transaction, dialog), the activity is reset by removing all fragments and re‑adding a fresh simulated fragment, allowing the traversal to continue from the last index.
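A minimal sketch of the reset-and-resume loop, assuming the page can be rebuilt deterministically from the same mock state so that a saved index still points at the same view. `build_screen` and `run_with_resets` are hypothetical names, and each handler here simply reports whether it changed the hierarchy.

```python
from typing import Callable, Dict, List, Tuple

# A handler returns (action description, whether the view hierarchy changed).
Handler = Callable[[], Tuple[str, bool]]


def build_screen() -> List[Tuple[str, Handler]]:
    """Hypothetical factory that rebuilds the page from the same mock state."""
    return [
        ("show_dialog_button", lambda: ("show_dialog", True)),
        ("like_button", lambda: ("set_state:liked=true", False)),
        ("open_details_row", lambda: ("fragment_transaction:Details", True)),
    ]


def run_with_resets() -> Tuple[Dict[str, str], int]:
    report: Dict[str, str] = {}
    resets = 0
    views = build_screen()
    index = 0
    while index < len(views):
        view_id, handler = views[index]
        action, hierarchy_changed = handler()
        report[view_id] = action
        index += 1
        if hierarchy_changed:
            # Throw away the modified hierarchy, rebuild a fresh one, and
            # continue from the saved index instead of starting over.
            views = build_screen()
            resets += 1
    return report, resets


report, resets = run_with_resets()
```

Only clicks that alter the hierarchy pay the cost of a rebuild; the other views are tested against the same instance.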
Recording Actions
Two categories of actions are captured:
Android framework actions such as fragment transactions or activity launches.
Airbnb‑specific actions like MvRx state updates or network requests.
Fragment lifecycle callbacks are registered to capture fragment stack changes, and reflection on WindowManagerGlobal is used to detect dialogs (e.g., AlertDialog, BottomSheetDialog) and extract their titles, contents, and button texts.
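One way to picture "record the action and stop there" is a recording double that takes the place of whatever interface the page uses to cause side effects. The sketch below is an illustration, not Airbnb's implementation: `RecordingNavigator`, `on_book_clicked`, the activity name, and the request path are all hypothetical.

```python
from typing import Any, Dict, List, Optional


class RecordingNavigator:
    """Hypothetical stand-in for the interface a page uses to cause side effects."""

    def __init__(self) -> None:
        self.actions: List[Dict[str, Any]] = []

    def start_activity(self, target: str, **extras: Any) -> None:
        # Framework-style action: record it and return without launching anything.
        self.actions.append(
            {"type": "start_activity", "target": target, "extras": extras}
        )

    def execute_request(self, method: str, path: str,
                        body: Optional[Dict[str, Any]] = None) -> None:
        # App-specific action: record the request instead of hitting the network.
        self.actions.append(
            {"type": "network_request", "method": method, "path": path, "body": body}
        )


def on_book_clicked(navigator: RecordingNavigator, listing_id: int) -> None:
    """Example click handler under test (hypothetical)."""
    navigator.execute_request("POST", "/reservations", body={"listing_id": listing_id})
    navigator.start_activity("CheckoutActivity", listing_id=listing_id)


navigator = RecordingNavigator()
on_book_clicked(navigator, listing_id=42)
recorded = navigator.actions
```

Because nothing is actually launched or sent, the page stays in its mocked state and the next view can be clicked immediately.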
Capturing Non‑Visual View Data
Beyond click events, the system records non‑visual attributes that screenshot tests miss, such as:
Accessibility contentDescription.
URLs loaded in WebView or ImageView.
Configuration of VideoView.
During traversal, a callback is invoked for each view; it inspects the view's type and adds any desired information to the report.
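The per-view callback can be sketched as a type check that contributes extra fields to the report. The `View` and `ImageView` classes below are simplified, hypothetical models rather than the Android classes, and the field names are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class View:
    view_id: str
    content_description: Optional[str] = None
    children: List["View"] = field(default_factory=list)


@dataclass
class ImageView(View):
    url: Optional[str] = None


def describe(view: View) -> Dict[str, str]:
    """Callback run for every view: pick out non-visual data based on its type."""
    info: Dict[str, str] = {}
    if view.content_description:
        info["content_description"] = view.content_description
    if isinstance(view, ImageView) and view.url:
        info["image_url"] = view.url
    return info


def collect_view_data(root: View) -> Dict[str, Dict[str, str]]:
    """Walk the hierarchy and keep only views that contributed some data."""
    report: Dict[str, Dict[str, str]] = {}

    def visit(view: View) -> None:
        info = describe(view)
        if info:
            report[view.view_id] = info
        for child in view.children:
            visit(child)

    visit(root)
    return report


tree = View(view_id="root", children=[
    ImageView(view_id="hero_image", content_description="Listing photo",
              url="https://example.com/hero.jpg"),
    View(view_id="spacer"),
])
view_data = collect_view_data(tree)
```

Supporting another view type means adding one more branch to `describe`, without touching the traversal.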
JSON Report Format
The report consists of a JSON object per clickable view. The outermost key encodes the view’s hierarchical path using parent view IDs, ensuring a stable identifier across builds.
Example entries show the originating fragment, parameters, request codes, and any subsequent actions such as activity launches, log entries, or state updates.
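As an illustration of the path-based key, the sketch below walks from a view up to the root and joins the IDs it finds. The `->` separator, the `View` model, and the choice to skip views without an ID are assumptions made for this example; the source does not spell out the exact key format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class View:
    id_name: Optional[str]  # resource ID name, or None when the view has no ID
    parent: Optional["View"] = None


def path_key(view: View) -> str:
    """Build a report key from the IDs of the view and its ancestors."""
    parts: List[str] = []
    node: Optional[View] = view
    while node is not None:
        if node.id_name:  # views without an ID do not contribute to the key
            parts.append(node.id_name)
        node = node.parent
    return "->".join(reversed(parts))


content = View("content")
wrapper = View(None, parent=content)  # anonymous layout container
listing_list = View("listing_list", parent=wrapper)
book_button = View("book_button", parent=listing_list)

key = path_key(book_button)
```

Under these assumptions, adding or removing an anonymous wrapper layout leaves the key untouched, which keeps the report stable across purely structural refactors.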
Readability, Diffability, and Consistency
Key names in the JSON are chosen for clarity. While metadata can aid readability, excessive metadata (e.g., RecyclerView item indices) can cause noisy diffs when items are added or removed. The goal is for the report diff to show a change only when a view’s behavior actually changes.
To keep diffs stable, keys are sorted and identifiers are based on view IDs rather than positional indices. For objects lacking a meaningful toString(), reflection is used to generate a deterministic string representation, and Android resource integers are replaced with their symbolic names.
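The three stabilizing techniques (sorted keys, a reflection-based string form, and symbolic resource names) can be sketched together. Everything here is illustrative: the `RESOURCE_NAMES` table and its integer are made up, `CheckoutArgs` is a hypothetical parameter object, and treating any matching integer as a resource ID is a simplification.

```python
import json
from typing import Any

# Hypothetical lookup from resource integers to symbolic names.
RESOURCE_NAMES = {2131230001: "R.string.book_now"}


def stable_repr(value: Any) -> Any:
    """Convert a value into JSON-friendly data that is identical on every run."""
    if value is None or isinstance(value, (bool, float, str)):
        return value
    if isinstance(value, int):
        # Swap known resource integers for their names; leave other ints alone.
        return RESOURCE_NAMES.get(value, value)
    if isinstance(value, dict):
        return {str(k): stable_repr(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [stable_repr(v) for v in value]
    # Fallback for objects whose default string form includes a memory address:
    # reflect over the instance attributes instead.
    fields = {name: stable_repr(v) for name, v in vars(value).items()}
    return {"class": type(value).__name__, **fields}


def to_stable_json(value: Any) -> str:
    # sort_keys makes the output independent of insertion order.
    return json.dumps(stable_repr(value), sort_keys=True)


class CheckoutArgs:
    def __init__(self, listing_id: int, button_label: int) -> None:
        self.listing_id = listing_id
        self.button_label = button_label


stable = to_stable_json({"args": CheckoutArgs(42, 2131230001)})
```

Without the fallback branch, the default string form of `CheckoutArgs` would embed an object address and differ on every run, producing a diff even though nothing changed.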
Action Types Demonstrated
Activity closure with returned data.
Schema‑based internal logging.
Recording arbitrary view properties such as content description or image URLs.
Toolbar option clicks (name, ID, resulting action).
Fragment switches with full parameter capture.
Network requests including method, headers, and body.
ViewModel state updates, property changes, and collection mutations.
Integration with CI
The JSON snapshots are uploaded to AWS and visualized alongside screenshot diffs using the Happo library. Pull‑request builds generate a combined report, allowing engineers to see both UI and interaction changes in a single view.
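The comparison between a base snapshot and a pull-request snapshot can be approximated locally with a plain text diff. This sketch uses Python's standard `difflib` and is not how Happo renders its reports; `diff_reports` and the sample data are hypothetical.

```python
import difflib
import json
from typing import Any, Dict, List


def diff_reports(old: Dict[str, Any], new: Dict[str, Any]) -> List[str]:
    """Return unified-diff lines between two interaction reports."""
    # Serialize with sorted keys so ordering differences never show up as changes.
    old_lines = json.dumps(old, indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(new, indent=2, sort_keys=True).splitlines()
    return list(difflib.unified_diff(
        old_lines, new_lines,
        fromfile="base", tofile="pull_request", lineterm="",
    ))


base = {"book_button": ["start_activity:Checkout"]}
changed = {"book_button": ["start_activity:CheckoutV2"]}

unchanged_diff = diff_reports(base, dict(base))
behavior_diff = diff_reports(base, changed)
```

An empty diff means no interaction changed; any non-empty diff needs to be reviewed and approved, in the same way as a screenshot diff.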
Future Work
Programmatically edit EditText content and verify outcomes.
Capture onActivityResult callbacks.
Record side effects triggered during fragment creation and destruction (e.g., network calls, log entries) and include them in the final report.
The approach has been running for a few months and already reduces manual test maintenance while providing richer regression data.
Airbnb Technology Team
Official account of the Airbnb Technology Team, sharing Airbnb's tech innovations and real-world implementations, building a world where home is everywhere through technology.
