
Baidu Mini‑Program Online Quality Assurance System: AI‑Driven Automated Traversal, Page Anomaly Detection, and Cloud‑Phone Cluster

This article describes how Baidu built an end‑to‑end online quality‑assurance platform for its mini‑program ecosystem, leveraging AI‑powered automated traversal, intelligent page‑exception detection, and a scalable cloud‑phone cluster to identify red‑line issues, improve audit efficiency, and reduce manual effort.

Baidu Intelligent Testing

1. Overall Background

Baidu mini‑programs run in massive numbers across dozens of host apps, creating three challenges for QA: huge quantity, diverse hosts, and many distribution scenarios. QA must both internally ensure framework quality and externally guarantee online program health, reducing red‑line incidents while improving audit speed.

2. Baidu Mini‑Program Online Quality‑Assurance System

The system aims to automatically discover red‑line problems across the entire online mini‑program fleet, protect the ecosystem, and boost audit efficiency. It focuses on three core capabilities:

Automated traversal to collect runtime information.

Page‑exception detection on the collected screenshots, DOM, and source code.

A cloud‑phone cluster that provides large‑scale parallel inspection resources.

3. Development of Automated Traversal Capability

3.1 Foundations – Baidu Mini‑Program Automated Test Engine

The engine consists of three parts:

Bat Engine – core code injected into the mini‑program runtime via hot‑load, establishing two‑way communication with Bat Agent.

Bat Agent – a WebSocket server running on the PC, providing multi‑device, multi‑program synchronous control.

Bat Driver – a NodeJS client library offering simple APIs for writing traversal scripts.

The engine supports four endpoints (real device, development board, cloud phone, web) across all host apps, achieving 90% coverage, ~100 ms per command, and 99.9% stability over >10 million cloud tasks.
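To make the Bat Driver / Bat Agent interaction concrete, here is a minimal Python sketch of the command/response flow. The real Bat Driver is a NodeJS library and its actual API is not documented here, so every class, method, and message field below is an assumption used purely for illustration; the fake transport stands in for the WebSocket link to Bat Agent.

```python
import json


class BatDriverSketch:
    """Illustrative stand-in for a Bat Driver-style client.

    Sends JSON commands over a WebSocket-like transport (anything with a
    send(str) -> str method) and returns the parsed reply. All names here
    are hypothetical, not the real Bat Driver API.
    """

    def __init__(self, transport):
        self.transport = transport
        self._seq = 0  # monotonically increasing request id

    def _call(self, method, **params):
        self._seq += 1
        request = {"id": self._seq, "method": method, "params": params}
        reply = self.transport.send(json.dumps(request))
        return json.loads(reply)

    def launch(self, app_id):
        return self._call("launch", appId=app_id)

    def tap(self, x, y):
        return self._call("tap", x=x, y=y)

    def screenshot(self):
        return self._call("screenshot")


class EchoTransport:
    """Fake transport that acknowledges every command, for demonstration."""

    def send(self, message):
        request = json.loads(message)
        return json.dumps({"id": request["id"], "status": "ok"})


driver = BatDriverSketch(EchoTransport())
print(driver.launch("demo-mini-program"))  # {'id': 1, 'status': 'ok'}
```

A real traversal script would chain such calls (launch, screenshot, tap a recognized control, screenshot again), with Bat Agent fanning the same commands out to many devices in parallel.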

3.2 Initial Exploration

Stage 1 used a monkey‑style random traversal, quickly gathering data but suffering low click‑through rates and high resource consumption. Stage 2 introduced behavior‑prediction based on historical audit and QA testing data, using neural‑network models to predict clickable regions from screenshots. Stage 3 shifted to target‑recognition‑based traversal, focusing on key controls identified through image segmentation, OCR, icon detection, color analysis, and page‑structure‑tree generation.

3.3 Final Solution – Target‑Recognition Traversal

The final workflow includes:

Selecting key controls based on red‑line standards and audit feedback.

Generating a page‑structure tree by image slicing, OCR, icon detection, color analysis, element attribute judgment, aggregation, and block division.

Recognizing controls (article lists, product cards, bottom tabs) using element distribution and visual features.

Applying deep‑learning models (YOLO V3) to supplement tree‑based detection and improve recall.

Extending from single controls to scenario‑based checks (e.g., login flow).

Deploying the traversal scripts and models on a cloud strategy center that communicates with the scripts over the network.
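The aggregation and block-division step above can be sketched with a simple heuristic: group detected elements (OCR text boxes, icons) into vertical blocks whenever the gap between them is small. This is a deliberately simplified stand-in for Baidu's page-structure-tree generation; the element format and the gap threshold are assumptions.

```python
def group_into_blocks(elements, gap=20):
    """Group detected page elements into vertical blocks.

    Each element is a tuple (x, y, w, h, kind), e.g. from OCR or icon
    detection. Elements separated by less than `gap` pixels vertically
    are merged into one block - a toy version of the aggregation /
    block-division step of page-structure-tree generation.
    """
    if not elements:
        return []
    ordered = sorted(elements, key=lambda e: e[1])  # sort by top edge
    blocks, current = [], [ordered[0]]
    for el in ordered[1:]:
        prev = current[-1]
        if el[1] - (prev[1] + prev[3]) <= gap:
            current.append(el)  # close enough: same block
        else:
            blocks.append(current)
            current = [el]
    blocks.append(current)
    return blocks


# Example: a title close to its summary, then a distant bottom tab.
page = [
    (0, 0, 300, 30, "text"),    # article title
    (0, 35, 300, 60, "text"),   # article summary
    (0, 600, 300, 50, "icon"),  # bottom tab bar
]
print(len(group_into_blocks(page)))  # 2
```

On top of such blocks, control recognition would then apply the element-distribution and visual-feature rules described above (e.g. repeated image-plus-text blocks suggest an article list).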

4. Page‑Exception Detection Capability

4.1 Overview

The detection pipeline processes screenshots, text, runtime DOM, and source code using computer‑vision, NLP, and static‑code analysis to spot red‑line, abnormal, or experience issues.

4.2 Screenshot Comparison

Traditional similarity metrics fail due to device‑specific variations. Baidu combines deep convolutional feature extraction with page‑structure‑tree positional checks to achieve robust similarity judgments.
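The combined judgment can be sketched as follows: accept two pages as similar only if both their deep-feature vectors are close (cosine similarity) and their matched structure-tree boxes overlap positionally (IoU). The feature extractor itself is omitted; the thresholds and the pairing of boxes are assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors (plain lists)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def iou(a, b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def pages_similar(feat_a, feat_b, boxes_a, boxes_b,
                  feat_thresh=0.9, iou_thresh=0.5):
    """Deep-feature check first, then a positional structure-tree check."""
    if cosine(feat_a, feat_b) < feat_thresh:
        return False
    ious = [iou(a, b) for a, b in zip(boxes_a, boxes_b)]
    avg = sum(ious) / len(ious) if ious else 1.0
    return avg >= iou_thresh
```

The positional check is what tolerates device-specific shifts: two renders of the same page keep roughly the same layout even when raw pixels differ.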

4.3 White‑Screen Detection

Three white‑screen scenarios are handled:

Full or partial white screens – color and complexity analysis per region yields a white‑screen rate compared against configurable thresholds.

Long loading screens – detection of loading icons and text.

Partial image‑load failures – DOM‑based resource checks and structure‑tree‑based image‑error identification.

To reduce false positives on multi‑level pages, Baidu adds scene‑aware rules (e.g., empty shopping‑cart pages are acceptable) and compares current metrics against historical trends for each page.
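The per-region color-and-complexity analysis can be sketched in a few lines: split the screenshot into grid cells and count a cell as blank when it is both near-white on average and low in variance. The grid size, white level, and variance floor here are illustrative stand-ins for the configurable thresholds the article mentions.

```python
def white_screen_rate(gray, grid=(4, 4), white_level=245, var_floor=5.0):
    """Fraction of grid cells that look blank in a grayscale screenshot.

    gray: 2-D list of 0-255 luma values. A cell counts as blank when its
    mean is near-white AND its standard deviation (a crude complexity
    measure) is low, so plain white text areas are not flagged alone.
    """
    h, w = len(gray), len(gray[0])
    rows, cols = grid
    blank = 0
    for r in range(rows):
        for c in range(cols):
            pixels = [gray[y][x]
                      for y in range(r * h // rows, (r + 1) * h // rows)
                      for x in range(c * w // cols, (c + 1) * w // cols)]
            mean = sum(pixels) / len(pixels)
            std = (sum((p - mean) ** 2 for p in pixels) / len(pixels)) ** 0.5
            if mean >= white_level and std <= var_floor:
                blank += 1
    return blank / (rows * cols)


all_white = [[255] * 16 for _ in range(16)]
print(white_screen_rate(all_white))  # 1.0
```

In the full pipeline this rate would be compared against a per-page threshold, with the scene-aware rules (e.g. a legitimately empty shopping cart) and historical trends filtering out the remaining false positives.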

5. Cloud‑Phone Cluster Construction

Three generations of the test farm were built to balance coverage, cost, and operational overhead:

Version 1.0: Large inventory of real devices, manually operated.

Version 2.0: Unified development boards replace many real devices, enabling semi‑automatic maintenance.

Version 3.0: Baidu Cloud’s cloud‑phone service replaces hardware, cutting cost dramatically and enabling fully automated operations, with fewer than ten lines of code needed to control each instance.
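To convey what "fewer than ten lines per instance" looks like in practice, here is a purely illustrative stub: `CloudPhone` and its methods are hypothetical stand-ins, not Baidu Cloud's actual cloud-phone SDK, and the stub only records the calls it receives.

```python
class CloudPhone:
    """Hypothetical cloud-phone handle; records operations for illustration."""

    def __init__(self, instance_id):
        self.instance_id, self.log = instance_id, []

    def install(self, host_app):
        self.log.append(("install", host_app))

    def open_mini_program(self, app_id):
        self.log.append(("open", app_id))

    def run_traversal(self, strategy):
        self.log.append(("traverse", strategy))

    def collect_artifacts(self):
        return list(self.log)  # screenshots/DOM dumps in a real system


# A complete inspection task in a handful of calls:
phone = CloudPhone("cp-001")
phone.install("baidu-host")
phone.open_mini_program("demo-mini-program")
phone.run_traversal("target-recognition")
```

The operational win of version 3.0 is that such scripts can be fanned out to thousands of instances with no physical devices to rack, flash, or repair.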

6. Business Impact

Daily real‑device inspection volume exceeds 200k tasks, with a >99.3% success rate.

Red‑line issues for all mini‑programs are now recalled in near‑real time, enabling pre‑submission checks and continuous post‑launch monitoring.

Since launch, >100k problems have been identified, affecting >80k programs.

Automated detection now accounts for >82% of all recalled issues, most of which are auto‑remediated.

The system has supported over ten major operational campaigns with a >85% recall rate, ensuring smooth event execution.

Tags: AI, Automated Testing, Quality Assurance, Image Recognition, Mini‑Programs, Cloud Phone