How AI Turns Real‑World Operations into Automated E2E Test Cases

This article details an AI-driven end-to-end testing solution that automatically generates test cases from real operation logs. It compares traditional DOM-based testing with AI-enhanced approaches, selects Midscene + Qwen2.5-VL-72B as the execution engine, and builds a four-stage pipeline that delivers code-coverage metrics, platform dashboards, and a quality-feedback loop for rapid product iteration.

DeWu Technology

Project Background

Rapid growth of the transaction business and frequent front-end migrations (Vue → React → full-stack) raised the bar for quality assurance. To keep pace with fast iteration, the team needed an automated, intelligent E2E testing solution that could cover core flows, support regression verification, and provide reliable quality guarantees under existing resource constraints.

Value Benefits

The solution provides automated E2E test cases derived from real online operation behavior, discovers experience issues during page refactoring, and improves regression efficiency. Reported metrics include page code‑coverage ≥ X % and step‑execution success rate ≥ X %.

Solution Selection

A side‑by‑side comparison of traditional DOM‑based E2E and AI‑enhanced E2E was performed.

Traditional E2E (DOM): locates elements via XPath/CSS selectors; test cases are manually written or recorded; maintenance cost is high because every DOM change requires a manual update.

AI E2E: locates UI elements through visual recognition and semantic analysis, generates test cases automatically from operation logs, reduces maintenance cost, and can explore unexpected paths; the trade-offs are black-box opacity and higher model-training costs.

Based on the analysis, the AI‑driven approach was chosen for its ability to support refactoring scenarios while significantly automating test case generation.

AI Tool Selection

Two candidate toolsets were evaluated:

Midscene: native JavaScript + WebExtensions API, open-source AI-driven UI automation; supports Playwright/Puppeteer; offers both DOM and visual analysis (e.g., Qwen-VL, Chain-VL). Advantages: JavaScript-centric, low development cost, clear iteration rhythm, multimodal analysis. Disadvantage: token consumption and cost when invoking visual models.

browser-use: Python + LLM + browser driver, based on Playwright; supports visual and non-visual models; provides both DOM and visual analysis. Advantage: comprehensive capability. Disadvantages: token cost and the Python-stack learning curve.

Midscene was selected because it matches the existing JavaScript tech stack, offers multimodal analysis, and provides a clear engineering integration path.

Process Design

The pipeline consists of four stages:

Intelligent Test Case Generation – Convert operation logs from the performance monitoring system into executable scripts. Steps include data ingestion, behavior parsing (filtering noise, extracting atomic actions), script conversion (mapping actions to click/input/select with element descriptions), and case storage with unique IDs.
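A minimal sketch of this generation stage, in the team's JavaScript stack. The log field names (`eventType`, `target`, `value`) and the ID scheme are assumptions for illustration, not the monitoring system's actual schema:

```javascript
// Sketch: convert raw operation-log events into an executable test case.
// Field names (eventType, target, value) are assumed, not the real log schema.

const ATOMIC_ACTIONS = new Set(['click', 'input', 'select']);

function parseBehavior(events) {
  // Behavior parsing: filter noise (scrolls, mousemoves) and keep atomic actions.
  return events.filter((e) => ATOMIC_ACTIONS.has(e.eventType));
}

function toTestCase(events, pageUrl) {
  // Script conversion: map each atomic action to a step with a natural-language
  // element description that a visual model can later locate on screen.
  const steps = parseBehavior(events).map((e, i) => ({
    index: i + 1,
    action: e.eventType,
    target: e.target,       // e.g. "the 'Buy Now' button"
    value: e.value ?? null, // only meaningful for input/select
  }));
  return {
    id: `case-${pageUrl}-${Date.now()}`, // case storage with a unique ID
    pageUrl,
    steps,
  };
}
```

The key design point is that `target` stays a semantic description rather than an XPath, so the case survives DOM refactors.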

Flexible Execution Trigger – Schedule static test cases as dynamic tasks. Supports manual selection, timed jobs, and CI/CD‑driven event triggers. Tasks are pushed to a queue, enabling asynchronous, high‑concurrency execution.
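The queueing behavior can be sketched as follows. In production this would sit behind a message broker; here an in-memory queue with a concurrency cap (a simplifying assumption) illustrates how all three trigger types funnel into asynchronous, bounded-concurrency execution:

```javascript
// Minimal sketch of the execution task queue: every trigger (manual, timed,
// CI/CD event) pushes a task; at most `concurrency` tasks run at once.

class TaskQueue {
  constructor(concurrency) {
    this.concurrency = concurrency;
    this.running = 0;
    this.pending = [];
  }

  push(task) {
    // All trigger sources share this single entry point.
    return new Promise((resolve, reject) => {
      this.pending.push({ task, resolve, reject });
      this._drain();
    });
  }

  _drain() {
    // Start queued tasks while there is spare concurrency.
    while (this.running < this.concurrency && this.pending.length > 0) {
      const { task, resolve, reject } = this.pending.shift();
      this.running += 1;
      Promise.resolve()
        .then(task)
        .then(resolve, reject)
        .finally(() => {
          this.running -= 1;
          this._drain(); // a finished slot pulls the next pending task
        });
    }
  }
}
```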

AI‑Driven Execution – A backend service launches a headless browser, injects Midscene and the Qwen‑VL model, and performs smart replay. The AI interprets visual screenshots, matches intent to UI elements, executes actions, applies adaptive waiting, multi‑round locating, and handles exceptions (pop‑ups, network errors).
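The "adaptive waiting plus multi-round locating" behavior can be sketched as a generic retry wrapper. The wrapped `step` stands in for any AI locate-and-act call (for example a Midscene agent action); the round counts and delays are illustrative assumptions:

```javascript
// Sketch of multi-round locating during smart replay: retry an AI step with
// growing (adaptive) waits so slow-loading pages have time to settle.
// `step` is any async function, e.g. a Midscene agent action.

async function withAdaptiveRetry(step, { rounds = 3, baseDelayMs = 200 } = {}) {
  let lastError;
  for (let round = 0; round < rounds; round += 1) {
    try {
      return await step(round); // success on any round ends the loop
    } catch (err) {
      lastError = err;
      // Adaptive waiting: exponential back-off between locating rounds.
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** round));
    }
  }
  throw lastError; // rounds exhausted: surface the failure for the step report
}
```

On final failure the error propagates to the recording stage, where it becomes the step's error stack and screenshot evidence.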

Platform Data Operation – After each step, success/failure status, error stack, screenshot, and code‑coverage data are recorded. Results are merged into a page‑level coverage report, persisted in a database, and the case status is updated to success or failure.
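The coverage-merging step can be sketched as a set union per file. The `{ file -> covered line numbers }` shape is a simplifying assumption; real V8/Istanbul coverage data is richer, but the merge logic is the same idea:

```javascript
// Sketch: merge per-step line coverage into one page-level report.
// Shape assumption: each step report is { filePath: [coveredLineNumbers] }.

function mergeCoverage(stepReports) {
  const merged = new Map(); // filePath -> Set of covered lines
  for (const report of stepReports) {
    for (const [file, lines] of Object.entries(report)) {
      if (!merged.has(file)) merged.set(file, new Set());
      const bucket = merged.get(file);
      for (const line of lines) bucket.add(line); // union across steps
    }
  }
  return merged;
}

function coverageRate(merged, totalLines) {
  // totalLines: { filePath: total line count } -> page-level coverage ratio.
  let covered = 0;
  let total = 0;
  for (const [file, lines] of merged) {
    covered += lines.size;
    total += totalLines[file] ?? 0;
  }
  return total === 0 ? 0 : covered / total;
}
```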

This end‑to‑end flow transforms real operation behavior into test cases, executes them intelligently, and closes the quality loop.

Technical Highlights

Test case generation directly from real‑world operation behavior.

Execution powered by Midscene + Qwen2.5‑VL‑72B, delivering precise visual UI interaction.

AI‑driven smart replay with adaptive waiting, multi‑locator strategies, and automatic error handling.

Code coverage collected as a hard metric, merged across steps to evaluate case value.

Platform Effects

Four dashboards provide actionable insights:

Page Overview Dashboard: aggregates PV/UV, case count, success-rate trend, and coverage to identify high-risk or low-coverage pages.

Case Management List: filterable by domain, owner, success rate, and tags; shows source and execution history, and allows manual labeling.

Deep Dive: step-by-step replay with error stack, screenshots, and per-step coverage, turning black-box failures into white-box evidence.

Asset Optimization: uses coverage data to flag low-value cases for cleanup and promotes stable cases to the core regression suite.
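The Asset Optimization rule above can be sketched as a simple triage over per-case metrics. The thresholds and field names (`coverageGain`, `successRate`) are illustrative assumptions, not the platform's actual values:

```javascript
// Sketch of case triage: flag low-value cases for cleanup, promote stable
// high-value cases to the core regression suite. Thresholds are illustrative.

function triageCases(cases, { minCoverageGain = 0.01, minSuccessRate = 0.9 } = {}) {
  const cleanup = [];
  const core = [];
  for (const c of cases) {
    if (c.coverageGain < minCoverageGain) {
      cleanup.push(c.id); // adds almost no unique coverage -> candidate removal
    } else if (c.successRate >= minSuccessRate) {
      core.push(c.id);    // valuable and stable -> promote to regression suite
    }
  }
  return { cleanup, core };
}
```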

Summary & Outlook

The AI‑driven E2E testing solution successfully automates test case generation, reduces maintenance overhead, improves coverage, and creates a data‑driven quality feedback loop. Future work will focus on enhancing model accuracy, extending from execution to quality prediction, and further standardizing the platform toward fully intelligent quality assurance.

Core process flow diagram
Tags: code coverage, CI/CD, AI testing, Midscene, E2E automation, platform analytics, Qwen2.5-VL-72B