How AI Transforms XPath Generation for Robust UI Automation
This article explains how an AI‑driven tool dramatically reduces XPath creation costs and improves stability for complex 3D‑canvas web applications by combining prompt engineering, HTML compression, self‑checking, retry mechanisms, and specialized handling for inputs and 3D model selectors.
Background
At Qunhe Technology, UI automation is crucial for product quality, but the 3D‑canvas‑centric KuJiaLe cloud design tool presents challenges such as high XPath maintenance costs due to dynamic elements.
The team explored techniques to make XPath expressions more robust and cheaper to maintain, and eventually built an AI‑driven XPath generation tool.
1. Tool Effect Demonstration
Multiple modes for different scenarios
XPath result conversion for personalized needs
Compatibility with KuJiaLe 3D model positioning
Highly unified XPath style
The AI‑generated XPath closely matches human‑written conventions.
2. Plugin Design
The tool is built as a Tampermonkey script combined with an AI workflow, an approach chosen for its low development barrier and its fit with the existing test framework.
2.1 Full Process Diagram
The plugin sends HTML data, target node, and generation rules to the AI, which returns a robust XPath.
HTML is compressed to reduce token cost while preserving accuracy; generated XPath undergoes self‑validation and retry for stability.
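As a rough illustration of what the plugin hands to the AI, the request can be pictured as a small structured payload. This is only a sketch: the class name `XPathRequest`, its field names, and the mode value are hypothetical, and it is written in Python for brevity even though the plugin itself runs as a browser script.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class XPathRequest:
    """Hypothetical payload sent from the plugin to the AI workflow."""

    compressed_html: str    # page HTML after compression
    target_outer_html: str  # serialized form of the element to locate
    mode: str = "default"   # assumed mode name; the article mentions several modes
    rules: list[str] = field(default_factory=list)  # generation rules as plain text


def to_json(request: XPathRequest) -> str:
    """Serialize the request so it can be posted to the workflow endpoint."""
    return json.dumps(asdict(request), ensure_ascii=False)
```

Keeping the rules as data rather than hard-coding them in the prompt would make it easier to vary them per scenario.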
2.2 AI Workflow Design
Different scenarios use specific LLMs and prompts; prompts define AI roles, output format, and include many positive and negative XPath examples to reduce hallucinations.
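The idea of pairing each scenario with its own model and prompt can be sketched as a lookup table plus a prompt builder. Everything named here is an assumption for illustration: the model identifiers are placeholders, and the role text, output format, and example XPaths are invented rather than taken from the actual tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    model: str                      # placeholder identifier, not a real model name
    role: str                       # role description placed at the top of the prompt
    good_examples: tuple[str, ...]  # XPaths the model should imitate
    bad_examples: tuple[str, ...]   # XPaths the model should avoid


# Hypothetical scenario table; the real tool's modes and prompts are not published.
SCENARIOS = {
    "default": Scenario(
        model="model-a",
        role="You are a UI automation engineer who writes robust XPath.",
        good_examples=("//button[@data-testid='save']",),
        bad_examples=("/html/body/div[3]/div[2]/button[1]",),
    ),
    "input": Scenario(
        model="model-b",
        role="You write XPath for form inputs, anchored on their labels.",
        good_examples=("//label[text()='Width']/following-sibling::input",),
        bad_examples=("//input[@id='rc_select_17']",),
    ),
}


def build_prompt(mode: str, html: str, target: str) -> tuple[str, str]:
    """Return (model, prompt) for the requested scenario."""
    scenario = SCENARIOS[mode]
    lines = [
        scenario.role,
        'Reply with JSON only: {"xpaths": ["...", "..."]}',
        "Good examples:",
        *[f"  + {xp}" for xp in scenario.good_examples],
        "Bad examples (do not imitate):",
        *[f"  - {xp}" for xp in scenario.bad_examples],
        "HTML:",
        html,
        "Target node:",
        target,
    ]
    return scenario.model, "\n".join(lines)
```

Fixing the output format in the prompt keeps parsing simple, and the negative examples show the model which brittle patterns (absolute paths, generated ids) to avoid.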
3. Key Issues Explained
3.1 Ensuring XPath Accuracy
Self‑checking: JavaScript validates whether the XPath matches a unique target element.
Retry mechanism: Up to two retries with alternative LLMs if validation fails.
Model selection: Different LLMs are used for different scenarios to improve stability.
Multiple results: At least two XPath candidates are offered for user selection.
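The self-check and retry steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the plugin's code: the real check runs as JavaScript in the browser, whereas this sketch uses Python's `xml.etree.ElementTree`, which supports only a limited XPath subset and requires relative paths such as `.//button`. The function names, the `ask` callback, and the model list are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional

MAX_RETRIES = 2  # the article mentions up to two retries with alternative LLMs


def matches_only_target(root: ET.Element, xpath: str, target: ET.Element) -> bool:
    """True when the XPath selects exactly one element and it is the target."""
    try:
        found = root.findall(xpath)
    except SyntaxError:  # raised by ElementTree for paths it cannot parse
        return False
    return len(found) == 1 and found[0] is target


def generate_with_retry(
    ask: Callable[[str], str],
    models: list[str],
    is_valid: Callable[[str], bool],
) -> Optional[str]:
    """Ask the primary model, then up to MAX_RETRIES alternatives in order."""
    for model in models[: 1 + MAX_RETRIES]:
        candidate = ask(model)
        if is_valid(candidate):
            return candidate
    return None  # every attempt failed validation
```

Passing the validator in as a callback keeps the retry loop independent of how the check is performed, so the same loop would work with a browser-side check.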
3.2 HTML Data Compression Strategy
The HTML sent to the model is limited to key page regions (left/right panels, header, canvas), and unnecessary tags such as scripts and images are stripped.
Further simplifications include extracting only modal HTML or specific module hierarchies.
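A tag-stripping pass of this kind might look like the sketch below, which uses Python's standard `html.parser`. The list of dropped tags is an assumption (the article names only scripts and images as examples), the class and function names are hypothetical, and region selection is left out.

```python
import re
from html import escape
from html.parser import HTMLParser

DROPPED_TAGS = {"script", "style", "noscript", "svg", "img"}  # assumed list
VOID_TAGS = {"area", "br", "col", "embed", "hr", "img", "input", "link", "meta"}


class HtmlCompressor(HTMLParser):
    """Re-emit HTML without dropped elements or whitespace-only text."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out: list[str] = []
        self.skip_depth = 0  # greater than zero while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROPPED_TAGS:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        if self.skip_depth:
            return
        attr_text = "".join(
            f" {name}" if value is None else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
        )
        self.out.append(f"<{tag}{attr_text}>")

    def handle_endtag(self, tag):
        if tag in DROPPED_TAGS:
            if tag not in VOID_TAGS and self.skip_depth:
                self.skip_depth -= 1
            return
        if not self.skip_depth and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth or not data.strip():
            return
        # Collapse whitespace runs; visible text is otherwise kept as-is.
        self.out.append(escape(re.sub(r"\s+", " ", data), quote=False))


def compress_html(html: str) -> str:
    parser = HtmlCompressor()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Attributes are kept untouched here because they are often what a robust XPath anchors on. Dropping more of them would save further tokens but risks removing the very hooks the model needs.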
3.3 Shortcut Key Trigger
Because some elements appear only on hover, the tool uses a keyboard shortcut to invoke AI generation, avoiding mouse‑related side effects.
3.4 Special Logic for INPUT Elements
XPath generation for inputs relies on associated label elements to improve readability and extensibility; a dedicated INPUT mode with custom prompts handles these cases.
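To illustrate why anchoring on a label reads better than anchoring on a generated id, here is a small sketch that builds such an XPath from the label text. The helper names are hypothetical, and the DOM layout is an assumption: the input is taken to be the first `input` element after the label in document order, not nested inside it.

```python
def xpath_literal(text: str) -> str:
    """Quote a string as an XPath 1.0 literal, using concat() if both quote kinds occur."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # XPath 1.0 has no escape character, so split on single quotes and rejoin.
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join(f"'{part}'" for part in parts) + ")"


def input_xpath_by_label(label_text: str) -> str:
    """XPath for the first input that follows a label with the given visible text."""
    return f"//label[normalize-space()={xpath_literal(label_text)}]/following::input[1]"
```

For a label reading "Width", this yields `//label[normalize-space()='Width']/following::input[1]`, which stays valid when the input's generated attributes change and adapts to other fields by swapping the label text.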
3.5 Compatibility with 3D Model Locator
Traditional XPath cannot locate 3D canvas elements, so a proprietary 3D model selector is integrated into the plugin.
4. Deployment Results
Over four months of use, the tool has averaged 80 XPath generations per week, with over 60% of results copied by users. More than 80% of XPaths are generated within five seconds, and each generation consumes 1k‑12k tokens.
User surveys report a 10%‑30% efficiency gain in UI automation authoring.
5. Future Plans and Insights
Remaining issues are occasional non‑unique XPaths and timeouts (≈20% retry rate), plus variability in results across runs.
Introduce an AI scoring mechanism to filter results before finalizing.
Add manual retry with customizable rules for higher accuracy.
AI also dramatically accelerated development of the tool itself: over 95% of the code in the demo was AI‑assisted, which enabled rapid prototyping even with limited H5 (HTML5 front‑end) skills.
