How AI is Transforming Automation: From Scripts to Intelligent Systems
This article examines the evolution of automation from basic scripting to AI‑driven intelligent systems, compares traditional and smart automation across multiple dimensions, and showcases practical implementations using Playwright, MidScene.js, and Chrome bridge mode with code examples for web and mobile testing.
Evolution and Current State of Automation Technology
In the wave of digital transformation, automation has progressed from simple script execution to complex systems with AI‑driven decision‑making. Gartner predicts that by 2025 more than 70% of enterprises will adopt some form of AI‑powered automation, boosting efficiency and adaptability.
Traditional Automation vs Intelligent Automation
Traditional tools handle repetitive tasks well but struggle with dynamic web elements and complex user interactions. AI‑driven automation, by contrast, uses machine‑learning models to interpret the context of a page, decide what to do next, and adjust its execution strategy in real time.
The two approaches differ in element locating (precise selectors vs. a visual‑semantic hybrid), workflow design (fixed paths vs. goal‑based dynamic paths), exception handling (pre‑defined try‑catch vs. real‑time diagnosis), and test data (static vs. dynamically generated); in each pair the traditional approach is listed first. They also differ in maintenance cost, execution speed, accuracy, and the scenarios each one suits.
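The element‑locating difference can be illustrated with a minimal sketch: try the precise selector first, and fall back to a semantic match when the selector no longer exists. Everything here is hypothetical (the `UiElement` and `Target` shapes, `semanticScore`, and `locate`), and the word‑overlap score is only a stand‑in for the visual or LLM‑based matching a real tool would use.

```typescript
// Hypothetical, simplified view of a page element; real tools inspect the live DOM.
interface UiElement {
  id: string;
  role: string;
  label: string;
}

// What the script is looking for: a hard-coded id plus a human description.
interface Target {
  selectorId: string;
  role: string;
  description: string;
}

// Split text into lowercase words, dropping empty fragments.
function words(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
}

// Count description words that also appear in the label (0 if the role differs).
function semanticScore(el: UiElement, target: Target): number {
  if (el.role !== target.role) return 0;
  const labelWords = new Set(words(el.label));
  return words(target.description).filter((w) => labelWords.has(w)).length;
}

function locate(elements: UiElement[], target: Target): UiElement | undefined {
  // 1. Traditional path: exact selector match.
  const exact = elements.find((el) => el.id === target.selectorId);
  if (exact) return exact;
  // 2. Fallback: the element with the highest non-zero semantic score.
  let best: UiElement | undefined;
  let bestScore = 0;
  for (const el of elements) {
    const score = semanticScore(el, target);
    if (score > bestScore) {
      best = el;
      bestScore = score;
    }
  }
  return best;
}

// After a redesign the button id changed, so the exact selector no longer matches.
const redesignedPage: UiElement[] = [
  { id: "signup-2", role: "button", label: "Sign up" },
  { id: "submit-2", role: "button", label: "Log in" },
  { id: "help-link", role: "link", label: "Log in help" },
];
const loginButton = locate(redesignedPage, {
  selectorId: "login-btn",
  role: "button",
  description: "log in",
});
console.log(loginButton?.id); // logs "submit-2"
```

A selector‑only script would fail on this page, while the hybrid lookup still finds the renamed button. The trade‑off is that a semantic match can be wrong, which is one reason accuracy appears among the differences.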
Code Comparison
Traditional Automation
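import { expect } from '@playwright/test';

// Log in using hard-coded selectors; the test breaks if any of these ids change.
async function testLogin(page) {
  await page.fill('#username', 'testuser');
  await page.fill('#password', 'Pass123!');
  await page.click('#login-btn');
  await expect(page).toHaveURL(/dashboard/);
}
Intelligent Automation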
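// `ai` stands for an illustrative AI helper object, not a specific library API.
async function smartLogin(page, ai) {
  const context = {
    pageHTML: await page.content(),
    task: "Complete the login",
    constraints: "Use valid test credentials"
  };
  // Ask the model for a step-by-step plan based on the current page.
  const plan = await ai.generateActionPlan(context);
  for (const action of plan.actions) {
    if (action.type === 'fill') {
      // Locate the field by description rather than by a fixed selector.
      const element = await ai.locateElement({ page, description: action.field });
      await element.fill(await ai.generateTestData(action.field));
    }
    // handle other action types...
  }
  // Check the outcome semantically and return the verdict to the caller.
  return ai.verifyOutcome({ page, expected: "Login succeeded" });
}
Because elements are located by description rather than by fixed selectors, this approach can adapt automatically when the structure of the login form changes.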
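The `smartLogin` loop handles only `fill` and leaves the other action types open. Since a model‑generated plan is effectively untrusted input, one option is to narrow it to a closed set of action types before executing anything. The sketch below assumes a plan shape of `{ actions: [...] }` as in the example; the `click` and `wait` actions, the `ms` field, and the `toAction` and `parsePlan` helpers are hypothetical additions for illustration.

```typescript
// Hypothetical action shapes; only `type` and `field` appear in the login example.
type Action =
  | { type: "fill"; field: string }
  | { type: "click"; field: string }
  | { type: "wait"; ms: number };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Narrow one untrusted value to a known Action, or return undefined.
function toAction(raw: unknown): Action | undefined {
  if (!isRecord(raw)) return undefined;
  const type = raw["type"];
  const field = raw["field"];
  const ms = raw["ms"];
  if (type === "fill" && typeof field === "string") return { type: "fill", field };
  if (type === "click" && typeof field === "string") return { type: "click", field };
  if (type === "wait" && typeof ms === "number" && ms >= 0) return { type: "wait", ms };
  return undefined;
}

// Split a raw plan into executable actions and rejected entries.
function parsePlan(raw: unknown): { actions: Action[]; rejected: unknown[] } {
  const items: unknown[] =
    isRecord(raw) && Array.isArray(raw["actions"]) ? raw["actions"] : [];
  const actions: Action[] = [];
  const rejected: unknown[] = [];
  for (const item of items) {
    const action = toAction(item);
    if (action) {
      actions.push(action);
    } else {
      rejected.push(item);
    }
  }
  return { actions, rejected };
}

// Example: the second entry is not a known action type, so it is rejected.
const parsed = parsePlan({
  actions: [
    { type: "fill", field: "username" },
    { type: "eval", code: "document.cookie" },
    { type: "wait", ms: 500 },
  ],
});
console.log(parsed.actions.length, parsed.rejected.length); // logs 2 1
```

With the plan narrowed this way, the execution loop can switch over `action.type` knowing every case, and rejected entries can be logged or sent back to the model for a corrected plan.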
Technologies Used
What is Playwright?
Playwright is a cross‑browser, cross‑platform web automation and testing tool from Microsoft, supporting Chromium, Firefox, and WebKit. It provides a unified API for end‑to‑end testing, UI automation, screenshot & PDF generation, dynamic page scraping, and performance monitoring.
What is MidScene.js?
MidScene.js is an AI‑enhanced automation framework that adds large language model (LLM) capabilities to traditional tools like Playwright, enabling natural‑language task description, multimodal interaction, low‑code/no‑code friendliness, and enterprise‑grade extensibility.
Technical Architecture
Web or Mobile Automation
Web Automation
MidScene.js integrates with either Puppeteer or Playwright. The installation command and a demo script for each are given here.
Example with Puppeteer:
npm install @midscene/web puppeteer tsx --save-dev

import puppeteer from "puppeteer";
import { PuppeteerAgent } from "@midscene/web/puppeteer";

const sleep = (ms: number) => new Promise(r => setTimeout(r, ms));

// Run the demo inside an immediately invoked async function.
Promise.resolve((async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.setViewport({ width: 1280, height: 800, deviceScaleFactor: 1 });
  await page.goto("https://www.ebay.com");
  await sleep(5000);

  // Wrap the page in an AI agent and drive it with natural-language instructions.
  const agent = new PuppeteerAgent(page);
  await agent.aiAction('type "Headphones" in search box, hit Enter');
  await sleep(5000);

  // Extract structured data by describing the desired shape.
  const items = await agent.aiQuery('{itemTitle: string, price: Number}[], find item in list and corresponding price');
  console.log("headphone items", items);

  // Assert on the page in plain language.
  await agent.aiAssert("There is a category filter on the left");
  await browser.close();
})());
Playwright Integration
npm install @midscene/web playwright @playwright/test tsx --save-dev

import { chromium } from 'playwright';
import { PlaywrightAgent } from '@midscene/web/playwright';
// Loads environment variables from a .env file; requires the dotenv package to be installed.
import 'dotenv/config';

const sleep = (ms: number) => new Promise(r => setTimeout(r, ms));

// Run the demo inside an immediately invoked async function.
Promise.resolve((async () => {
  const browser = await chromium.launch({ headless: true, args: ['--no-sandbox', '--disable-setuid-sandbox'] });
  const page = await browser.newPage();
  await page.setViewportSize({ width: 1280, height: 768 });
  await page.goto('https://www.ebay.com');
  await sleep(5000);

  const agent = new PlaywrightAgent(page);
  await agent.aiAction('type "Headphones" in search box, hit Enter');
  // Wait on a described condition instead of a fixed delay.
  await agent.aiWaitFor('there is at least one headphone item on page');

  // Structured extraction.
  const items = await agent.aiQuery('{itemTitle: string, price: Number}[], find item in list and corresponding price');
  console.log('headphones in stock', items);

  // Typed single-value queries.
  const isMoreThan1000 = await agent.aiBoolean('Is the price of the headphones more than 1000?');
  console.log('isMoreThan1000', isMoreThan1000);
  const price = await agent.aiNumber('What is the price of the first headphone?');
  console.log('price', price);
  const name = await agent.aiString('What is the name of the first headphone?');
  console.log('name', name);
  const location = await agent.aiLocate('What is the location of the first headphone?');
  console.log('location', location);

  // Assertion and interaction in plain language.
  await agent.aiAssert('There is a category filter on the left');
  await agent.aiTap('the first item in the list');
  await browser.close();
})());
Chrome Bridge Mode
MidScene.js offers a Chrome bridge mode allowing scripts to control a desktop Chrome instance, reusing cookies, extensions, and page state.
npm install @midscene/web tsx --save-dev

import { AgentOverChromeBridge } from "@midscene/web/bridge-mode";

const sleep = (ms: number) => new Promise(r => setTimeout(r, ms));

// Run the demo inside an immediately invoked async function.
Promise.resolve((async () => {
  const agent = new AgentOverChromeBridge();
  // Open a new tab in the already running desktop Chrome.
  await agent.connectNewTabWithUrl("https://www.bing.com");
  await agent.ai('type "AI 101" and hit Enter');
  await sleep(3000);
  await agent.aiAssert("there are some search results");
  // Release the bridge connection.
  await agent.destroy();
})());
Set MIDSCENE_CACHE=1 to enable caching and accelerate test execution.
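To see why caching speeds up repeated runs, consider a minimal sketch of the general idea: remember the result of each AI call, keyed by the page and the instruction, so an identical request does not reach the model again. This is not MidScene's implementation, whose cache format is not described here; `PromptCache`, `getOrResolve`, and `askModel` are hypothetical names, and the sketch keeps entries in memory only.

```typescript
// Hypothetical in-memory cache keyed by page URL and natural-language instruction.
class PromptCache<T> {
  private readonly entries = new Map<string, Promise<T>>();

  private static key(url: string, prompt: string): string {
    return JSON.stringify([url, prompt]);
  }

  // Return the cached promise, or call `resolve` once and remember its promise.
  getOrResolve(url: string, prompt: string, resolve: () => Promise<T>): Promise<T> {
    const key = PromptCache.key(url, prompt);
    let hit = this.entries.get(key);
    if (hit === undefined) {
      hit = resolve();
      this.entries.set(key, hit);
      // Do not keep failed lookups, so a later run can retry them.
      hit.catch(() => {
        this.entries.delete(key);
      });
    }
    return hit;
  }
}

// Stand-in for an expensive model call; counts how often it is invoked.
let modelCalls = 0;
const askModel = (): Promise<string> => {
  modelCalls += 1;
  return Promise.resolve("#search-input");
};

const cache = new PromptCache<string>();
cache.getOrResolve("https://www.bing.com", "the search box", askModel);
cache.getOrResolve("https://www.bing.com", "the search box", askModel);
console.log(modelCalls); // logs 1
```

A cached answer is only valid while the page stays the same, so any cache of this kind needs a way to be invalidated when the UI under test changes.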
Android Automation
Automation on Android devices can be performed using the MCP tool.
JD Tech Talk
Official JD Tech public account delivering best practices and technology innovation.
