How to Convert Requirements into Playwright Test Scripts Using Python
This article walks through a Python‑based test orchestrator that reads product requirements, generates Playwright + Pytest scripts via an LLM, executes them, analyzes failures, automatically fixes the code, and repeats the cycle until all tests pass or the retry limit is reached.
Overview
The project implements a closed‑loop test automation system written in Python. It reads a req.txt file containing product requirements, prompts a large language model (LLM) to generate Playwright + Pytest test code, runs the tests with pytest, analyzes any failures, asks the LLM to fix the code, and iterates until the suite passes or the maximum retry count is reached.
Main entry point (main.py)
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""One‑click start for the test automation system"""
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent / "skills" / "test-orchestrator" / "scripts"))
from orchestrator import TestOrchestrator
def main():
    orchestrator = TestOrchestrator(
        req_file="req.txt",
        max_retry=5,
        headless=True
    )
    success = orchestrator.run()
    if success:
        print("\nTest automation complete: all cases passed!")
        print("Final test file: outputs/test_current.py")
    else:
        print("\nMaximum retry count reached; manual inspection required")
        print("Fix report: outputs/fix_report.json")
    return 0 if success else 1
if __name__ == "__main__":
    sys.exit(main())
TestOrchestrator (orchestrator.py)
The TestOrchestrator class coordinates the whole workflow. It initializes paths, logging, and the three component objects (TestGenerator, TestExecutor, TestFixer). Key methods:
read_requirements() – loads the requirement file.
save_version() – stores each generated or fixed script under outputs/test_history with a timestamp.
run_single_iteration() – generates code on the first pass, otherwise calls the fixer; writes the script to outputs/test_current.py, executes it via TestExecutor, records the result, and updates the fix report.
run() – loops up to max_retry times, stopping early when all tests pass. A sketch of this loop follows the constructor excerpt below.
class TestOrchestrator:
    def __init__(self, req_file="req.txt", max_retry=5, headless=True, output_dir="outputs", llm_config=None, custom_rules=None):
        self.req_file = Path(req_file)
        self.max_retry = max_retry
        self.headless = headless
        self.output_dir = Path(output_dir)
        self.history_dir = self.output_dir / "test_history"
        self.output_dir.mkdir(parents=True, exist_ok=True)
        self.history_dir.mkdir(parents=True, exist_ok=True)
        Path("logs").mkdir(exist_ok=True)
        self.generator = TestGenerator()
        self.executor = TestExecutor(headless=headless)
        self.fixer = TestFixer()
        self.current_test_file = self.output_dir / "test_current.py"
        self.fix_report_file = self.output_dir / "fix_report.json"
        self.fix_history = []
        self.iteration = 0
        self.preserve_punctuation = True
        ...
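The remaining methods are elided above. A minimal sketch of the retry loop, assuming the component interfaces described in this article (the fixer's call signature in particular is an assumption, not the article's exact code):

    def run(self) -> bool:
        """Sketch: iterate up to max_retry times, stopping once the suite passes."""
        requirements = self.read_requirements()
        result = None
        for i in range(1, self.max_retry + 1):
            self.iteration = i
            result = self.run_single_iteration(requirements, result)
            if result["success"]:
                return True
        return False

    def run_single_iteration(self, requirements, last_result):
        """Sketch: generate on the first pass, fix on later passes, then execute."""
        if last_result is None:
            code = self.generator.generate(requirements)
        else:
            # Illustrative call; the real fixer interface is not shown in the article.
            code = self.fixer.fix(requirements, self.current_test_file.read_text(encoding="utf-8"), last_result)
        self.current_test_file.write_text(code, encoding="utf-8")
        self.save_version(code)  # timestamped copy under outputs/test_history
        result = self.executor.run(str(self.current_test_file))
        self.fix_history.append({"iteration": self.iteration, "success": result["success"]})
        return result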
TestExecutor (executor.py)
Runs a generated test file with pytest, captures stdout/stderr, parses lines containing PASSED or FAILED, extracts the test names, and returns a structured dictionary with a success flag, passed/failed lists, error message, exit code, and execution duration. It also handles timeouts and generic exceptions.
class TestExecutor:
    def __init__(self, headless: bool = True):
        self.headless = headless

    def run(self, test_file: str) -> Dict:
        test_path = Path(test_file)
        if not test_path.exists():
            return {"success": False, "error": f"Test file does not exist: {test_file}", "exit_code": -1, "passed": [], "failed": []}
        report_file = test_path.parent / "report.json"  # JSON report path (defined here for completeness; elided in the original excerpt)
        cmd = ["pytest", str(test_path), "-v", "--tb=short", "--color=no"]
        if not self.headless:
            cmd.append("--headed")
        cmd.extend(["--json-report", f"--json-report-file={report_file}"])
        ...
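The output-parsing step is elided above. A hedged sketch of how the PASSED/FAILED lines in pytest -v output could be turned into the result dictionary (the helper name is invented for illustration):

from typing import Dict, List

def parse_pytest_output(stdout: str) -> Dict:
    """Illustrative helper: collect test names from pytest -v output lines."""
    passed: List[str] = []
    failed: List[str] = []
    for line in stdout.splitlines():
        # pytest -v prints lines like "test_current.py::test_login[case1] PASSED"
        if " PASSED" in line:
            passed.append(line.split(" PASSED")[0].strip())
        elif " FAILED" in line:
            failed.append(line.split(" FAILED")[0].strip())
    return {"success": bool(passed) and not failed, "passed": passed, "failed": failed}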
TestFixer (fixer.py)
Uses the OpenAI-compatible API (DashScope) to classify errors, build a detailed fix prompt, and request corrected code. The prompt includes the original requirement text, the failing code snippet, and explicit repair rules (preserve punctuation, add waits, handle database cleanup, etc.). After receiving the LLM response, the fixer extracts the code block, validates that the required elements (def test, self.page, and expect() calls) are present, and retries up to three times.
class TestFixer:
    def __init__(self):
        self.client = OpenAI(api_key=os.getenv("DASHSCOPE_API_KEY"), base_url="https://dashscope.aliyuncs.com/compatible-mode/v1")
        self.model = os.getenv("QWEN_MODEL", "qwen-plus")
        self.max_retries = 3

    def _classify_error(self, error_msg: str, traceback: str = "") -> Dict:
        error_lower = error_msg.lower()
        if "timeout" in error_lower:
            if "expect" in error_lower:
                return {"type": "AssertionTimeout", "severity": "high"}
            return {"type": "NavigationTimeout", "severity": "medium"}
        ...
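The extraction and validation logic is elided above. A minimal sketch of the required-element check the article describes (the helper name and regex are assumptions):

import re
from typing import Optional

def extract_and_validate(content: str) -> Optional[str]:
    """Illustrative helper: pull the fenced code block out of the LLM reply
    and confirm the elements the article lists as required are present."""
    match = re.search(r"```(?:python)?\s*(.*?)```", content, re.DOTALL)
    code = match.group(1).strip() if match else content.strip()
    required = ["def test", "self.page", "expect("]
    if all(token in code for token in required):
        return code
    return None  # caller retries, up to max_retries times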
TestGenerator (generator.py)
Constructs a prompt that tells the LLM to produce Playwright + Pytest code directly from the requirement text. The prompt enforces strict punctuation preservation, parameterized test design, specific locator strategies, and a set of technical specifications (sync-API import, parametrize usage, a fixture for database cleanup, etc.). The returned code is cleaned, checked for the required imports, and handed back to the orchestrator.
class TestGenerator:
    def generate(self, requirements: str) -> str:
        requirements = self._sanitize_text(requirements)
        prompt = f"""
You are a Playwright + Pytest automation testing expert. Based on the following product requirements, generate executable end-to-end test code directly.
## Product requirements
{requirements}
## Core requirements
...
"""
        response = self.client.chat.completions.create(model=self.model, messages=[{"role": "user", "content": prompt}], temperature=0.0, max_tokens=8000)
        content = response.choices[0].message.content
        return self._extract_code(content)
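The cleaning step is only summarized above. A sketch of how the required imports might be guaranteed (the helper name and exact import list are assumptions based on the description):

def ensure_required_imports(code: str) -> str:
    """Illustrative post-processing: prepend any required import the generated script is missing."""
    required = [
        "import pytest",
        "from playwright.sync_api import expect",
    ]
    missing = [line for line in required if line not in code]
    return "\n".join(missing + [code]) if missing else code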
Skill definition (SKILL.md)
The markdown file declares the skill name test-orchestrator, its version, and a concise description of the end-to-end workflow (read requirements → generate tests → run → auto-fix → repeat). It also lists the command-line parameters and the artifacts produced (outputs/test_current.py, outputs/fix_report.json, logs/test_run.log).
Requirement documents
Three requirement sections are provided: registration, login, and password‑recovery. Each lists field IDs (e.g., #username, #password), validation rules, expected error messages, database schema (tables user, code, password), and URLs. The document also enumerates normal and error test flows, which the generator uses to create parameterized test cases.
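The article does not reproduce req.txt itself; an invented excerpt in the same spirit, using only the elements the summary mentions, might look like:

Registration
  URL: (page address given in the document)
  Fields: #username, #password, with validation rules and length limits for each
  Error messages: exact expected text for every invalid input
  Database: tables user, code, password (used by cleanup fixtures)
  Flows: one normal registration flow plus the enumerated error flows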
Overall, the article presents a reproducible, Python‑centric framework for turning textual product specifications into reliable Playwright test suites, leveraging LLMs for both code generation and automated debugging.
Woodpecker Software Testing
The Woodpecker Software Testing public account shares software testing knowledge, connects testing enthusiasts, founded by Gu Xiang, website: www.3testing.com. Author of five books, including "Mastering JMeter Through Case Studies".