How AI‑Generated Test Cases Transformed Tencent Ads R&D Workflow
This article details how Tencent's advertising R&D team tackled lengthy, experience‑driven test case creation by deploying AIGC‑powered demand analysis, Prompt + RAG knowledge retrieval, and multi‑stage automated validation, ultimately boosting test case adoption from under 20% to nearly 60% while reducing manual effort and iteration time.
Business Background
Tencent Ads R&D faces a long advertising chain, many teams, frequent version iterations, and complex traffic sources, making testing and quality control difficult. The existing flow requires manual test case writing, which is time‑consuming and heavily reliant on individual experience.
Solution Overview
The team introduced an AIGC‑driven pipeline called “AIGC需求转手工用例” (AIGC requirement‑to‑manual‑case conversion) to automatically generate high‑quality, standardized manual test cases from TAPD requirements, reducing workload and improving consistency.
Automatic generation of manual test cases – Using AIGC with RAG knowledge bases to produce test points and cases directly from requirements.
Standardized four‑section case format – Converting generated cases into a four‑section template for downstream automation.
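As a minimal sketch of the second step, assuming the four sections are title, preconditions, steps, and expected results (the exact section names are not confirmed by the source), normalizing a loosely structured generated case into the fixed template might look like:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class FourSectionCase:
    # Assumed four-section layout: title, preconditions, steps, expected results.
    title: str
    preconditions: List[str] = field(default_factory=list)
    steps: List[str] = field(default_factory=list)
    expected: List[str] = field(default_factory=list)

def to_four_sections(raw: dict) -> FourSectionCase:
    """Normalize a free-form generated case into the fixed template."""
    return FourSectionCase(
        title=raw.get("title", "").strip(),
        preconditions=[s.strip() for s in raw.get("preconditions", [])],
        steps=[s.strip() for s in raw.get("steps", [])],
        expected=[s.strip() for s in raw.get("expected", [])],
    )

case = to_four_sections({
    "title": " Verify bid submission ",
    "steps": ["Open ad console", "Submit bid"],
    "expected": ["Bid accepted"],
})
print(case.title)  # Verify bid submission
```

A fixed schema like this is what makes the cases consumable by downstream automation, since every tool can rely on the same four fields being present.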
Productization
The solution was productized with a webhook‑triggered workflow: requirement change → case generation service (Prompt + RAG) → case management in 智研测试堂 → feedback loop updating TAPD.
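The workflow above can be sketched as a single webhook handler chaining the stages; every function here is a hypothetical stand-in for the corresponding service, not Tencent's real API:

```python
# Sketch of the webhook-triggered flow: requirement change -> case generation
# (Prompt + RAG) -> case management -> feedback written back to TAPD.
# All stage functions are illustrative stubs.

def generate_cases(requirement: str) -> list:
    # Stand-in for the Prompt + RAG case generation service.
    return [f"Case for: {requirement}"]

def store_cases(cases: list, store: list) -> None:
    # Stand-in for persisting cases into the case management system.
    store.extend(cases)

def write_back_to_tapd(req_id: str, case_count: int) -> dict:
    # Stand-in for the feedback update on the TAPD requirement.
    return {"req_id": req_id, "cases_linked": case_count}

def on_requirement_changed(event: dict, store: list) -> dict:
    """Webhook handler: runs the full pipeline for one requirement change."""
    cases = generate_cases(event["title"])
    store_cases(cases, store)
    return write_back_to_tapd(event["req_id"], len(cases))

store = []
result = on_requirement_changed({"req_id": "T-123", "title": "New bid flow"}, store)
print(result)  # {'req_id': 'T-123', 'cases_linked': 1}
```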
Architecture Evolution
The initial architecture consisted of a single generation stage and achieved roughly 20% adoption. Subsequent upgrades added three key modules (requirement understanding, case generation, and feedback), which together form a closed loop.
Stage 1: Requirement Understanding
Implemented Prompt + RAG to retrieve explicit and implicit business knowledge, supplementing external TAPD links and filling knowledge gaps via a knowledge base, improving coverage of domain‑specific terms such as pt12, pt19, pt46.
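A minimal sketch of the Prompt + RAG assembly: retrieve knowledge-base entries matching terms in the requirement (including domain codes like pt12) and splice them into the generation prompt. The knowledge-base contents and the exact-substring matching are illustrative assumptions; a production system would use vector retrieval.

```python
# Illustrative knowledge base mapping domain terms to explanations.
KNOWLEDGE_BASE = {
    "pt12": "pt12 denotes a specific ad traffic source (illustrative gloss).",
    "pt19": "pt19 denotes another traffic source (illustrative gloss).",
    "bidding": "Bidding flow covers auction, ranking, and billing checks.",
}

def retrieve(requirement: str) -> list:
    """Return KB entries whose key term appears in the requirement."""
    req = requirement.lower()
    return [text for key, text in KNOWLEDGE_BASE.items() if key in req]

def build_prompt(requirement: str) -> str:
    """Splice retrieved background knowledge into the generation prompt."""
    context = "\n".join(retrieve(requirement))
    return (
        "You are a test-case author for an ad system.\n"
        f"Background knowledge:\n{context}\n"
        f"Requirement:\n{requirement}\n"
        "Generate test points covering every scenario."
    )

prompt = build_prompt("Support pt12 traffic in the bidding flow")
```

The point of the retrieval step is that terms like pt12 mean nothing to a general model; injecting their definitions into the prompt is what closes the knowledge gap.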
Stage 2: Test Case Generation
Adopted a two‑step task chain: first generate test points, then expand them into full cases, with a validation agent that checks coverage and triggers another iteration when the threshold is not met. Multi‑round experiments raised recall to 94.59%.
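The chain-plus-validation loop can be sketched as follows; the generators are stubs standing in for LLM calls, and the 0.9 threshold is an assumption, not a figure from the source:

```python
# Two-step chain with a validation agent: generate test points, expand them
# into cases, check coverage against the requirement's scenarios, and retry
# until a coverage threshold is met.

def gen_points(scenarios: list, attempt: int) -> list:
    # Stub LLM call: simulates later attempts covering more scenarios.
    covered = min(len(scenarios), 2 + attempt)
    return [f"point: {s}" for s in scenarios[:covered]]

def gen_cases(points: list) -> list:
    # Stub expansion of each test point into a full case.
    return [p.replace("point", "case") for p in points]

def coverage(points: list, scenarios: list) -> float:
    """Fraction of scenarios mentioned by at least one test point."""
    hit = sum(any(s in p for p in points) for s in scenarios)
    return hit / len(scenarios)

def run_chain(scenarios: list, threshold: float = 0.9, max_rounds: int = 5) -> list:
    points = []
    for attempt in range(max_rounds):
        points = gen_points(scenarios, attempt)
        if coverage(points, scenarios) >= threshold:
            break  # validation agent is satisfied
    return gen_cases(points)

cases = run_chain(["login", "bid", "billing", "report"])
print(len(cases))  # 4
```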
Stage 3: Feedback Loop
Historical manually corrected cases are ingested into a case knowledge base, enabling retrieval for similar future requirements and further boosting adoption.
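A sketch of that feedback loop: corrected cases are indexed by the tokens of their originating requirement and retrieved as references for similar new requirements. Similarity by naive token overlap is an assumption; a real system would likely use embeddings.

```python
# Feedback-loop case knowledge base: ingest human-corrected cases, then
# retrieve the most similar ones for a new requirement.

class CaseKnowledgeBase:
    def __init__(self):
        self.entries = []  # list of (requirement token set, corrected case)

    def ingest(self, requirement: str, corrected_case: str) -> None:
        self.entries.append((set(requirement.lower().split()), corrected_case))

    def similar(self, requirement: str, top_k: int = 1) -> list:
        """Rank stored cases by token overlap with the new requirement."""
        words = set(requirement.lower().split())
        ranked = sorted(self.entries, key=lambda e: len(e[0] & words), reverse=True)
        return [case for _, case in ranked[:top_k]]

kb = CaseKnowledgeBase()
kb.ingest("support pt12 traffic", "Case: verify pt12 billing")
kb.ingest("new report page", "Case: verify report export")
print(kb.similar("extend pt12 traffic limits"))  # ['Case: verify pt12 billing']
```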
Stage 4: Model Evaluation Service
Built an evaluation pipeline that runs a fixed test set through the AI service, compares the generated cases with expected outputs, and computes precision/recall metrics, with the whole process integrated into the 蓝盾 (BK‑CI) pipeline.
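The core of that comparison can be sketched as set-based precision/recall; matching cases by exact string is a simplification, since a real pipeline would need semantic matching:

```python
# Evaluation step: compare generated cases against the expected
# (human-authored) set and compute precision and recall.

def evaluate(generated: list, expected: list) -> dict:
    gen, exp = set(generated), set(expected)
    tp = len(gen & exp)  # true positives: cases present in both sets
    return {
        "precision": tp / len(gen) if gen else 0.0,
        "recall": tp / len(exp) if exp else 0.0,
    }

metrics = evaluate(
    generated=["case A", "case B", "case C", "case D"],
    expected=["case A", "case B", "case E"],
)
print(metrics)  # precision 0.5, recall ~0.667
```

Running this over a fixed test set on every model or prompt change is what makes the quality trend visible inside the CI pipeline.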
Prompt Auto‑Optimization
Introduced an A/B‑testing framework where the model automatically proposes improved prompts, runs parallel evaluations, and adopts the version that shows higher adoption or recall, effectively applying hyper‑parameter optimization concepts to prompt engineering.
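The selection rule reduces to a simple comparison over the same evaluation pipeline; the evaluator below is a stub with precomputed scores, whereas a real run would regenerate cases with each prompt and measure recall:

```python
# Prompt A/B selection: keep the candidate prompt only if it beats the
# current prompt on the evaluation metric.

def select_prompt(current: str, candidate: str, evaluate) -> str:
    """Return whichever prompt scores higher; ties keep the current prompt."""
    return candidate if evaluate(candidate) > evaluate(current) else current

# Stub evaluator with precomputed recall per prompt version.
recalls = {"prompt-v1": 0.81, "prompt-v2": 0.89}
best = select_prompt("prompt-v1", "prompt-v2", lambda p: recalls[p])
print(best)  # prompt-v2
```

Treating prompts as tunable parameters and metrics as the objective is what the article means by applying hyper-parameter optimization concepts to prompt engineering.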
Results
The end‑to‑end AI service increased test case adoption from below 20% to around 59%, with the knowledge base and feedback mechanisms contributing significantly to the improvement.
Tencent Advertising Technology
Official hub of Tencent Advertising Technology, sharing the team's latest cutting-edge achievements and advertising technology applications.