Build a Test‑Specific AI Agent to Auto‑Generate Pytest Cases and Analyze Allure Reports
This guide walks through an end-to-end solution for a test-focused AI agent: indexing project code and defect data, integrating a large language model via LangChain, generating compliant Pytest cases, parsing Allure reports, and deploying the result for seamless PyCharm integration.
By building a dedicated AI agent that understands project structure, historical defects, and testing standards, developers gain automatic generation of high-quality Pytest cases and intelligent analysis of Allure reports.
Why a test‑specific AI Agent?
Unlike generic AI assistants, a test‑oriented agent knows the tests/api and tests/ui directory conventions, can access private defect databases such as Jira, understands business rules like payment idempotency, and can parse Allure reports to provide actionable insights.
System Architecture
Core components include:
Knowledge base: project source code, defect data, and test documentation.
Vector engine: converts text to semantic vectors.
Large model: Qwen (high‑accuracy) or a local CodeLlama for privacy.
Agent router: selects the appropriate knowledge source based on query type.
PyCharm integration: seamless embedding within the IDE.
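The steps below assume a small workspace next to the project under test; the layout shown here simply names the files built in the rest of this guide:
agent-workspace/
├── index_code.py        # step 1.1
├── index_defects.py     # step 1.2
├── tools.py             # step 2.1
├── agent.py             # step 2.2
├── allure_analyzer.py   # step 3.1
├── test_generator.py    # step 3.2
├── server.py            # step 4.1
└── vectorstores/        # Chroma persistence directories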
Step‑by‑step implementation
1. Build the knowledge base
1.1 Index project code
# index_code.py
from langchain_community.document_loaders import DirectoryLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings

def index_project_code(project_path: str, persist_dir: str):
    # Load all Python files
    loader = DirectoryLoader(project_path, glob="**/*.py")
    docs = loader.load()
    # Split by function/class while preserving context; plain newlines as fallback
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=600,
        chunk_overlap=100,
        separators=["\ndef ", "\nclass ", "\n\n", "\n"]
    )
    splits = text_splitter.split_documents(docs)
    # Vectorize and store; Chroma persists automatically when persist_directory is set
    vectorstore = Chroma.from_documents(
        documents=splits,
        embedding=OllamaEmbeddings(model="nomic-embed-text"),
        persist_directory=persist_dir
    )
    print(f"Indexed {len(splits)} code chunks to {persist_dir}")

index_project_code("./my_project", "./vectorstores/code")
1.2 Build the defect knowledge base
# index_defects.py
import pandas as pd
from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings

def index_defects(jira_csv: str, persist_dir: str):
    df = pd.read_csv(jira_csv)
    defect_texts = []
    for _, row in df.iterrows():
        text = f"""
Defect ID: {row['issue_key']}
Summary: {row['summary']}
Description: {row['description']}
Root cause: {row['root_cause']}
Resolution: {row['resolution']}
Component: {row['component']}
"""
        defect_texts.append(text)
    # Chroma persists automatically when persist_directory is set
    vectorstore = Chroma.from_texts(
        texts=defect_texts,
        embedding=OllamaEmbeddings(model="nomic-embed-text"),
        persist_directory=persist_dir
    )

index_defects("jira_defects.csv", "./vectorstores/defects")
Recommended data sources:
Code: Git repository
Defects: Jira or Azure DevOps export
Documentation: Confluence or Markdown test specifications
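If defects live in Jira, a small export script can produce the CSV that index_defects expects. Below is a minimal sketch using the jira client library; the server URL, credentials, JQL filter, and the custom-field IDs for root cause and resolution are assumptions to replace with your instance's values:
# export_jira.py
import pandas as pd
from jira import JIRA  # pip install jira

def export_defects(csv_path: str):
    # Assumed server and credentials -- replace with your own
    client = JIRA(server="https://your-company.atlassian.net",
                  basic_auth=("bot@example.com", "API_TOKEN"))
    rows = []
    for issue in client.search_issues("type = Bug ORDER BY created DESC", maxResults=500):
        fields = issue.fields
        rows.append({
            "issue_key": issue.key,
            "summary": fields.summary,
            "description": fields.description or "",
            # Root cause / resolution usually live in custom fields; these IDs are hypothetical
            "root_cause": getattr(fields, "customfield_10050", "") or "",
            "resolution": getattr(fields, "customfield_10051", "") or "",
            "component": fields.components[0].name if fields.components else "",
        })
    pd.DataFrame(rows).to_csv(csv_path, index=False)

export_defects("jira_defects.csv")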
2. Create the AI Agent core engine
2.1 Define tools
# tools.py
from langchain_core.tools import tool
from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings

@tool
def search_code(query: str) -> str:
    """Search the project code base and return relevant snippets"""
    vectorstore = Chroma(
        persist_directory="./vectorstores/code",
        embedding_function=OllamaEmbeddings(model="nomic-embed-text")
    )
    results = vectorstore.similarity_search(query, k=3)
    return "\n".join([doc.page_content for doc in results])

@tool
def search_defects(query: str) -> str:
    """Search the historical defect library and return similar cases"""
    vectorstore = Chroma(
        persist_directory="./vectorstores/defects",
        embedding_function=OllamaEmbeddings(model="nomic-embed-text")
    )
    results = vectorstore.similarity_search(query, k=2)
    return "\n".join([doc.page_content for doc in results])
2.2 Build the agent executor
# agent.py
from langchain import hub
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_ollama import ChatOllama
from tools import search_code, search_defects

def create_test_agent():
    # Tool calling requires a chat model; swap in a hosted API such as Alibaba Cloud Bailian if preferred
    llm = ChatOllama(model="qwen:7b", temperature=0.2)
    # Load a prompt that supports function calling
    prompt = hub.pull("hwchase17/openai-functions-agent")
    tools = [search_code, search_defects]
    agent = create_tool_calling_agent(llm, tools, prompt)
    return AgentExecutor(agent=agent, tools=tools, verbose=True)

# Global agent instance
test_agent = create_test_agent()
3. Add advanced capabilities
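Before wiring in the advanced tools below, a quick sanity check confirms the core agent assembles and answers (a minimal sketch; the question is illustrative):
# smoke_test.py
from agent import test_agent

response = test_agent.invoke({"input": "Which module implements the payment retry logic?"})
print(response["output"])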
3.1 Allure report analysis
# allure_analyzer.py
import json
from langchain_core.tools import tool

def parse_allure_report(report_path: str) -> str:
    """Parse Allure report widgets and extract failure hotspots"""
    # Allure 2 writes widget data as JSON; suites.json holds per-suite statistics
    # (the exact widget layout may vary across Allure versions)
    with open(f"{report_path}/widgets/suites.json", encoding="utf-8") as f:
        suites = json.load(f)
    failed_tests = []
    for item in suites.get("items", []):
        failed_count = item.get("statistic", {}).get("failed", 0)
        if failed_count > 0:
            failed_tests.append(f"{item['name']}: {failed_count} failed cases")
    return "Recent failure hotspots:\n" + "\n".join(failed_tests[:5])

@tool
def analyze_allure() -> str:
    """Summarize failure hotspots from the latest Allure report"""
    return parse_allure_report("./allure-report")
3.2 Test case generator
# test_generator.py
from langchain_core.tools import tool
from langchain_ollama import OllamaLLM

TEST_PROMPT_TEMPLATE = """
You are a senior test engineer. Generate pytest cases for the given function.
Requirements:
1. Cover normal flow, exception flow, and boundary values.
2. Use @pytest.mark.parametrize.
3. Include clear explanatory comments.
4. Follow team naming convention: test_ prefix.
Function code:
{code}
"""

@tool
def generate_test_cases(code_snippet: str) -> str:
    """Generate test cases from a code snippet"""
    llm = OllamaLLM(model="qwen:7b")
    prompt = TEST_PROMPT_TEMPLATE.format(code=code_snippet)
    return llm.invoke(prompt)
4. Deploy and integrate
4.1 Start an HTTP service
# server.py
from flask import Flask, request, jsonify
from agent import test_agent
from allure_analyzer import analyze_allure
from test_generator import generate_test_cases

app = Flask(__name__)

@app.route('/ask', methods=['POST'])
def ask_agent():
    query = request.json['query']
    lowered = query.lower()
    if "generate test" in lowered or "write case" in lowered:
        code = request.json.get('code', '')
        result = generate_test_cases.invoke(code)  # tools are invoked, not called directly
    elif "allure" in lowered or "report" in lowered:
        result = analyze_allure.invoke({})
    else:
        response = test_agent.invoke({"input": query})
        result = response["output"]
    return jsonify({"answer": result})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080)
4.2 PyCharm integration via the REST Client plugin
### Generate test case
POST http://localhost:8080/ask
Content-Type: application/json
{
  "query": "Generate test cases for the user login function",
  "code": "def login(username, password): ..."
}
### Analyze defects
POST http://localhost:8080/ask
Content-Type: application/json
{
  "query": "What historical cases are there for payment timeouts?"
}
Effect validation
Scenario 1 – generate a compliant pytest case for calculate_discount:
import pytest

@pytest.mark.parametrize("user_level, expected_discount", [
    ("VIP", 0.8),      # VIP gets 20% off
    ("regular", 1.0),  # Regular users get no discount
    ("", 1.0),         # Empty level treated as regular
])
def test_calculate_discount(user_level, expected_discount):
    """Test discount calculation for different user levels"""
    price = 100.0
    result = calculate_discount(price, user_level)
    assert result == price * expected_discount
Scenario 2 – analyze recent payment module failures:
DEF-1234: Third‑party gateway timeout (network spikes) – add 3 s retry.
DEF-1567: Over‑selling under concurrency – implement row lock + pre‑allocation.
Recommendation: introduce network‑delay chaos experiments in automated tests.
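Acting on that recommendation can start small: simulate the gateway delay with a mock rather than real network chaos. A hypothetical sketch follows; charge_with_retry and the .charge() client method are stand-in names, not code from the project above:
# test_payment_chaos.py
from unittest.mock import Mock

def charge_with_retry(client, order_id, retries=1):
    """Stand-in for the production wrapper implied by DEF-1234's 3 s retry fix."""
    for attempt in range(retries + 1):
        try:
            return client.charge(order_id, timeout=3)
        except TimeoutError:
            if attempt == retries:
                raise

def test_charge_retries_once_after_gateway_timeout():
    # First call times out (the "network spike"), the retry succeeds
    client = Mock()
    client.charge.side_effect = [TimeoutError("gateway timeout"), {"status": "paid"}]
    assert charge_with_retry(client, "A1001") == {"status": "paid"}
    assert client.charge.call_count == 2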
Pitfalls and optimization tips
Slow response: enable GPU acceleration for Ollama; shard the knowledge base for parallel retrieval.
Inaccurate answers: refine the chunking strategy to split by function rather than by fixed length; add post-processing validation.
Memory overflow: limit vector store size; switch to FAISS for larger-scale embeddings (see the sketch below, which also shows function-aware splitting).
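The last two tips are mechanical swaps in the indexing script from step 1.1. A minimal sketch, assuming the same project path and embedding model; note that FAISS stores are saved and loaded explicitly instead of via a persist directory:
# index_code_faiss.py
from langchain_community.document_loaders import DirectoryLoader
from langchain_community.vectorstores import FAISS
from langchain_ollama import OllamaEmbeddings
from langchain_text_splitters import Language, RecursiveCharacterTextSplitter

docs = DirectoryLoader("./my_project", glob="**/*.py").load()
# from_language ships Python-aware separators (class/def boundaries) out of the box
splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.PYTHON, chunk_size=600, chunk_overlap=100
)
splits = splitter.split_documents(docs)

embeddings = OllamaEmbeddings(model="nomic-embed-text")
vectorstore = FAISS.from_documents(splits, embeddings)
vectorstore.save_local("./vectorstores/code_faiss")

# Later, at query time:
# FAISS.load_local("./vectorstores/code_faiss", embeddings, allow_dangerous_deserialization=True)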
Key success factors: high-quality knowledge injection, precise prompt design (e.g., "You are a financial testing expert"), and a feedback loop that records user ratings for continuous improvement, as sketched below.
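That feedback loop can begin as a single extra route that appends ratings to a JSONL file for later review. A hypothetical sketch; the /feedback route and file name are assumptions, not part of the server above:
# feedback.py
import json
import time
from flask import request, jsonify
from server import app

@app.route('/feedback', methods=['POST'])
def record_feedback():
    # Append each rating as one JSON line for later offline analysis
    entry = {
        "ts": time.time(),
        "query": request.json.get("query", ""),
        "answer": request.json.get("answer", ""),
        "rating": request.json.get("rating"),  # e.g. 1-5
    }
    with open("feedback.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return jsonify({"ok": True})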
Conclusion
Embedding a team’s testing expertise into an AI agent transforms implicit knowledge into reusable assets, delivering a 24/7 quality consultant, rapid onboarding for newcomers, and predictive defect‑prevention capabilities.