Automating Test Case Generation with Large Language Models and LangChain

This article describes how large language models and the LangChain framework can be combined with PDF parsing, text chunking, memory management, and a vector database to automatically generate software test cases, achieving significant efficiency gains while outlining implementation details, results, and future challenges.

JD Retail Technology

May 27, 2024

The article explains how the author leveraged large language models (LLMs) and the open‑source LangChain framework to automatically generate software test cases, aiming to improve test case writing efficiency.

It first outlines the practical effects of using AI for test case generation, then details the overall workflow, including PDF parsing with PyMuPDF, text chunking, memory management with ConversationBufferMemory, and integration with a vector database (Vearch) for scalable retrieval.
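
The chunking step can be sketched in plain Python. This is a minimal illustration, assuming fixed-size character windows with a small overlap so that a sentence cut at a boundary still appears whole in one chunk. The names `split_text`, `chunk_size`, and `overlap` are hypothetical; the article's own `PDFParse.load_pymupdf_split` helper is not shown and may split differently (for example, per page).

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character chunks of at most chunk_size, overlapping by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in split_text("abcdefghij", chunk_size=4, overlap=1):
        print(chunk)
```

In a real pipeline the input text would come from the PDF extraction step, and the chunk size would be chosen with the model's context window in mind.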

It then covers the technical specifics of each step: extracting content from PDFs, splitting files into chunks, using ConversationSummaryBufferMemory for conversation history, and choosing the IVFFLAT index type in Vearch.
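
To make the IVFFLAT idea concrete, here is a toy sketch in plain Python. An inverted-file flat index assigns each vector to its nearest centroid, and a query only scans the buckets of the closest centroids, comparing against the stored vectors exactly. The class `ToyIVFFlat` and its methods are hypothetical and are not Vearch's API. The centroids are passed in by hand here, whereas a real index would typically learn them by clustering the data.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class ToyIVFFlat:
    """Toy inverted-file index: bucket by nearest centroid, exact search inside buckets."""

    def __init__(self, centroids):
        self.centroids = [tuple(c) for c in centroids]
        self.buckets = {i: [] for i in range(len(self.centroids))}

    def _nearest_centroids(self, vec, n):
        # Rank centroid ids by distance to vec and keep the closest n.
        order = sorted(range(len(self.centroids)),
                       key=lambda i: l2(vec, self.centroids[i]))
        return order[:n]

    def add(self, doc_id, vec):
        # Each vector is stored uncompressed ("flat") in exactly one bucket.
        cid = self._nearest_centroids(vec, 1)[0]
        self.buckets[cid].append((doc_id, tuple(vec)))

    def search(self, query, k=1, nprobe=1):
        # Only the nprobe closest buckets are scanned, which is what makes
        # the search approximate but cheaper than scanning everything.
        candidates = []
        for cid in self._nearest_centroids(query, nprobe):
            candidates.extend(self.buckets[cid])
        candidates.sort(key=lambda item: l2(query, item[1]))
        return [doc_id for doc_id, _ in candidates[:k]]


if __name__ == "__main__":
    index = ToyIVFFlat(centroids=[(0, 0), (10, 10)])
    index.add("login", (1, 0))
    index.add("checkout", (9, 10))
    index.add("refund", (11, 9))
    print(index.search((10, 9), k=2, nprobe=1))  # ['refund', 'checkout']
```

Raising `nprobe` scans more buckets, trading speed for recall; with `nprobe` equal to the number of centroids the search becomes exhaustive.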

Two representative code snippets are provided: one showing the end‑to‑end case generation function using LangChain chains and memory, and another demonstrating a vector‑search‑based approach that stores documents in Vearch and retrieves relevant content for LLM prompting.

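# Import paths are assumptions and vary across LangChain versions.
# PDFParse, FilePath and LLMFactory are the author's own project helpers, not shown here.
from langchain.chains import LLMChain
from langchain.memory import ConversationBufferMemory
from langchain_core.messages import SystemMessage
from langchain_core.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, MessagesPlaceholder


def case_gen(prd_file_path, tdd_file_path, input_prompt, case_name):
    """Generate test cases from the requirement and design documents.

    Args:
        prd_file_path: path to the PRD (product requirements document)
        tdd_file_path: path to the technical design document
        input_prompt: extra instruction text spliced into the final prompt
        case_name: name of the test case file to generate
    """
    # Parse the requirement and design documents; each call returns a list of documents
    prd_file = PDFParse(prd_file_path).load_pymupdf_split()
    tdd_file = PDFParse(tdd_file_path).load_pymupdf_split()
    # Load the blank test case template
    empty_case = FilePath.read_file(FilePath.empty_case)
    # Feed the requirement and design documents into memory as the LLM's context
    prompt = ChatPromptTemplate.from_messages([
        SystemMessage(content="You are a chatbot having a conversation with a human."),
        MessagesPlaceholder(variable_name="chat_history"),
        HumanMessagePromptTemplate.from_template("{human_input}"),
    ])
    memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
    for prd in prd_file:
        memory.save_context({"input": prd.page_content}, {"output": "This is part of the requirements document, needed later to output test cases"})
    for tdd in tdd_file:
        memory.save_context({"input": tdd.page_content}, {"output": "This is part of the technical design document, needed later to output test cases"})
    llm = LLMFactory.get_openai_factory().get_chat_llm()
    human_input = "As a software test development expert, based on the product requirements and technical design information above, " + input_prompt + ", output the test cases in markdown format. The test case template is: " + empty_case
    chain = LLMChain(llm=llm, prompt=prompt, verbose=True, memory=memory)
    output_raw = chain.invoke({'human_input': human_input})
    # Write the generated test cases to a markdown file
    file_path = FilePath.out_file + case_name + ".md"
    with open(file_path, 'w', encoding='utf-8') as file:
        file.write(output_raw.get('text'))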
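
The function above replays every document chunk into ConversationBufferMemory, which keeps all of it verbatim. The summary-buffer style of memory that the article also mentions bounds the verbatim part instead: once a token budget is exceeded, the oldest messages are evicted and folded into a running summary. The sketch below illustrates that idea in plain Python under loud assumptions: `SummaryBuffer`, `count_tokens`, and `truncate_summarizer` are hypothetical names, tokens are approximated by whitespace-separated words, and the summarizer is a placeholder where a real system would call an LLM. It is not LangChain's implementation.

```python
from typing import Callable


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def truncate_summarizer(previous: str, pruned: list[str]) -> str:
    # Placeholder for an LLM call: keep the first three words of each evicted message.
    heads = [" ".join(m.split()[:3]) for m in pruned]
    return " | ".join(([previous] if previous else []) + heads)


class SummaryBuffer:
    """Keep recent messages verbatim within a token budget; summarize the rest."""

    def __init__(self, max_tokens: int, summarize: Callable[[str, list[str]], str]):
        self.max_tokens = max_tokens
        self.summarize = summarize
        self.summary = ""
        self.buffer: list[str] = []

    def add(self, message: str) -> None:
        self.buffer.append(message)
        pruned = []
        # Evict the oldest messages until the verbatim buffer fits the budget.
        while self.buffer and sum(count_tokens(m) for m in self.buffer) > self.max_tokens:
            pruned.append(self.buffer.pop(0))
        if pruned:
            self.summary = self.summarize(self.summary, pruned)

    def context(self) -> str:
        parts = ([f"Summary: {self.summary}"] if self.summary else []) + self.buffer
        return "\n".join(parts)


if __name__ == "__main__":
    memory = SummaryBuffer(max_tokens=6, summarize=truncate_summarizer)
    memory.add("login requires a valid phone number")
    memory.add("orders expire after thirty minutes")
    print(memory.context())
```

Note that in this sketch a single message larger than the budget is evicted immediately and survives only in the summary.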

In an experimental evaluation on a small project, test case design time fell by about 50%. The reported benefits were comprehensive coverage, faster authoring, and fewer missed scenarios; the drawbacks were limited understanding of diagrams and the need for manual adjustments to the generated cases.

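# Import paths are assumptions and vary across LangChain versions.
# PDFParse, FilePath, LLMFactory and ConfigParse are the author's own project helpers, not shown here.
from langchain.chains import LLMChain
from langchain_community.vectorstores import Vearch
from langchain_core.prompts import PromptTemplate


def case_gen_by_vector(prd_file_path, tdd_file_path, input_prompt, table_name, case_name):
    """Generate test cases for one part of a very large document set.

    When the text is too large for the model's token limit, search the vector
    database for the relevant part of the content and generate test cases for
    that part only, which also makes the details more accurate.

    Args:
        prd_file_path: path to the PRD (product requirements document)
        tdd_file_path: path to the technical design document
        input_prompt: topic to search for; also spliced into the final prompt
        table_name: vector database table name; data is stored per business line,
            usually under a short, unique English identifier for that business
        case_name: name of the test case file to generate
    """
    # Parse both documents into chunks and load the blank test case template
    prd_file = PDFParse(prd_file_path).load_pymupdf_split()
    tdd_file = PDFParse(tdd_file_path).load_pymupdf_split()
    empty_case = FilePath.read_file(FilePath.empty_case)
    docs = prd_file + tdd_file
    # Embed the chunks and store them in Vearch
    embedding_model = LLMFactory.get_openai_factory().get_embedding()
    router_url = ConfigParse(FilePath.config_file_path).get_vearch_router_server()
    vearch_cluster = Vearch.from_documents(
        docs,
        embedding_model,
        path_or_url=router_url,
        db_name="y_test_qa",
        table_name=table_name,
        flag=1,
    )
    # Retrieve the single chunk most similar to the query
    docs = vearch_cluster.similarity_search(query=input_prompt, k=1)
    content = docs[0].page_content
    prompt_template = "As a software test development expert, based on the information about {input_prompt} in the product requirements and technical design: {content}, output the test cases in markdown format. The test case template is: {empty_case}"
    prompt = PromptTemplate(input_variables=["input_prompt", "content", "empty_case"], template=prompt_template)
    llm = LLMFactory.get_openai_factory().get_chat_llm()
    chain = LLMChain(llm=llm, prompt=prompt, verbose=True)
    output_raw = chain.invoke({'input_prompt': input_prompt, 'content': content, 'empty_case': empty_case})
    # Write the generated test cases to a markdown file
    file_path = FilePath.out_file + case_name + ".md"
    with open(file_path, 'w', encoding='utf-8') as file:
        file.write(output_raw.get('text'))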
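
The function above passes only the single best match (`k=1`) to the model. One possible variation, which is not from the article, is to retrieve several chunks and pack as many as fit into a fixed budget, so the prompt still stays within the model's limit. The sketch below assumes the chunks arrive already ranked by similarity and measures the budget in characters for simplicity; `build_context`, `max_chars`, and `separator` are hypothetical names.

```python
def build_context(chunks: list[str], max_chars: int, separator: str = "\n---\n") -> str:
    """Join ranked chunks in order, stopping before the budget would be exceeded."""
    selected = []
    used = 0
    for chunk in chunks:
        # A separator is only needed between chunks, not before the first one.
        cost = len(chunk) + (len(separator) if selected else 0)
        if used + cost > max_chars:
            break
        selected.append(chunk)
        used += cost
    return separator.join(selected)


if __name__ == "__main__":
    ranked = ["aaaa", "bbbb", "cccc"]
    print(build_context(ranked, max_chars=10, separator="|"))  # aaaa|bbbb
```

Stopping at the first chunk that does not fit preserves the similarity ranking; skipping it and trying smaller lower-ranked chunks would fill the budget more tightly at the cost of relevance order.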

The author identifies remaining challenges such as OCR‑based extraction of flowchart images and plans future work to apply LLMs to code diff analysis, log inspection, and knowledge‑graph‑driven automated testing.
