Boosting Test Code Quality: How Large Language Models Transform Code Review
This article explores how mature testing teams can leverage large language models for automated code review, outlining the advantages, challenges, and a practical implementation using FastGPT and GitLab CI to build a low‑cost, AI‑enhanced review system that improves efficiency and feedback quality.
Preface
Mature testing teams generate large amounts of automation code, but code quality can vary due to individual habits. Expert code review (CR) is traditionally required, and using large language models for CR is becoming a trend to improve efficiency and effectiveness.
Advantages of Large‑Model CR
Automated checks: Large models can quickly spot syntax errors, style issues, and potential security vulnerabilities, reducing manual effort.
Reduced repetitive work: Models filter out low‑level mistakes, allowing reviewers to focus on logical and design concerns.
Fast feedback: Integrated with GitLab CI, the model can run CR immediately after a commit and provide instant feedback, shortening the feedback loop.
For detailed prompt examples and code modification methods, see the referenced article “How to Use ChatGPT for Code Review in GitLab”.
In practice, a Merge Request (MR) can trigger a GitLab CI job that calls a middle‑platform to invoke the model and return feedback to GitLab.
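The middle‑platform step above can be sketched in a few lines. This is a hypothetical sketch, not the article's actual service: the host and function name are placeholders, but the endpoint path is GitLab's standard v4 API for fetching a merge request's diff.

```python
# Sketch of the middle-platform's first step: locating the MR diff.
# The GitLab host below is a placeholder for your own instance.
GITLAB_BASE = "https://gitlab.example.com"

def mr_changes_url(project_id: int, mr_iid: int, base: str = GITLAB_BASE) -> str:
    """Build the GitLab v4 REST URL that returns an MR's changes.

    The middle platform would fetch this URL (with a PRIVATE-TOKEN header),
    hand the diff to the model, and post the feedback back to GitLab.
    """
    return f"{base}/api/v4/projects/{project_id}/merge_requests/{mr_iid}/changes"
```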
Challenges of Model‑Based CR
Surface‑level analysis only: Models may miss deeper domain‑specific logic, especially in UI automation, where they cannot distinguish generic methods from business‑specific ones.
Consistency checks: Naming conventions and test‑data management vary across teams, which can produce false positives.
Noise in fix suggestions: When suggestions are not fully reliable, posting them as MR comments adds clutter, and it is unclear who should resolve them.
Enhancing CR with RAG
Improvement ideas:
Construct prompts that embed constraints and build a knowledge‑base system for standardized guidance, leveraging Retrieval‑Augmented Generation (RAG) concepts.
Deliver review results via enterprise messaging (e.g., DingTalk) instead of GitLab comments, keeping suggestions as reference while human reviewers handle the final actions.
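Delivering results over enterprise messaging rather than MR comments can be done with a group bot webhook. The sketch below, under the assumption of a DingTalk custom robot (which accepts a POST with a `markdown` message type; the webhook URL and its access token would be configured per group), only builds the JSON body:

```python
import json

def dingtalk_markdown_payload(title: str, review_text: str) -> str:
    """Build the JSON body for a DingTalk custom-robot webhook.

    The caller would POST this body to the group's webhook URL; here we
    only assemble the message so the CR output stays advisory.
    """
    body = {
        "msgtype": "markdown",
        "markdown": {"title": title, "text": review_text},
    }
    return json.dumps(body, ensure_ascii=False)

payload = dingtalk_markdown_payload("CR suggestions", "### MR !7\n- prefer explicit waits")
```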
We use FastGPT as the RAG platform (alternatives include Dify, QAnything). The following diagram illustrates the CR workflow.
Setup Steps
Initialize CR Knowledge Base
Key personalized information is stored in the knowledge base:
Testing best‑practice methods and semantics.
Custom test‑code style guidelines.
Business‑line specific requirements, such as template code.
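Before guidelines like these can be retrieved, they have to be indexed in chunks. A minimal sketch of paragraph‑aligned chunking, assuming a plain‑text guide (the 500‑character budget is illustrative; FastGPT's own importer can also do this):

```python
def split_guidelines(text: str, max_chars: int = 500) -> list[str]:
    """Split a style-guide document into paragraph-aligned chunks small
    enough to index in the CR knowledge base (chunk size is illustrative)."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        # Start a new chunk when adding this paragraph would overflow.
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```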
Create FastGPT Project
Adapt the FastGPT UI for internal use while following its application definition.
Select the previously built knowledge base for the new project.
Configure the FastGPT application to expose an API.
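Once the application exposes an API, a caller authenticates with the app's API key. FastGPT applications offer an OpenAI‑compatible chat endpoint; the host below is a placeholder, and the sketch only assembles the request rather than sending it:

```python
import json

def fastgpt_review_request(api_key: str, chat_id: str, prompt: str) -> dict:
    """Assemble a request for a FastGPT app's OpenAI-compatible chat API.

    FastGPT exposes POST /api/v1/chat/completions, authenticated with the
    application's API key; the host is a placeholder for your deployment.
    """
    return {
        "url": "https://fastgpt.example.com/api/v1/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "chatId": chat_id,  # lets FastGPT keep per-MR conversation context
            "stream": False,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }
```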
Select Repository for CR
Identify the GitLab project ID of the repository to be reviewed.
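If only the repository path is known, the numeric ID can be looked up through GitLab's projects endpoint, which accepts a URL‑encoded `namespace/project` path. A small sketch (host and example path are placeholders):

```python
from urllib.parse import quote

def project_lookup_url(path_with_namespace: str,
                       base: str = "https://gitlab.example.com") -> str:
    """Build the GitLab v4 URL that resolves a project by its path.

    GET /api/v4/projects/:url-encoded-path returns the project record,
    including the numeric `id` needed by the review job.
    """
    return f"{base}/api/v4/projects/{quote(path_with_namespace, safe='')}"
```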
Optimize Prompt
Example prompt for test code review:
<code>Assume you are a test development engineer. Your task is to read the GitLab code changes and give revision suggestions in Chinese.
Requirements for the suggestions:
a. Base every suggestion on the knowledge base; keep the wording fluent;
b. Point to the exact location and explain with an example, quoting only a few key lines rather than the full code;
c. Do not stray from the given code;
d. Do not restate the original code;
e. Do not give duplicate suggestions;
f. Keep the review suggestions under 700 characters;
Below are the GitLab code changes: (append the MR diff here)
</code>
Trigger Review
Invoke the review via a GET request:
<code>curl --location "https://fastqa.xxxxx.com/api/review/work?projectId=$CI_MERGE_REQUEST_PROJECT_ID&mrId=$CI_MERGE_REQUEST_IID"</code>
Add this command to the GitLab CI pipeline. When an MR is created, the review task runs and sends the results to the enterprise messaging bot.
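On the service side, the MR diff still has to be merged into the prompt before the model is called. A hypothetical sketch of that step (the header text, function name, and 6000‑character budget are illustrative, not the article's actual values):

```python
# Illustrative prompt header; the real one is the review prompt shown above.
PROMPT_HEADER = "...(review instructions)...\nBelow are the GitLab code changes:\n"
MAX_DIFF_CHARS = 6000  # illustrative context budget, not a FastGPT limit

def build_review_prompt(diff: str) -> str:
    """Append the MR diff to the prompt, truncating oversized diffs so the
    request stays within the model's context window."""
    if len(diff) > MAX_DIFF_CHARS:
        diff = diff[:MAX_DIFF_CHARS] + "\n...(diff truncated)"
    return PROMPT_HEADER + diff
```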
View Results
The bot (DingTalk/WeChat Work) delivers the review summary. Reviewers can then copy valuable suggestions back to GitLab and mark the result as effective for future optimization.
Practical Internal CR Experience
We extracted real automation code from our company's repositories and ran it through the system.
A significant number of useful suggestions were generated.
Some false positives occurred, stemming from the model's inherent uncertainty, but they could be mitigated by refining the knowledge base.
Conclusion
This article introduced the concept of using large language models for test code review and described a low‑cost implementation based on FastGPT and a generic model. As AI capabilities continue to improve, automated code review is expected to become increasingly reliable and valuable. Readers are invited to discuss and follow future AI practice articles.
Recommended Reading:
AI Series – Building Private vs. General Large Models for Testing Teams
FastGPT Exploration in Ticket Handling
CoolHome at MTSC2024 – Highlights
CoolHome Internationalization and Multi‑Language Practices
Frontend Design Tool Performance Investigation
Using Tampermonkey to Boost Test Efficiency
Effective Test Retrospectives for Large Projects