Boosting Test Code Quality: How Large Language Models Transform Code Review
This article explores how mature testing teams can leverage large language models for automated code review, outlining the advantages, challenges, and a practical implementation using FastGPT and GitLab CI to build a low‑cost, AI‑enhanced review system that improves efficiency and feedback quality.
Preface
Mature testing teams generate large amounts of automation code, but code quality can vary due to individual habits. Expert code review (CR) is traditionally required, and using large language models for CR is becoming a trend to improve efficiency and effectiveness.
Advantages of Large‑Model CR
Automated checks: Large models can quickly spot syntax errors, style issues, and potential security vulnerabilities, reducing manual effort.
Reduced repetitive work: Models filter out low‑level mistakes, allowing reviewers to focus on logical and design concerns.
Fast feedback: Integrated with GitLab CI, the model can run CR immediately after a commit and provide instant feedback, shortening the feedback loop.
For detailed prompt examples and code modification methods, see the referenced article “How to Use ChatGPT for Code Review in GitLab”.
In practice, a Merge Request (MR) can trigger a GitLab CI job that calls a middle‑platform to invoke the model and return feedback to GitLab.
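The middle‑platform step above can be sketched in a few lines. This is a hypothetical sketch, not the article's actual service: the host and function name are placeholders, but the endpoint path is GitLab's standard v4 API for fetching a merge request's diff.

```python
# Sketch of the middle-platform's first step: locating the MR diff.
# The GitLab host below is a placeholder for your own instance.
GITLAB_BASE = "https://gitlab.example.com"

def mr_changes_url(project_id: int, mr_iid: int, base: str = GITLAB_BASE) -> str:
    """Build the GitLab v4 REST URL that returns an MR's changes.

    The middle platform would fetch this URL (with a PRIVATE-TOKEN header),
    hand the diff to the model, and post the feedback back to GitLab.
    """
    return f"{base}/api/v4/projects/{project_id}/merge_requests/{mr_iid}/changes"
```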
Challenges of Model‑Based CR
Surface‑level analysis only: Models may miss deeper domain‑specific logic, especially in UI automation, where they cannot distinguish generic methods from business‑specific ones.
Consistency checks: Naming conventions and test‑data management vary across teams, which can produce false positives.
Noise in fix suggestions: When suggestions are not fully reliable, posting them as MR comments adds clutter, and it is unclear who should resolve them.
Enhancing CR with RAG
Improvement ideas:
Construct prompts that embed constraints and build a knowledge‑base system for standardized guidance, leveraging Retrieval‑Augmented Generation (RAG) concepts.
Deliver review results via enterprise messaging (e.g., DingTalk) instead of GitLab comments, keeping suggestions as reference while human reviewers handle the final actions.
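Delivering results over enterprise messaging rather than MR comments can be done with a group bot webhook. The sketch below, under the assumption of a DingTalk custom robot (which accepts a POST with a `markdown` message type; the webhook URL and its access token would be configured per group), only builds the JSON body:

```python
import json

def dingtalk_markdown_payload(title: str, review_text: str) -> str:
    """Build the JSON body for a DingTalk custom-robot webhook.

    The caller would POST this body to the group's webhook URL; here we
    only assemble the message so the CR output stays advisory.
    """
    body = {
        "msgtype": "markdown",
        "markdown": {"title": title, "text": review_text},
    }
    return json.dumps(body, ensure_ascii=False)

payload = dingtalk_markdown_payload("CR suggestions", "### MR !7\n- prefer explicit waits")
```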
We use FastGPT as the RAG platform (alternatives include Dify, QAnything). The following diagram illustrates the CR workflow.
Setup Steps
Initialize CR Knowledge Base
Key personalized information is stored in the knowledge base:
Testing best‑practice methods and semantics.
Custom test‑code style guidelines.
Business‑line specific requirements, such as template code.
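Before guidelines like these can be retrieved, they have to be indexed in chunks. A minimal sketch of paragraph‑aligned chunking, assuming a plain‑text guide (the 500‑character budget is illustrative; FastGPT's own importer can also do this):

```python
def split_guidelines(text: str, max_chars: int = 500) -> list[str]:
    """Split a style-guide document into paragraph-aligned chunks small
    enough to index in the CR knowledge base (chunk size is illustrative)."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        # Start a new chunk when adding this paragraph would overflow.
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```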
Create FastGPT Project
Adapt the FastGPT UI for internal use while following its application definition.
Select the previously built knowledge base for the new project.
Configure the FastGPT application to expose an API.
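Once the application exposes an API, a caller authenticates with the app's API key. FastGPT applications offer an OpenAI‑compatible chat endpoint; the host below is a placeholder, and the sketch only assembles the request rather than sending it:

```python
import json

def fastgpt_review_request(api_key: str, chat_id: str, prompt: str) -> dict:
    """Assemble a request for a FastGPT app's OpenAI-compatible chat API.

    FastGPT exposes POST /api/v1/chat/completions, authenticated with the
    application's API key; the host is a placeholder for your deployment.
    """
    return {
        "url": "https://fastgpt.example.com/api/v1/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "chatId": chat_id,  # lets FastGPT keep per-MR conversation context
            "stream": False,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }
```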
Select Repository for CR
Identify the GitLab project ID of the repository to be reviewed.
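If only the repository path is known, the numeric ID can be looked up through GitLab's projects endpoint, which accepts a URL‑encoded `namespace/project` path. A small sketch (host and example path are placeholders):

```python
from urllib.parse import quote

def project_lookup_url(path_with_namespace: str,
                       base: str = "https://gitlab.example.com") -> str:
    """Build the GitLab v4 URL that resolves a project by its path.

    GET /api/v4/projects/:url-encoded-path returns the project record,
    including the numeric `id` needed by the review job.
    """
    return f"{base}/api/v4/projects/{quote(path_with_namespace, safe='')}"
```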
Optimize Prompt
Example prompt for test code review:
<code>Assume you are a test development engineer. Your task is to read the GitLab code changes and give revision suggestions in Chinese.
Requirements for the suggestions:
a. Base every suggestion on the knowledge base; keep the wording fluent;
b. Point to the exact location and explain with an example, quoting only a few key lines rather than the full code;
c. Do not stray from the given code;
d. Do not restate the original code;
e. Do not give duplicate suggestions;
f. Keep the review suggestions under 700 characters;
Below are the GitLab code changes: (append the MR diff here)
</code>
Trigger Review
Invoke the review via a GET request:
<code>curl --location "https://fastqa.xxxxx.com/api/review/work?projectId=$CI_MERGE_REQUEST_PROJECT_ID&mrId=$CI_MERGE_REQUEST_IID"</code>
Add this command to the GitLab CI pipeline. When an MR is created, the review task runs and sends the results to the enterprise messaging bot.
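On the service side, the MR diff still has to be merged into the prompt before the model is called. A hypothetical sketch of that step (the header text, function name, and 6000‑character budget are illustrative, not the article's actual values):

```python
# Illustrative prompt header; the real one is the review prompt shown above.
PROMPT_HEADER = "...(review instructions)...\nBelow are the GitLab code changes:\n"
MAX_DIFF_CHARS = 6000  # illustrative context budget, not a FastGPT limit

def build_review_prompt(diff: str) -> str:
    """Append the MR diff to the prompt, truncating oversized diffs so the
    request stays within the model's context window."""
    if len(diff) > MAX_DIFF_CHARS:
        diff = diff[:MAX_DIFF_CHARS] + "\n...(diff truncated)"
    return PROMPT_HEADER + diff
```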
View Results
The bot (DingTalk/WeChat Work) delivers the review summary. Reviewers can then copy valuable suggestions back to GitLab and mark the result as effective for future optimization.
Practical Internal CR Experience
We extracted real automation code from our company's repositories and ran it through the system.
A significant number of useful suggestions were generated.
Some false positives occurred, stemming from the model's inherent uncertainty, but they could be mitigated by refining the knowledge base.
Conclusion
This article introduced the concept of using large language models for test code review and described a low‑cost implementation based on FastGPT and a generic model. As AI capabilities continue to improve, automated code review is expected to become increasingly reliable and valuable. Readers are invited to discuss and follow future AI practice articles.
Recommended Reading:
AI Series – Building Private vs. General Large Models for Testing Teams
FastGPT Exploration in Ticket Handling
CoolHome at MTSC2024 – Highlights
CoolHome Internationalization and Multi‑Language Practices
Frontend Design Tool Performance Investigation
Using Tampermonkey to Boost Test Efficiency
Effective Test Retrospectives for Large Projects