
Boosting Frontend Code Review Quality: Lessons from Alibaba’s DEF Platform

This article examines the quality challenges of Alibaba's DEF frontend code review system, identifies issues such as short review times, low comment rates, and oversized changes, and proposes practical solutions including staged reviews, better notifications, comment defect tagging, reviewer recommendation, and static analysis integration.

Taobao Frontend Technology

Aug 24, 2021

DEF (Alibaba's frontend R&D platform) has built a code review (CR) system on top of Aone and KAITIAN, aiming to improve both the frontend code review experience and the way review quality is evaluated. The "CR Quality Thinking" series has three parts:

CR Quality Thinking: Metrics and Current Situation

CR Quality Thinking: Solutions

CR Quality Thinking: Phased Review Practice

This article analyzes the quality issues identified in the first part and proposes solutions based on industry practices.

Background

The previous analysis shows DEF's overall CR quality has room for improvement, mainly:

Review time is too short, leading to many "quick-pass" reviews.

Comment functionality and workflow provide limited help, resulting in low comment rates.

Large CRs (>400 lines) and extreme CRs (>2000 lines) are common, lowering review quality.
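The size thresholds above can be written down as a small classifier. This is only a sketch: the names `CrSize` and `classifyCrSize` are hypothetical, and it assumes size is counted as added plus deleted lines, which may not match how DEF measures it.

```typescript
// Hypothetical size buckets based on the thresholds quoted above.
type CrSize = "normal" | "large" | "extreme";

const LARGE_THRESHOLD = 400; // a "large" CR has more than 400 changed lines
const EXTREME_THRESHOLD = 2000; // an "extreme" CR has more than 2000 changed lines

// Assumption: size = added + deleted lines; the platform's real metric may differ.
function classifyCrSize(added: number, deleted: number): CrSize {
  const changed = added + deleted;
  if (changed > EXTREME_THRESHOLD) return "extreme";
  if (changed > LARGE_THRESHOLD) return "large";
  return "normal";
}
```

A CR of exactly 400 changed lines still counts as normal here, because both thresholds are treated as strict lower bounds.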

About CR Review Time

When to Initiate Review

According to DEF's current workflow, we can infer why review time is short:

DEF's process revolves around development and release. Developers iterate in daily build and verification environments until ready for production, then submit a CR before release. Two issues arise:

CR is raised only before release, limiting available review time.

If reviewers find issues, developers must repeat the build‑verify cycle, increasing cost.

A better process runs review in parallel with daily build and verification, allowing staged CRs during development. This is the idea of Phased Code Review.

Review Timeliness

Consider a typical scenario: a developer finishes work on Monday, submits a CR, and plans to release on Thursday afternoon. Reviewers receive the email and DingTalk reminders, postpone the review, and then forget about it. On Thursday the developer discovers the CR is still pending and has to chase the reviewer, who rushes through it, producing a "quick-pass" review.

We can borrow the group approval flow (BPMS) to reduce the need for developers to chase reviewers:

Provide a personal task center (CR list) for daily processing of assigned tasks.

The platform manages deadline reminders (timed and countdown notifications) and provides in-app messaging, so developers and reviewers do not need to coordinate offline.

To make sure notifications reach reviewers, every CR should at least send email alerts, with DingTalk notifications as an option. For fast-moving projects, a DingTalk group bot can push all CR activity to the group.

With this mechanism in place, reviewers should have enough time to review properly, which reduces rushed approvals.
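To make the timed and countdown notifications concrete, here is a minimal sketch of how a reminder schedule could be derived from a CR's submission time and planned release time. Everything in it is an assumption for illustration: the type names, the one-reminder-per-day interval, and the 24-hour and 2-hour countdown offsets are not taken from DEF.

```typescript
interface PendingReview {
  crId: string;
  submittedAt: number; // epoch milliseconds
  deadline: number; // planned release time, epoch milliseconds
}

interface Reminder {
  crId: string;
  at: number; // when to send, epoch milliseconds
  kind: "timed" | "countdown";
}

const HOUR = 60 * 60 * 1000;
const TIMED_INTERVAL = 24 * HOUR; // assumed: one nudge per day
const COUNTDOWN_OFFSETS = [24 * HOUR, 2 * HOUR]; // assumed: 24h and 2h before the deadline

function buildReminders(review: PendingReview): Reminder[] {
  const reminders: Reminder[] = [];
  // Timed reminders: one per interval after submission, strictly before the deadline.
  for (let at = review.submittedAt + TIMED_INTERVAL; at < review.deadline; at += TIMED_INTERVAL) {
    reminders.push({ crId: review.crId, at, kind: "timed" });
  }
  // Countdown reminders: fixed offsets before the deadline, skipped if they predate submission.
  for (const offset of COUNTDOWN_OFFSETS) {
    const at = review.deadline - offset;
    if (at > review.submittedAt) {
      reminders.push({ crId: review.crId, at, kind: "countdown" });
    }
  }
  // Return the schedule in chronological order.
  return reminders.sort((a, b) => a.at - b.at);
}
```

For a CR submitted 76 hours before its release, this yields timed reminders at 24, 48, and 72 hours after submission and countdown reminders at 52 and 74 hours, so the reviewer is nudged several times before the developer has to step in.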

Collapsing Unchanged Regions

An IDE-style diff view shows the old and new files in full, so reviewers spend time scrolling past code that has not changed. The CR tool should automatically collapse unchanged regions so that attention stays on the changes.
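One way to collapse unchanged regions is to keep a few lines of context around each change and fold everything else into a placeholder. The sketch below assumes the diff is already available as a flat list of lines tagged `same`, `add`, or `del`; the types and the default of three context lines are illustrative choices, not DEF's implementation.

```typescript
type DiffLine = { kind: "same" | "add" | "del"; text: string };

type ViewItem =
  | { type: "line"; line: DiffLine }
  | { type: "collapsed"; hiddenCount: number };

function collapseUnchanged(lines: DiffLine[], context = 3): ViewItem[] {
  // Mark every line within `context` lines of a change as visible.
  const visible: boolean[] = lines.map(() => false);
  for (let i = 0; i < lines.length; i++) {
    if (lines[i].kind === "same") continue;
    const from = Math.max(0, i - context);
    const to = Math.min(lines.length - 1, i + context);
    for (let j = from; j <= to; j++) visible[j] = true;
  }

  // Emit visible lines as-is and fold each hidden run into one placeholder.
  const view: ViewItem[] = [];
  let hidden = 0;
  for (let i = 0; i < lines.length; i++) {
    if (visible[i]) {
      if (hidden > 0) {
        view.push({ type: "collapsed", hiddenCount: hidden });
        hidden = 0;
      }
      view.push({ type: "line", line: lines[i] });
    } else {
      hidden++;
    }
  }
  if (hidden > 0) view.push({ type: "collapsed", hiddenCount: hidden });
  return view;
}
```

The UI can render each `collapsed` item as an expandable row, so the full file remains one click away when a reviewer needs more context.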

About CR Feedback

Improving Comment Importance

DEF's current comment feature is basic: line-level comments with a simple threaded view.

Comments should be a core part of the review loop. Reviewers can mark a comment as a defect, and developers resolve it by pushing new code or replying. The review passes only when all defects are resolved, which gives comments real weight in the process.

"Substantially, they found an agreement in terms of a quick, lightweight process (CP1, CP2, CP3) with few people involved (CP4) who conduct group problem solving (CP5)." [1]
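The defect checkpoint described in this section amounts to a simple gate on the comment list. The following is a sketch under assumed names (`ReviewComment`, `unresolvedDefects`, `canPassReview`); how DEF actually stores comment state is not specified here.

```typescript
interface ReviewComment {
  id: string;
  isDefect: boolean; // the reviewer marked this comment as a defect
  resolved: boolean; // the developer pushed a fix or replied, and the thread was closed
}

// Defects that still block the review.
function unresolvedDefects(comments: ReviewComment[]): ReviewComment[] {
  return comments.filter((c) => c.isDefect && !c.resolved);
}

// The review may pass only when no defect is left open.
// Ordinary (non-defect) comments never block it.
function canPassReview(comments: ReviewComment[]): boolean {
  return unresolvedDefects(comments).length === 0;
}
```

Returning the blocking comments separately lets the UI tell the developer exactly which threads still need attention rather than only refusing the merge.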

Diverse Feedback Methods

Research shows tone and wording affect comment usefulness. Supporting emoji reactions and a global "LGTM" comment, as in GitHub, can improve reviewer experience.

"Finally, although majority (51%) of the comments had ‘Extremely Negative’ tones, only 57% of those comments were useful. On the other hand, comments with ‘Neutral’ or ‘Somewhat Negative’ tones were more likely to be ‘Useful’ (≈ 79% were considered useful)."

About CR Code Size

The median change size is healthy (~40 lines), but ~20% of CRs exceed 400 lines and ~10% exceed 2000 lines, which harms review quality. The main causes are:

Not using DEF's multi‑change integration, leading to large single CRs.

Complex features that inherently require many lines.

DEF is promoting a multi‑change integration mode to split large iterations into smaller changes.

Two common industry solutions:

Developers follow commit‑message conventions and split work into separate commits; reviewers examine at commit granularity.

At final review, use machine‑learning‑based intelligent split to divide changes into separate CRs.

Commit-level diff requires a high-performance backend; Aonecode and Antcode currently provide push-level diff only.

Machine-learning-based splitting has seen limited production use, although research papers on it exist. For a vertical domain such as frontend projects, ML may still work well.

Besides ML, there is an engineering approach: use push nodes in place of commit nodes, so that a staged review can take place after each push.
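As a rough illustration of the push-node idea, each push can open a review stage that covers only the diff since the previous push. The data model below (`Push`, `ReviewStage`, `stagesFromPushes`) is hypothetical and is not the design used by DEF.

```typescript
interface Push {
  id: string;
  headSha: string; // commit at the tip of the branch after this push
}

interface ReviewStage {
  stage: number;
  baseSha: string; // where the previous stage ended
  headSha: string; // where this stage ends
}

// Turn an ordered list of pushes into review stages. Each stage covers the
// diff between the previous push's head and this push's head.
function stagesFromPushes(branchBaseSha: string, pushes: Push[]): ReviewStage[] {
  const stages: ReviewStage[] = [];
  let base = branchBaseSha;
  pushes.forEach((push, index) => {
    stages.push({ stage: index + 1, baseSha: base, headSha: push.headSha });
    base = push.headSha;
  });
  return stages;
}
```

Because every stage starts where the previous one ended, a reviewer who approved stage 1 only needs to read the new diff in stage 2, which keeps each review small.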

We will detail Phased Code Review in the next article.

Further Optimization Opportunities

Reviewer Recommendation

Studies show that reviewers with relevant background give more useful suggestions. Machine-learning-based reviewer recommendation can suggest suitable reviewers automatically.

"This study showed that experience with the code base is an important factor to increase the density of useful comments in code reviews. Therefore, we suggest that reviewers should be selected carefully. Automatic review suggestion systems can help to identify the right set of developers that should be involved in a review."
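A trained model is not required to try this idea out. The sketch below swaps in a plain frequency heuristic instead of machine learning: candidates are ranked by how often they previously changed the files touched by the CR. The `PastChange` shape and the function name are assumptions made for illustration.

```typescript
interface PastChange {
  author: string;
  files: string[]; // files modified by this earlier change
}

// Rank candidate reviewers by their experience with the files in the CR.
// This is a simple heuristic stand-in, not an ML recommender.
function recommendReviewers(
  changedFiles: string[],
  history: PastChange[],
  crAuthor: string,
  limit = 2
): string[] {
  const scores: Record<string, number> = {};
  for (const change of history) {
    if (change.author === crAuthor) continue; // authors do not review their own CR
    const overlap = change.files.filter((f) => changedFiles.indexOf(f) !== -1).length;
    if (overlap > 0) {
      scores[change.author] = (scores[change.author] || 0) + overlap;
    }
  }
  // Highest score first; ties are broken alphabetically for stable output.
  return Object.keys(scores)
    .sort((a, b) => scores[b] - scores[a] || a.localeCompare(b))
    .slice(0, limit);
}
```

A production recommender would likely also weigh recency, reviewer workload, and directory-level ownership, but even this simple ranking reflects the finding that familiarity with the code base matters.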

Static Scanning Mechanism

Integrating static analysis is a common request. Scan results appear as comments on the affected code lines, and reviewers can mark each one as "Please fix" or "Not useful".

DEF is actively building CI capabilities that will support this.
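The flow from scan result to triaged comment could look roughly like this. The `Finding` and `BotComment` shapes are hypothetical and are not tied to any particular linter's output format or to DEF's data model.

```typescript
interface Finding {
  file: string;
  line: number;
  rule: string;
  message: string;
}

type Triage = "pending" | "please-fix" | "not-useful";

interface BotComment {
  file: string;
  line: number;
  body: string;
  source: "static-analysis";
  triage: Triage;
}

// Turn each scan finding into a line-level comment awaiting reviewer triage.
function findingsToComments(findings: Finding[]): BotComment[] {
  return findings.map((f): BotComment => ({
    file: f.file,
    line: f.line,
    body: `[${f.rule}] ${f.message}`,
    source: "static-analysis",
    triage: "pending",
  }));
}

// Record the reviewer's verdict without mutating the original comment.
function triageComment(comment: BotComment, verdict: "please-fix" | "not-useful"): BotComment {
  return { ...comment, triage: verdict };
}
```

Keeping the "Not useful" verdicts is worthwhile in its own right, since rules that are frequently dismissed are candidates for tuning or removal.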

Conclusion

Based on the quality issues identified, we propose the following solutions:

CR task center and reminder mechanism

Defect‑tagged comments and system checkpoints

Multiple notification channels for comments

Emoji support and global comments

Phased code review

Additional improvements include reviewer recommendation and static analysis integration. The first version of Phased Code Review is live and collecting feedback; further details will follow.
