How O2 CodeReview Finds the Correct Merge Base for Accurate Diff

This article explains how the O2 CodeReview tool, built on a pure‑frontend IDE, determines the appropriate merge‑base using Git data, handles multiple branch scenarios, and implements a version‑skip algorithm to reduce review workload while ensuring correct diff information.

Alibaba Terminal Technology

Background

DEF implemented O2 CodeReview in FY22 S1 using the pure‑frontend KAITIAN version. Because the IDE shows code tied to a specific commit, the diff is performed in the IDE rather than using stored diff data, which adds flexibility but requires custom handling of version information.

Three‑Way Merge Basics

Before describing the algorithm, the article reviews Git's three‑way merge strategy, explaining how the common ancestor (mergeBase) is identified to decide whether a change should be applied.
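As a rough illustration of that decision rule, the sketch below merges a single value (for example, one line or one file's content) given its state in the common ancestor and on both sides. The names `merge_value` and `CONFLICT` are hypothetical and not part of O2 CodeReview or Git; this is a simplified model, not Git's actual merge implementation.

```python
# Minimal three-way merge rule for one unit of content.
# Assumption: content is compared as opaque values; real Git merges work on hunks.

CONFLICT = object()  # hypothetical sentinel marking an unresolved conflict


def merge_value(base, ours, theirs):
    """Decide the merged value from the common ancestor and both sides."""
    if ours == theirs:
        # Both sides agree (either unchanged, or changed identically).
        return ours
    if ours == base:
        # Only "theirs" changed relative to the ancestor, so take their change.
        return theirs
    if theirs == base:
        # Only "ours" changed relative to the ancestor, so keep our change.
        return ours
    # Both sides changed the same unit differently.
    return CONFLICT
```

Without the common ancestor, the first three cases cannot be told apart from a conflict, which is why the correct mergeBase matters for an accurate diff.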

Obtaining mergeBase

Traditional code‑review tools compute diffs with git diff --merge-base and store the result, while IDE‑based tools can call git merge-base directly. However, in a pure‑frontend scenario the mergeBase must be provided by backend services, which adds implementation cost.

Alibaba's platform (Aone) supplies a list of commits, each with its parent_ids. By constructing a commit chain from this list, the parent_id of the earliest commit can be used as the mergeBase when there is a single common ancestor.

[
  {
    "author_email": "[email protected]",
    "author_name": "灰灰",
    "committer_email": "[email protected]",
    "committer_name": "灰灰",
    "created_at": "2020-09-17T18:13:52+08:00",
    "id": "4810d0faf6602dac68e447235f7a0e1da31d721e",
    "message": "Permission request\n",
    "parent_ids": ["05cbd07eae346f6d246b5430b268d6963c8e4c25"],
    "short_id": "4810d0fa",
    "title": "Permission request"
  },
  {
    "author_email": "[email protected]",
    "author_name": "灰灰",
    "committer_email": "[email protected]",
    "committer_name": "灰灰",
    "created_at": "2020-09-21T16:33:32+08:00",
    "id": "c33cbf35cea4516659fd40364a1736cc5b4acd09",
    "message": "Add log viewing\n",
    "parent_ids": ["4810d0faf6602dac68e447235f7a0e1da31d721e"],
    "short_id": "c33cbf35",
    "title": "Add log viewing"
  }
]
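For the single-ancestor case, the lookup could be sketched as follows. It relies only on the `id` and `parent_ids` fields shown in the sample; the function name `merge_base_from_chain` is hypothetical, and treating "the one parent that lies outside the list" as the mergeBase is an assumption drawn from the description above, not the tool's confirmed implementation.

```python
# Sketch: derive the mergeBase from a commit list when there is one common ancestor.
# Assumption: the list contains every commit of the branch under review.

def merge_base_from_chain(commits):
    """Return the single parent id that is not itself part of the commit list."""
    ids = {c["id"] for c in commits}
    external = []
    for commit in commits:
        for parent in commit["parent_ids"]:
            # A parent outside the list belongs to the target branch's history.
            if parent not in ids and parent not in external:
                external.append(parent)
    if len(external) != 1:
        raise ValueError(f"expected one external parent, found {len(external)}")
    return external[0]


sample = [
    {"id": "4810d0faf6602dac68e447235f7a0e1da31d721e",
     "parent_ids": ["05cbd07eae346f6d246b5430b268d6963c8e4c25"]},
    {"id": "c33cbf35cea4516659fd40364a1736cc5b4acd09",
     "parent_ids": ["4810d0faf6602dac68e447235f7a0e1da31d721e"]},
]
base = merge_base_from_chain(sample)
```

Raising an error when more than one external parent is found keeps the simple case separate from the multi-ancestor case.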

When multiple common ancestors exist, the algorithm selects the one with the shortest path, i.e., the first ancestor encountered during a backward search.
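A backward breadth-first search is one way to express "the first ancestor encountered". In the sketch below, `parents` is a hypothetical map from commit id to its parent ids, and the commit names are invented for illustration. Note that this shortest-path rule is a simplification and is not guaranteed to match the result of `git merge-base` in every history.

```python
from collections import deque

# Sketch: pick the common ancestor closest to the source head.
# Assumption: `parents` describes the full commit graph for both branches.

def ancestors(parents, head):
    """Collect head and everything reachable from it through parent links."""
    seen, stack = set(), [head]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, []))
    return seen


def nearest_common_ancestor(parents, source_head, target_head):
    """Search backward from source_head; return the first commit also on target."""
    on_target = ancestors(parents, target_head)
    queue, seen = deque([source_head]), {source_head}
    while queue:
        commit = queue.popleft()
        if commit in on_target:
            return commit
        for parent in parents.get(commit, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return None


# Target: A-B-C-T. Source forks at B (commit D), later merges C (commit M), then S.
graph = {"A": [], "B": ["A"], "C": ["B"], "T": ["C"],
         "D": ["B"], "M": ["D", "C"], "S": ["M"]}
nearest = nearest_common_ancestor(graph, "S", "T")
```

In this graph both B and C are common ancestors of S and T; the search reaches C first because it is two steps from S, while B is three.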

Special Cases

If a merge node’s two parent IDs both belong to the current branch, it cannot serve as a mergeBase. In non‑fast‑forward merges, the merge node introduces an extra common ancestor, and the algorithm must still pick the correct one based on path length.

Squash merges can mislead the algorithm because the commit time of the apparent ancestor may be far earlier than the actual branch point, causing an incorrect mergeBase.

Version‑Skip Algorithm

The goal is to avoid showing code that has already been reviewed while preventing changes introduced by a shifted base from re-appearing as new. The algorithm computes two diffs, from base to head (base~head) and from base to the last reviewed revision (base~revision), and takes their intersection. It then substitutes the base~revision changes for the corresponding base content on the comparison side and leaves the head version unchanged.

Implementation Steps

Identify the latest reviewed revision.

Compute base~head and base~revision diffs.

Take the intersection of the two diff sets to filter out base‑driven changes.

Apply the filtered changes as a patch on top of the new base.

This approach eliminates already‑reviewed code and prevents new base changes from increasing review load.
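The steps above can be sketched at file granularity as follows. This is one interpretation of the description, not the tool's actual code: snapshots are modeled as hypothetical path-to-content maps, the function names are invented, and a real implementation would likely work on hunks rather than whole files.

```python
from typing import Dict, Set

Snapshot = Dict[str, str]  # hypothetical model: file path -> file content


def changed_paths(old: Snapshot, new: Snapshot) -> Set[str]:
    """Paths whose content differs (or exists on one side only)."""
    return {p for p in old.keys() | new.keys() if old.get(p) != new.get(p)}


def virtual_base(base: Snapshot, revision: Snapshot, head: Snapshot) -> Snapshot:
    """New base with already-reviewed changes from `revision` applied on top."""
    # Intersection of base~head and base~revision: changes the branch itself
    # made and that were present in the reviewed revision.
    reviewed = changed_paths(base, head) & changed_paths(base, revision)
    result = dict(base)
    for path in reviewed:
        if path in revision:
            result[path] = revision[path]
        else:
            result.pop(path, None)  # file was deleted in the reviewed revision
    return result


def incremental_diff(base: Snapshot, revision: Snapshot, head: Snapshot) -> Set[str]:
    """Paths that still need review: virtual base compared with unchanged head."""
    return changed_paths(virtual_base(base, revision, head), head)


# The base moved forward (c.ts changed upstream) after the last review.
new_base = {"a.ts": "1", "b.ts": "1", "c.ts": "9"}
reviewed_rev = {"a.ts": "2", "b.ts": "1", "c.ts": "1"}
head = {"a.ts": "2", "b.ts": "3", "c.ts": "9"}
todo = incremental_diff(new_base, reviewed_rev, head)
```

In this example only `b.ts` remains to be reviewed: `a.ts` was already reviewed, and `c.ts` differs from the reviewed revision only because the base shifted.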

CR Stage Integration

Previously, DEF’s code review acted as a release gate, causing developers to submit large CRs at the last minute. By embedding the version‑skip algorithm into the daily development flow, CR can be staged, reducing the amount of code per review and improving quality.

Embedding CR into the Development Process

Future work will migrate the CR capability from the KAITIAN plugin to the native IDE plugins used in DEF’s WebIDE and local IDEs, leveraging OS features for better code navigation and integrating with the release system to surface change information in real time.

Conclusion

While the current mergeBase analysis works for most cases, it will eventually be replaced by direct git merge-base results from the backend. Understanding Git commit chains remains essential for advanced version‑skip features, and the authors invite feedback on uncovered scenarios.
