Practicing an AST‑Driven MCP Code Context Service for AI Code Review

This article describes how an AST‑based code‑context service, exposed through the Model Context Protocol (MCP) and built with Spring AI and Eclipse JDT, supplies structured Java code information to large models. It addresses the context gaps of diff‑only AI code review and shows, through two concrete cases, how the added context improves review accuracy.

Sohu Tech Products

In everyday development, AI‑assisted code review often relies only on Git diffs. A diff omits class structure, method dependencies, and cross‑file call relationships, so the model frequently judges a change without the information it needs. To remedy this, the authors built an AST‑driven service, exposed to the model through MCP, that provides complete, traceable, structured context.

System Architecture

The backend is divided into three loosely coupled layers: the code‑review main service (entry point, request validation, model invocation, result aggregation), a model‑capability extension layer that implements the Model Context Protocol (MCP) to expose unified interfaces, and a tool layer that encapsulates low‑level capabilities such as AST parsing and Git operations.

Core Technology Choices

MCP is an open‑source protocol introduced by Anthropic for standardising model‑tool interactions. The MCP service here is implemented with Spring AI to minimise integration effort and to benefit from Spring’s auto‑configuration, dependency injection, observability and security features. AST parsing is performed with Eclipse JDT, which supports multiple Java language versions and provides a rich API for extracting classes, methods, variables and method calls.

MCP Service Workflow

When a review request arrives, the service creates a task ID, fetches the corresponding source version, and invokes the AST module to build a code‑reference index. A compilation‑management module generates and caches compiled artefacts for static languages, while a source‑management module organises versions by Git commit. The MCP tool then aggregates these capabilities, trims the data, and returns a JSON payload that the model can consume.
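The per‑commit caching idea in this workflow can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors’ implementation: the class name `CommitIndexCache`, the `indexFor` method, and the use of a plain `String` as a stand‑in for the real index are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class CommitIndexCache {
    // Cached index per Git commit ID; the String value stands in for a real index object.
    private final Map<String, String> indexByCommit = new ConcurrentHashMap<>();
    // Hypothetical builder that would fetch the commit and run AST parsing.
    private final Function<String, String> indexBuilder;

    CommitIndexCache(Function<String, String> indexBuilder) {
        this.indexBuilder = indexBuilder;
    }

    /** Returns the cached index for a commit, building it on the first request only. */
    String indexFor(String commitId) {
        return indexByCommit.computeIfAbsent(commitId, indexBuilder);
    }

    public static void main(String[] args) {
        int[] builds = {0};
        CommitIndexCache cache = new CommitIndexCache(commit -> {
            builds[0]++; // count how often the expensive build actually runs
            return "index@" + commit;
        });
        cache.indexFor("a1b2c3");
        cache.indexFor("a1b2c3"); // a second task on the same commit reuses the index
        System.out.println("builds = " + builds[0]);
    }
}
```

Keying the cache by commit ID means several review tasks against the same source version share one parse, which is presumably the point of organising sources by Git commit.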

Code‑Context Design

For global variables, the service provides the variable’s metadata, its defining source, and every reference to it across the codebase. For methods, it supplies the metadata, the full source, the calls extracted from each line of the body, and the method’s call sites across the codebase. The design deliberately limits expansion to one level up and one level down from the diff‑affected element, balancing information gain against token cost.
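The one‑level limit can be illustrated with a small sketch. It assumes a call graph already extracted from the AST and represented as a map from caller to callees; the class `OneHopContext`, its method names, and the element names in `main` are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class OneHopContext {

    /** Elements directly called by the given element (one level down). */
    static Set<String> callees(Map<String, List<String>> callGraph, String element) {
        return new LinkedHashSet<>(callGraph.getOrDefault(element, List.of()));
    }

    /** Elements that directly call the given element (one level up). */
    static Set<String> callers(Map<String, List<String>> callGraph, String element) {
        Set<String> result = new LinkedHashSet<>();
        for (Map.Entry<String, List<String>> entry : callGraph.entrySet()) {
            if (entry.getValue().contains(element)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    /** The changed element plus its direct callers and callees, and nothing further. */
    static Set<String> context(Map<String, List<String>> callGraph, String changed) {
        Set<String> result = new LinkedHashSet<>();
        result.add(changed);
        result.addAll(callers(callGraph, changed));
        result.addAll(callees(callGraph, changed));
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = Map.of(
                "Controller.handle", List.of("Service.save"),
                "Service.save", List.of("Repo.write"),
                "Repo.write", List.of("Driver.flush"));
        // Driver.flush is two hops below Service.save, so it is left out.
        System.out.println(context(graph, "Service.save"));
    }
}
```

Stopping at one hop keeps the payload proportional to the element’s direct neighbourhood rather than to the size of the transitive call graph.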

Interface Specification

The MCP interface takes a request object, the public class McpContextRequest, with three fields: taskId (the global task identifier), analysisFilePath (the target file), and diffLineNumberSequence (a list of start‑end line ranges covering the diff). The tool’s prompt description is organised into five sections (overview, input parameters, output description, usage scenarios, and precautions) to guide the model in invoking the tool correctly. The authors first tried separate fine‑grained interfaces, such as getting method information by line or fetching referenced elements, but the model’s use of them proved unstable, which led to a single unified context interface.
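A possible shape for the request object is sketched below. Only the class name and the three field names come from the article; the field types, the `LineRange` helper, the 1‑based inclusive line numbering, and the validation rules are assumptions made for illustration.

```java
import java.util.List;

public class McpContextRequestSketch {

    /** Inclusive range of changed lines; 1-based numbering is an assumption. */
    static final class LineRange {
        final int start;
        final int end;

        LineRange(int start, int end) {
            if (start < 1 || end < start) {
                throw new IllegalArgumentException("invalid range " + start + "-" + end);
            }
            this.start = start;
            this.end = end;
        }
    }

    /** Request with the three fields named in the article; the types are guesses. */
    static final class McpContextRequest {
        final String taskId;                          // global identifier of the review task
        final String analysisFilePath;                // target file to analyse
        final List<LineRange> diffLineNumberSequence; // changed line ranges in that file

        McpContextRequest(String taskId, String analysisFilePath, List<LineRange> ranges) {
            if (taskId == null || taskId.isEmpty()) {
                throw new IllegalArgumentException("taskId is required");
            }
            if (analysisFilePath == null || analysisFilePath.isEmpty()) {
                throw new IllegalArgumentException("analysisFilePath is required");
            }
            this.taskId = taskId;
            this.analysisFilePath = analysisFilePath;
            this.diffLineNumberSequence = List.copyOf(ranges); // defensive, immutable copy
        }
    }

    public static void main(String[] args) {
        McpContextRequest request = new McpContextRequest(
                "task-001",
                "src/main/java/demo/OrderService.java", // hypothetical path
                List.of(new LineRange(42, 48), new LineRange(97, 97)));
        System.out.println(request.taskId + " covers "
                + request.diffLineNumberSequence.size() + " changed ranges");
    }
}
```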

Result Structure

The returned JSON contains two main parts linked by unique element IDs: "element relations" (describing which code fragments belong to which elements and their cross‑references) and "element information" (metadata and source code for each element). This design avoids duplicate transmission of the same element data.
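The two‑part layout can be sketched in memory before serialisation. This is an illustrative model only: the map‑based representation, the class name `ContextResultSketch`, and the IDs `m1` to `m3` are hypothetical, and the article does not give the actual JSON field names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ContextResultSketch {
    // "Element relations": element ID -> IDs of the elements it references.
    final Map<String, List<String>> elementRelations = new LinkedHashMap<>();
    // "Element information": element ID -> source text, stored once per element.
    final Map<String, String> elementInformation = new LinkedHashMap<>();

    /** Registers an element; a repeated ID keeps the first entry and adds nothing. */
    void addElement(String id, String source) {
        elementInformation.putIfAbsent(id, source);
    }

    /** Records that one element references another, by ID only. */
    void addReference(String fromId, String toId) {
        elementRelations.computeIfAbsent(fromId, k -> new ArrayList<>()).add(toId);
    }

    public static void main(String[] args) {
        ContextResultSketch result = new ContextResultSketch();
        result.addElement("m1", "void save() { write(); }");
        result.addElement("m2", "void flush() { write(); }");
        result.addElement("m3", "void write() { /* ... */ }");
        result.addReference("m1", "m3");
        result.addReference("m2", "m3");
        // m3 is referenced twice, yet its source appears once in elementInformation.
        System.out.println(result.elementRelations);
        System.out.println(result.elementInformation.keySet());
    }
}
```

Because relations carry only IDs, an element shared by many callers costs one source entry plus a few short identifiers, which is where the token saving comes from.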

Effect Verification

Two error‑prone cases were tested. In a null‑pointer scenario, the diff‑only model could not infer that normalize might return null, while the AST‑augmented model identified the risk and suggested a fix. In a multithreading case, the diff‑only model only warned about parallelStream usage, whereas the context‑rich model examined the write implementation, recognized potential thread‑safety issues, and offered concrete remediation advice. Images in the original article illustrate the before‑and‑after results.
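The null‑pointer case has roughly the following shape. The original test code is not reproduced in this summary, so everything here except the method name normalize is a hypothetical reconstruction: the body of `normalize`, the caller methods, and the guard are assumptions chosen to show why the callee’s source matters.

```java
public class NullReturnCase {

    // Lives outside the diff in this sketch: returns null for blank input.
    static String normalize(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return null;
        }
        return raw.trim().toLowerCase();
    }

    // The changed line as a diff-only reviewer sees it. Nothing in this method
    // reveals that normalize can return null, so the risk is invisible.
    static int lengthUnsafe(String raw) {
        return normalize(raw).length();
    }

    // With normalize's source in context, the missing null check is evident.
    static int lengthSafe(String raw) {
        String value = normalize(raw);
        return value == null ? 0 : value.length();
    }

    public static void main(String[] args) {
        System.out.println(lengthSafe("  Hello "));
        System.out.println(lengthSafe("   "));
        try {
            lengthUnsafe("   ");
        } catch (NullPointerException e) {
            System.out.println("unsafe variant threw NullPointerException");
        }
    }
}
```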

Conclusion

The AST‑driven MCP service supplies a uniform, searchable code‑context layer between code‑analysis systems and large models, mitigating the common context‑deficiency problem in AI code review. Its layered, decoupled design eases integration into existing pipelines and can be extended to other scenarios such as impact analysis and architecture inspection. Future work includes multi‑language AST support, performance optimisation, and simplified deployment for larger codebases.

Tags: Java, AST, MCP, Spring AI, AI code review, Eclipse JDT
Written by

Sohu Tech Products

