Practicing an AST‑Driven MCP Code Context Service for AI Code Review
This article describes an AST‑based code‑context service for AI code review. Exposed to the model through MCP and built with Spring AI and Eclipse JDT, it supplies structured Java code information to large models and closes the context gaps of diff‑only review. Two worked cases show how the added context changes the model's findings.
In everyday development, AI‑assisted code review often sees only the Git diff. A diff omits class structure, method dependencies, and cross‑file call relationships, so the model frequently judges a change without the information it needs. To remedy this, the authors built an AST‑driven service, exposed through a Model Context Protocol (MCP) middleware layer, that provides complete, traceable, structured context to large models.
System Architecture
The backend is divided into three loosely coupled layers: the code‑review main service (entry point, request validation, model invocation, result aggregation), a model‑capability extension layer that implements the Model Context Protocol (MCP) to expose unified interfaces, and a tool layer that encapsulates low‑level capabilities such as AST parsing and Git operations.
Core Technology Choices
MCP is an open protocol introduced by Anthropic for standardising how models interact with tools. The authors implement their MCP service with Spring AI to minimise integration effort and to benefit from auto‑configuration, dependency injection, observability and security features. AST parsing is performed with Eclipse JDT, which supports multiple Java language versions and provides a rich API for extracting classes, methods, variables and call relationships.
MCP Service Workflow
When a review request arrives, the service creates a task ID, fetches the corresponding source version, and invokes the AST module to build a code‑reference index. A compilation‑management module generates and caches compiled artefacts for statically compiled languages, while a source‑management module organises source versions by Git commit. The MCP tool then aggregates these capabilities, trims the data, and returns a JSON payload that the model can consume.
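The workflow above can be sketched in plain Java. This is a minimal illustration, not the authors' implementation: the class ReviewWorkflow, its method names, and the two Function stand-ins for source management and the AST module are all hypothetical, and caching the index per commit is an assumption made here for the example.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch only: every name here is hypothetical, not taken from the article.
class ReviewWorkflow {
    // Stand-in for source management: commit id -> (file path -> source text).
    private final Function<String, Map<String, String>> fetchSources;
    // Stand-in for the AST module: sources -> (element id -> defining file).
    private final Function<Map<String, String>, Map<String, String>> buildIndex;

    private final Map<String, String> commitByTask = new ConcurrentHashMap<>();
    private final Map<String, Map<String, String>> indexByCommit = new ConcurrentHashMap<>();

    ReviewWorkflow(Function<String, Map<String, String>> fetchSources,
                   Function<Map<String, String>, Map<String, String>> buildIndex) {
        this.fetchSources = fetchSources;
        this.buildIndex = buildIndex;
    }

    // Creates a task id and makes sure the index for its commit exists.
    String startTask(String commitId) {
        String taskId = UUID.randomUUID().toString();
        commitByTask.put(taskId, commitId);
        // Assumption: the index is cached per commit, so repeated reviews reuse it.
        indexByCommit.computeIfAbsent(commitId, c -> buildIndex.apply(fetchSources.apply(c)));
        return taskId;
    }

    // What a later MCP tool call would look up by task id.
    Map<String, String> indexFor(String taskId) {
        String commitId = commitByTask.get(taskId);
        if (commitId == null) {
            throw new IllegalArgumentException("unknown task: " + taskId);
        }
        return indexByCommit.get(commitId);
    }
}

class ReviewWorkflowDemo {
    public static void main(String[] args) {
        ReviewWorkflow workflow = new ReviewWorkflow(
            commit -> Map.of("OrderService.java", "class OrderService {}"),
            sources -> Map.of("OrderService", "OrderService.java"));
        String taskId = workflow.startTask("abc123");
        System.out.println(workflow.indexFor(taskId));
    }
}
```

Keying the cache by commit rather than by task means two review tasks against the same commit share one index build, which is one plausible reading of organising versions by Git commit.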
Code‑Context Design
For global variables, the service provides metadata, the defining source, and all references across the codebase. For methods, it supplies the method's metadata, its full source, the calls extracted line by line from its body, and every site that calls it. The design deliberately limits expansion to one level up (callers) and one level down (callees) from the diff‑affected element, to balance information gain against token cost.
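The one-level limit can be illustrated with a small call-graph lookup. This is a sketch under stated assumptions: the map-based call graph, the class OneLevelContext, and the method names in the sample data are hypothetical and stand in for whatever index the AST module actually produces.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Sketch only: the call graph shape and method names are hypothetical.
class OneLevelContext {
    // calls maps each caller to the methods it invokes directly.
    static Set<String> directCallees(Map<String, Set<String>> calls, String method) {
        return new TreeSet<>(calls.getOrDefault(method, Set.of()));
    }

    // One level up: every method whose direct callees include the target.
    static Set<String> directCallers(Map<String, Set<String>> calls, String method) {
        Set<String> callers = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : calls.entrySet()) {
            if (entry.getValue().contains(method)) {
                callers.add(entry.getKey());
            }
        }
        return callers;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> calls = Map.of(
            "OrderController.submit", Set.of("OrderService.save"),
            "OrderService.save", Set.of("OrderRepository.write"),
            "OrderRepository.write", Set.of("Database.execute"));
        String changed = "OrderService.save";
        System.out.println("callers: " + directCallers(calls, changed));
        System.out.println("callees: " + directCallees(calls, changed));
    }
}
```

With OrderService.save as the changed method, only OrderController.submit and OrderRepository.write are selected; Database.execute sits two levels down and is left out, which is where the token saving comes from.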
Interface Specification
The MCP interface accepts a McpContextRequest object (a public class) with three fields: taskId (the global identifier of the review task), analysisFilePath (the target file), and diffLineNumberSequence (a list of start and end line numbers for the changed ranges). The tool's prompt is organised into five sections (overview, input parameters, output description, usage scenarios, and precautions) to guide the model in invoking the tool correctly. Initially the authors tried separate, granular interfaces (for example, get method info by line, or fetch referenced elements), but these proved unstable, which led to a single unified context interface.
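A possible shape for the request is sketched below. The three field names come from the article; the field types, the LineRange helper, the validation rules, and the use of a record instead of a plain public class are assumptions made for this example.

```java
import java.util.List;

// Hypothetical start-end pair; the article only says "list of start-end line numbers".
record LineRange(int start, int end) {
    LineRange {
        if (start < 1 || end < start) {
            throw new IllegalArgumentException("invalid line range: " + start + "-" + end);
        }
    }

    boolean contains(int line) {
        return line >= start && line <= end;
    }
}

// Field names follow the article; the field types are assumptions.
record McpContextRequest(String taskId,
                         String analysisFilePath,
                         List<LineRange> diffLineNumberSequence) {
    McpContextRequest {
        if (taskId == null || taskId.isBlank()) {
            throw new IllegalArgumentException("taskId is required");
        }
        if (analysisFilePath == null || analysisFilePath.isBlank()) {
            throw new IllegalArgumentException("analysisFilePath is required");
        }
        if (diffLineNumberSequence == null || diffLineNumberSequence.isEmpty()) {
            throw new IllegalArgumentException("at least one diff line range is required");
        }
        // Defensive copy so the request stays immutable after construction.
        diffLineNumberSequence = List.copyOf(diffLineNumberSequence);
    }
}

class McpContextRequestDemo {
    public static void main(String[] args) {
        McpContextRequest request = new McpContextRequest(
            "task-001",
            "src/main/java/demo/OrderService.java",
            List.of(new LineRange(42, 48), new LineRange(73, 73)));
        System.out.println(request);
    }
}
```

Rejecting malformed requests at construction time keeps the single unified interface predictable for the model: a bad call fails with a clear message instead of returning partial context.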
Result Structure
The returned JSON contains two main parts linked by unique element IDs: "element relations" (describing which code fragments belong to which elements and their cross‑references) and "element information" (metadata and source code for each element). This design avoids duplicate transmission of the same element data.
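The two-part layout can be sketched as an in-memory structure before it is serialised to JSON. All names below (ElementInfo, ElementRelation, ContextResult, and their fields) are hypothetical; the article does not publish its schema, so this only illustrates the idea of linking relations to element data by ID.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: record and field names are hypothetical, not the article's schema.
record ElementInfo(String id, String kind, String signature, String source) {}

// Relations carry ids only; the source text lives once in the information map.
record ElementRelation(String elementId, List<String> callsIds, List<String> calledByIds) {}

class ContextResult {
    final List<ElementRelation> elementRelations = new ArrayList<>();
    final Map<String, ElementInfo> elementInformation = new LinkedHashMap<>();

    // Registers an element once; later registrations of the same id are ignored.
    void register(ElementInfo info) {
        elementInformation.putIfAbsent(info.id(), info);
    }

    void relate(String elementId, List<String> callsIds, List<String> calledByIds) {
        elementRelations.add(
            new ElementRelation(elementId, List.copyOf(callsIds), List.copyOf(calledByIds)));
    }
}

class ContextResultDemo {
    public static void main(String[] args) {
        ContextResult result = new ContextResult();
        ElementInfo write = new ElementInfo(
            "m2", "method", "void write(Order o)", "void write(Order o) { /* ... */ }");
        result.register(new ElementInfo(
            "m1", "method", "void save(Order o)", "void save(Order o) { write(o); }"));
        result.register(new ElementInfo(
            "m3", "method", "void retry(Order o)", "void retry(Order o) { write(o); }"));
        result.register(write);
        result.register(write); // referenced again, still stored once
        result.relate("m1", List.of("m2"), List.of());
        result.relate("m3", List.of("m2"), List.of());
        System.out.println(result.elementRelations.size() + " relations, "
            + result.elementInformation.size() + " elements");
    }
}
```

Both m1 and m3 refer to m2 by ID, so the source of write appears once in the payload however many relations point at it.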
Effect Verification
Two error‑prone cases were tested. In a null‑pointer scenario, the diff‑only model could not infer that normalize might return null, while the AST‑augmented model identified the risk and suggested a fix. In a multithreading case, the diff‑only model only warned about parallelStream usage, whereas the context‑rich model examined the write implementation, recognized potential thread‑safety issues, and offered concrete remediation advice. Images in the original article illustrate the before‑and‑after results.
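The null‑pointer case can be made concrete with a small reconstruction. This is a hypothetical example written for illustration, not the code shown in the original article's images: only the method name normalize comes from the text, and its body, the two caller methods, and the suggested fix are assumptions.

```java
import java.util.Locale;

// Hypothetical reconstruction for illustration; not the code from the article.
class NormalizeExample {
    // Lives outside the diff, so a diff-only review never sees this body.
    static String normalize(String raw) {
        if (raw == null || raw.isBlank()) {
            return null; // the hidden risk
        }
        return raw.trim().toLowerCase(Locale.ROOT);
    }

    // The kind of changed line a diff would show: harmless in isolation.
    static int keyLengthUnsafe(String raw) {
        return normalize(raw).length();
    }

    // The kind of fix a context-aware review could suggest.
    static int keyLengthSafe(String raw) {
        String normalized = normalize(raw);
        return normalized == null ? 0 : normalized.length();
    }

    public static void main(String[] args) {
        System.out.println(keyLengthSafe("  Order-42 "));
        System.out.println(keyLengthSafe("   "));
    }
}
```

A reviewer who sees only keyLengthUnsafe has no reason to suspect a null return; once the body of normalize is supplied as context, the missing null check is easy to spot.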
Conclusion
The AST‑driven MCP service supplies a uniform, searchable code‑context layer between code‑analysis systems and large models, mitigating the common context‑deficiency problem in AI code review. Its layered, decoupled design eases integration into existing pipelines and can be extended to other scenarios such as impact analysis and architecture inspection. Future work includes multi‑language AST support, performance optimisation, and simplified deployment for larger codebases.
Sohu Tech Products
A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.
