How We Leveraged JVM Agents and JaCoCo to Clean Up Legacy Java Code

This article explains how a long‑standing Java backend service was instrumented with JVM agents and JaCoCo to collect execution coverage, visualize results in an IntelliJ IDEA plugin, and systematically remove dead code, improving maintainability while minimizing impact on production performance.

Alibaba Cloud Developer

1. Background and Motivation

The D service, a legacy backend component dating back to the early Taobao mobile migration, had accumulated a large amount of unused code, which steepened the learning curve for new developers and increased maintenance costs. Removing code by hand was risky, so a tooling-based approach was devised: collect execution coverage from running instances, then color the code in the IDE according to whether it had actually run.

2. Code Coverage Collection

Two primary mechanisms were explored:

JVM Agent: Uses the java.lang.instrument.Instrumentation API to register class file transformers, modify bytecode as classes are loaded, and redefine classes that are already loaded. The agent is loaded at startup through the -javaagent JVM argument, which allows continuous, low-overhead data collection.

Attach API: A separate JVM process attaches to the running target process, loads an agent JAR into it, and performs the instrumentation. This method is more flexible because the target does not need special startup arguments, but it causes a temporary CPU spike at attach time, and the agent must be attached again whenever the target JVM restarts.

Both approaches support dynamic bytecode modification, class loading inspection, and class redefinition.
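A minimal sketch of the agent entry points described above, using only the JDK's java.lang.instrument API. The names LoadTrackingAgent, Tracker, and the prefix argument are illustrative assumptions, not the code used for the D service. The transformer only records which classes pass through it and leaves the bytecode unchanged; a coverage tool such as JaCoCo would return instrumented bytes instead.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class LoadTrackingAgent {
    // Class names seen by the transformer, in internal form (e.g. "com/example/Foo").
    public static final Set<String> SEEN = ConcurrentHashMap.newKeySet();

    /** Records classes whose name starts with a prefix; never changes bytecode. */
    public static final class Tracker implements ClassFileTransformer {
        private final String prefix;

        public Tracker(String prefix) {
            this.prefix = prefix;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            // className can be null for some generated classes, so guard against it.
            if (className != null && className.startsWith(prefix)) {
                SEEN.add(className);
            }
            return null; // null tells the JVM to keep the original bytes
        }
    }

    // Called by the JVM before main() when started with -javaagent:agent.jar=<prefix>.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new Tracker(agentArgs == null ? "" : agentArgs));
    }

    // Called when the agent is loaded into a running JVM through the Attach API.
    public static void agentmain(String agentArgs, Instrumentation inst) {
        premain(agentArgs, inst);
    }
}
```

To be loadable, the agent JAR's manifest needs a Premain-Class entry (for -javaagent) and an Agent-Class entry (for the Attach API) naming this class.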

2.1 JaCoCo Integration

JaCoCo inserts probes into the bytecode at control-flow points and derives line coverage from which probes fired. Each instrumented class keeps its probe states in a boolean array (the synthetic $jacocoData field), and the collected data is written to .exec files. JaCoCo provides both an agent for data collection and a CLI tool for dumping data and generating reports.
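The probe idea can be illustrated by instrumenting a method by hand. This is a conceptual sketch only: ProbeSketch, PROBES, and classify are made-up names, and JaCoCo's real probe placement is decided by its bytecode analysis rather than written in source.

```java
import java.util.Arrays;

public class ProbeSketch {
    // Stand-in for the per-class boolean probe array.
    public static final boolean[] PROBES = new boolean[3];

    // Hand-instrumented version of a method with two branches.
    public static String classify(int n) {
        PROBES[0] = true;        // method entered
        if (n < 0) {
            PROBES[1] = true;    // negative branch executed
            return "negative";
        }
        PROBES[2] = true;        // non-negative branch executed
        return "non-negative";
    }

    public static void main(String[] args) {
        classify(5);
        // Only the entry probe and the non-negative branch probe have fired.
        System.out.println(Arrays.toString(PROBES)); // prints [true, false, true]
    }
}
```

A probe that is still false after a long observation window marks code that never ran in that window, which is the signal the cleanup relies on.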

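// Attach API flow from section 2: attach to a running JVM by PID and load an agent JAR.
// Requires: import com.sun.tools.attach.VirtualMachine; (module jdk.attach)
String jarFile = args[0];
String pid = getPid(args);            // helper (not shown) that resolves the target JVM's process ID
VirtualMachine vm = VirtualMachine.attach(pid);
try {
    vm.loadAgent(jarFile);            // runs the agent's agentmain method in the target JVM
} finally {
    vm.detach();                      // always release the connection to the target JVM
}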

3. Comparison of Approaches

Agent-based instrumentation offers stable, long-term data collection with minimal runtime impact, but the JVM must be restarted once to activate it. Attach-based instrumentation suits ad-hoc analysis on a single machine, but the instrumentation is lost when the target JVM restarts.

4. Deployment Strategy

The final solution adopted the agent method combined with JaCoCo:

Download the JaCoCo runtime JAR into the Docker image.

Add -javaagent:/home/admin/app/jacoco-runtime.jar to the JVM startup parameters, restricting instrumentation to whitelisted packages (D, R, B) to limit overhead.

Periodically dump coverage data to OSS using a custom jacocoDump method that invokes JaCoCo’s FileOutput class.

Merge the .exec data with compiled .class files (obtained via Maven build) to generate detailed XML reports.
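As a sketch of step 2, the helper below assembles a -javaagent argument that restricts instrumentation through the JaCoCo agent's includes option, whose patterns are separated by colons. The class name JacocoAgentArgs and the com.example.* patterns are placeholders; the real package names of D, R, and B are not given in this article, and the actual startup script may set further options.

```java
import java.util.Arrays;
import java.util.List;

public class JacocoAgentArgs {
    /** Builds a -javaagent argument limited to the given class name patterns. */
    public static String build(String agentJar, List<String> includePatterns) {
        // JaCoCo separates options with ',' and include patterns with ':'.
        String includes = String.join(":", includePatterns);
        return "-javaagent:" + agentJar + "=includes=" + includes;
    }

    public static void main(String[] args) {
        // Placeholder patterns standing in for the whitelisted packages.
        List<String> whitelist = Arrays.asList(
                "com.example.d.*", "com.example.r.*", "com.example.b.*");
        System.out.println(build("/home/admin/app/jacoco-runtime.jar", whitelist));
        // prints -javaagent:/home/admin/app/jacoco-runtime.jar=includes=com.example.d.*:com.example.r.*:com.example.b.*
    }
}
```

Classes outside the whitelist are never instrumented, so they carry no probe overhead at all.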

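// Writes the coverage collected so far to filePath; returns false if no agent is available.
// Agent and FileOutput are JaCoCo-internal classes, so this may break across JaCoCo versions.
// buildOptions (not shown) is a project helper expected to set the destination file to filePath.
boolean jacocoDump(String filePath) throws IOException {
    Agent iAgent = Agent.getInstance();
    if (iAgent == null) { return false; }
    AgentOptions options = buildOptions(filePath);
    FileOutput out = new FileOutput();
    out.startup(options, iAgent.getData());   // creates the target file and binds the runtime data
    out.writeExecutionData(true);             // true = reset the probes after writing
    return true;
}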
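The periodic part of step 3 could be driven by a scheduler like the one sketched below. PeriodicDump and DumpTask are hypothetical names; in the real service the task would wrap jacocoDump and the OSS upload, neither of which is reproduced here.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PeriodicDump {
    /** A dump action that may fail, e.g. a file write followed by an upload. */
    public interface DumpTask {
        void run() throws Exception;
    }

    public static ScheduledExecutorService start(DumpTask task, long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "coverage-dump");
            t.setDaemon(true); // never keep the JVM alive just for coverage dumps
            return t;
        });
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                task.run();
            } catch (Exception e) {
                // Log and continue: an uncaught exception would cancel all future runs.
                System.err.println("coverage dump failed: " + e);
            }
        }, period, period, unit);
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger dumps = new AtomicInteger();
        ScheduledExecutorService s = start(dumps::incrementAndGet, 50, TimeUnit.MILLISECONDS);
        Thread.sleep(500);
        s.shutdownNow();
        System.out.println("dumps so far: " + dumps.get());
    }
}
```

A fixed delay rather than a fixed rate keeps slow uploads from piling up, and the daemon thread means coverage collection cannot block a normal shutdown.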

5. Report Generation and IDEA Plugin

The merged coverage data is processed with JaCoCo’s XMLFormatter to produce an XML report, which is then uploaded to OSS. An IntelliJ IDEA plugin consumes this report, providing:

Toggleable visualization of line‑level coverage directly in the editor.

Automatic or manual download of the latest coverage data.

Configurable cache, OSS credentials, and data retention settings.

The plugin registers actions, a project service, and a configuration UI via IntelliJ’s extension points.
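On the consuming side, a plugin needs to turn the XML report into per-file line information. The sketch below uses only the JDK's DOM parser and relies on the line elements of JaCoCo's XML report format, where nr is the line number and ci the count of covered instructions. UncoveredLines and the inline sample report are illustrative; this is not the plugin's actual code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class UncoveredLines {
    /** Maps "package/SourceFile.java" to the line numbers that were never executed. */
    public static Map<String, List<Integer>> parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            // Real reports declare a DOCTYPE; do not try to fetch the external DTD.
            f.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            Document doc = f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            Map<String, List<Integer>> result = new LinkedHashMap<>();
            NodeList files = doc.getElementsByTagName("sourcefile");
            for (int i = 0; i < files.getLength(); i++) {
                Element file = (Element) files.item(i);
                // Each sourcefile element is nested inside its package element.
                String pkg = ((Element) file.getParentNode()).getAttribute("name");
                List<Integer> missed = new ArrayList<>();
                NodeList lines = file.getElementsByTagName("line");
                for (int j = 0; j < lines.getLength(); j++) {
                    Element line = (Element) lines.item(j);
                    // ci == 0 means no instruction on this line was ever executed.
                    if (Integer.parseInt(line.getAttribute("ci")) == 0) {
                        missed.add(Integer.parseInt(line.getAttribute("nr")));
                    }
                }
                result.put(pkg + "/" + file.getAttribute("name"), missed);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("unreadable coverage report", e);
        }
    }

    public static void main(String[] args) {
        String sample = "<report name=\"demo\"><package name=\"com/example\">"
                + "<sourcefile name=\"Foo.java\">"
                + "<line nr=\"3\" mi=\"0\" ci=\"2\" mb=\"0\" cb=\"0\"/>"
                + "<line nr=\"7\" mi=\"4\" ci=\"0\" mb=\"0\" cb=\"0\"/>"
                + "</sourcefile></package></report>";
        System.out.println(parse(sample)); // prints {com/example/Foo.java=[7]}
    }
}
```

The resulting map is the kind of structure an editor highlighter can use to mark never-executed lines as removal candidates.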

Code coverage visualization

6. Results

Applying this pipeline to applications D, R, and B reduced dead code by 71 % in B, 43 % in R, and 8 % in D, demonstrating the effectiveness of automated coverage‑driven cleanup.

7. Lessons Learned

Understanding JaCoCo’s internals (visitor pattern, ASM bytecode manipulation) was essential.

Accurate class‑loader analysis is critical for hot‑deployment scenarios.

AI‑generated code can accelerate prototyping but may introduce complexity; manual debugging of IDEA internals remained necessary.

8. References

JaCoCo Agent Documentation: https://www.eclemma.org/jacoco/trunk/doc/agent.html

JaCoCo CLI Documentation: https://www.eclemma.org/jacoco/trunk/doc/cli.html

IntelliJ Plugin Development Guide: https://plugins.jetbrains.com/docs/intellij/plugins-quick-start.html
