Turn Natural Language into Safe Java Diagnostics with Arthas Agent

Arthas Agent adds an AI layer to Arthas, the popular Java diagnostic tool. It translates natural-language requests into pre-validated Arthas commands and combines safety mechanisms, a skill-first design, and evidence-based analysis for fast JVM troubleshooting.

Big Data Technology & Architecture

Background

Arthas is an open‑source Java diagnostic tool from Alibaba that has become widely used (over 37,000 GitHub stars). Its commands (e.g., watch, trace, tt, vmtool, ognl) are powerful but require frequent reference to documentation.

What’s New in Arthas 4.x

Support for the Model Context Protocol (MCP), enabling seamless integration with large AI models.

Re‑architected MCP server using a Streamable/Stateless single‑service design.

New MCP utilities such as sc and viewfile that can be invoked via MCP.

Enhanced watch / trace with ClassLoader hash support and a -M/--sizeLimit option.

Upgraded async‑profiler for significantly better performance analysis.

Arthas Agent Overview

Arthas Agent is an AI‑driven diagnostic layer built on top of Arthas. It translates natural‑language requests into safe Arthas commands, executes them, and returns analysis‑driven suggestions.

Natural‑language understanding: converts user intent into precise Arthas commands.

Safety‑first execution: strict parameter validation, whitelist, and permission control.

Intelligent analysis: derives diagnostic advice from command output.

Core Design Principles

Skill‑First

Each diagnostic capability is packaged as a well‑defined Skill with clear boundaries, predefined parameter templates, and explicit output formats, ensuring predictability.

User request → AI understanding → Select the appropriate Skill → Run the diagnosis
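The flow above can be sketched in a few lines. This is a minimal illustration, not Arthas Agent's actual implementation: the `Skill` class, the `SKILLS` list, and the keyword-based `select_skill` function are all hypothetical, and the keyword match merely stands in for the AI step.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class Skill:
    """One diagnostic capability with a fixed command template (hypothetical shape)."""
    name: str
    keywords: tuple      # words that suggest this skill applies
    template: Template   # Arthas command with named placeholders
    output_format: str   # how the result is presented to the user

    def render(self, **params: str) -> str:
        # substitute() raises KeyError when a placeholder is missing,
        # so a half-filled command is never produced.
        return self.template.substitute(**params)


SKILLS = [
    Skill("top_cpu_threads", ("cpu", "thread"),
          Template("thread -n $count"), "thread summary"),
    Skill("watch_method", ("monitor", "watch"),
          Template("watch $cls $method '{params, returnObj, throwExp}' -n 1 -x 3"),
          "call record"),
]


def select_skill(request: str) -> Skill:
    """Stand-in for the AI step: pick the skill with the most keyword hits."""
    text = request.lower()
    best = max(SKILLS, key=lambda s: sum(k in text for k in s.keywords))
    if not any(k in text for k in best.keywords):
        raise LookupError("no skill matches the request")
    return best


skill = select_skill("Which thread is using the most CPU?")
command = skill.render(count="3")
print(skill.name, "->", command)
```

Because every command comes from a template, the set of commands the agent can emit is bounded by the skill list rather than by whatever the model generates.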

Safety‑First

Parameter whitelist: only safe parameter combinations are allowed.

Execution count limit: default -n 1 to avoid high‑frequency sampling.

Dangerous operation interception: blocks commands that could affect business services.

Timeout protection: automatically terminates long‑running commands.
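A rough sketch of how the first three checks could be combined into one gate is shown below. The command sets and the cap of 5 are illustrative assumptions, not the agent's real rules, and the function names are hypothetical. Timeout handling is left out because it depends on how commands are dispatched.

```python
import shlex

# Illustrative sets only; the real agent's lists are not given in this article.
ALLOWED_COMMANDS = {"dashboard", "thread", "memory", "jvm", "sc", "watch", "trace"}
BLOCKED_COMMANDS = {"redefine", "retransform", "heapdump", "stop"}
SAMPLING_COMMANDS = {"watch", "trace"}   # for these, -n limits how many calls are captured
MAX_COUNT = 5                            # assumed upper bound for -n


def check_command(command: str) -> str:
    """Return a command that may be executed, or raise ValueError."""
    tokens = shlex.split(command)
    if not tokens:
        raise ValueError("empty command")
    name = tokens[0]
    if name in BLOCKED_COMMANDS:
        raise ValueError(f"blocked command: {name}")
    if name not in ALLOWED_COMMANDS:
        raise ValueError(f"command not on the whitelist: {name}")
    if name in SAMPLING_COMMANDS:
        if "-n" not in tokens:
            return f"{command} -n 1"          # apply the default count limit
        position = tokens.index("-n") + 1
        if position >= len(tokens) or not tokens[position].isdigit():
            raise ValueError("-n needs a numeric value")
        if int(tokens[position]) > MAX_COUNT:
            raise ValueError(f"-n exceeds the limit of {MAX_COUNT}")
    return command


def is_allowed(command: str) -> bool:
    """Convenience wrapper that reports the verdict without raising."""
    try:
        check_command(command)
        return True
    except ValueError:
        return False


print(check_command("watch com.example.OrderService createOrder '{params}'"))
print(is_allowed("heapdump /tmp/dump.hprof"))
```

Note that `thread -n 3` passes unchanged: for `thread`, `-n` selects the busiest threads rather than limiting a sampling count, so the cap is applied only to `watch` and `trace`.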

Evidence‑Based

Execute command → Collect output → Analyze data → Draw conclusion

All conclusions are derived from actual command output.
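One way to enforce this rule is to make every conclusion carry the lines it relies on and to reject any conclusion whose evidence is not present in the output. The sketch below assumes a hypothetical `Conclusion` record and made-up sample output; it is not taken from Arthas Agent itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conclusion:
    statement: str    # what the agent tells the user
    command: str      # the Arthas command that produced the output
    evidence: tuple   # lines quoted from that output


def verify(conclusion: Conclusion, output: str) -> bool:
    """Accept a conclusion only if it quotes at least one line, and every quoted line is in the output."""
    lines = {line.strip() for line in output.splitlines()}
    return bool(conclusion.evidence) and all(
        quote.strip() in lines for quote in conclusion.evidence
    )


# Made-up output used purely for illustration.
output = "heap used 1800M of 2048M\nnonheap used 120M"

grounded = Conclusion("Heap is close to its limit", "memory",
                      ("heap used 1800M of 2048M",))
ungrounded = Conclusion("A memory leak exists in OrderCache", "memory", ())

print(verify(grounded, output), verify(ungrounded, output))
```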

Hands‑On Practice

Scenario 1: Locate High‑CPU Threads

Traditional commands:

# 1. Show the threads with the highest CPU usage
thread -n 3
# 2. Print the stack of a thread by its ID
thread <tid>
# 3. Read the stack to locate the problematic code

Arthas Agent request:

User: Show me which thread is using the most CPU and analyze what it is doing

The agent runs thread -n 3, fetches the top threads, analyzes their stacks, and returns a diagnostic summary.
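The analysis step can be approximated with a small parser. The sample text below is hypothetical and only loosely modeled on what `thread -n` prints; the regular expressions assume a quoted thread name, a `cpuUsage=…%` field, and indented `at …` stack frames, so they would need adjusting for real output.

```python
import re

# Hypothetical sample, loosely modeled on `thread -n 3` output.
SAMPLE = '''
"http-nio-8080-exec-3" Id=42 cpuUsage=87.5% deltaTime=175ms time=9120ms RUNNABLE
    at com.example.OrderService.calculate(OrderService.java:88)
    at com.example.OrderController.create(OrderController.java:31)

"GC Thread#0" [Internal] cpuUsage=4.1% deltaTime=8ms time=310ms
'''

HEADER = re.compile(r'^"(?P<name>[^"]+)".*?cpuUsage=(?P<cpu>[\d.]+)%')
FRAME = re.compile(r'^\s+at\s+(?P<frame>\S+)')


def parse_threads(output: str) -> list:
    """Group stack frames under the thread header line that precedes them."""
    threads = []
    for line in output.splitlines():
        header = HEADER.match(line)
        if header:
            threads.append({"name": header["name"],
                            "cpu": float(header["cpu"]),
                            "frames": []})
            continue
        frame = FRAME.match(line)
        if frame and threads:
            threads[-1]["frames"].append(frame["frame"])
    return threads


def summarize(threads: list) -> str:
    """Name the busiest thread and the frame at the top of its stack."""
    hottest = max(threads, key=lambda t: t["cpu"])
    top = hottest["frames"][0] if hottest["frames"] else "no Java stack available"
    return f'{hottest["name"]} uses {hottest["cpu"]}% CPU, currently in {top}'


threads = parse_threads(SAMPLE)
print(summarize(threads))
```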

Scenario 2: Method Execution Tracing

Traditional command:

watch com.example.OrderService createOrder '{params, returnObj, throwExp}' -n 1 -x 3

Arthas Agent request:

User: Monitor the OrderService.createOrder method and show me its arguments and return value

The agent builds a safe watch command with appropriate parameters and formats the output.
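A command builder along these lines might validate each piece before assembling the string. The function name, the allowed field set, and the numeric caps are assumptions made for illustration. The sketch also assumes that Arthas accepts single-quoted arguments, as in the traditional command above, and therefore rejects conditions that contain a single quote.

```python
import re
import shlex

CLASS_NAME = re.compile(r"^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*$")
METHOD_NAME = re.compile(r"^[A-Za-z_$][\w$]*$")
ALLOWED_FIELDS = {"params", "returnObj", "throwExp", "#cost"}


def build_watch(cls, method, fields=("params", "returnObj", "throwExp"),
                condition=None, count=1, depth=3):
    """Assemble a watch command from validated parts (illustrative limits)."""
    if not CLASS_NAME.match(cls):
        raise ValueError(f"invalid class name: {cls!r}")
    if not METHOD_NAME.match(method):
        raise ValueError(f"invalid method name: {method!r}")
    unknown = set(fields) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"fields not allowed: {sorted(unknown)}")
    if not 1 <= count <= 5 or not 1 <= depth <= 4:
        raise ValueError("count or depth outside the assumed limits")
    parts = ["watch", cls, method, shlex.quote("{" + ", ".join(fields) + "}")]
    if condition is not None:
        if "'" in condition:
            raise ValueError("single quotes are not supported in conditions")
        parts.append(shlex.quote(condition))
    parts += ["-n", str(count), "-x", str(depth)]
    return " ".join(parts)


basic = build_watch("com.example.OrderService", "createOrder")
print(basic)

# Hypothetical condition: assumes the first argument exposes an orderId string.
filtered = build_watch("com.example.OrderService", "createOrder",
                       fields=("params", "returnObj", "#cost"),
                       condition='params[0].orderId.contains("TEST")')
print(filtered)
```

The first call reproduces the traditional command shown above; the second adds a condition expression and `#cost` to capture execution time.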

Scenario 3: JVM Health Check

Request: “Run a JVM health check for me.”

The agent sequentially executes:

1. dashboard   # Show overall status
2. memory      # Check memory usage
3. thread      # Analyze thread states
4. jvm         # Get JVM parameters

and produces a complete health report.
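The sequence above could be driven by a loop like the following sketch. The `execute` callable is a hypothetical stand-in for whatever sends a command to Arthas, and the report structure is an assumption. `dashboard` is given `-n 1` here so that it prints once instead of refreshing continuously.

```python
from typing import Callable, Dict

HEALTH_CHECK_STEPS = [
    ("dashboard -n 1", "overall status"),   # -n 1: print once, then return
    ("memory", "memory usage"),
    ("thread", "thread states"),
    ("jvm", "JVM parameters"),
]


def run_health_check(execute: Callable[[str], str]) -> Dict[str, dict]:
    """Run each step in order and keep going when a single step fails."""
    report = {}
    for command, purpose in HEALTH_CHECK_STEPS:
        try:
            report[command] = {"purpose": purpose, "ok": True,
                               "output": execute(command)}
        except Exception as exc:   # one failing step should not hide the others
            report[command] = {"purpose": purpose, "ok": False,
                               "output": str(exc)}
    return report


def fake_execute(command: str) -> str:
    """Hypothetical executor used only to make the sketch runnable."""
    if command == "jvm":
        raise TimeoutError("command timed out")
    return f"<output of {command}>"


report = run_health_check(fake_execute)
for command, entry in report.items():
    status = "ok" if entry["ok"] else "failed"
    print(f"{command:<15} {entry['purpose']:<15} {status}")
```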

Getting Started with Arthas Agent

Prerequisites

Java 8+ runtime.

Arthas 4.x installed.

An AI client that supports MCP (e.g., Claude Desktop, Cursor).

Start the Arthas MCP Server

# Download the latest Arthas boot jar
curl -O https://arthas.aliyun.com/arthas-boot.jar

# Launch with MCP support
java -jar arthas-boot.jar --mcp

Configure the AI Client

{
  "mcpServers": {
    "arthas": {
      "url": "http://localhost:8563/mcp"
    }
  }
}
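If the client already has other MCP servers configured, the entry can be merged in rather than pasted over the file. The helper below is a sketch under that assumption. The configuration file location differs between clients, so the path is passed in, and the demonstration uses a temporary file with a made-up existing entry.

```python
import json
import tempfile
from pathlib import Path


def add_arthas_server(config_path: Path,
                      url: str = "http://localhost:8563/mcp") -> dict:
    """Add or update the "arthas" entry while keeping other servers intact."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    config.setdefault("mcpServers", {})["arthas"] = {"url": url}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demonstration against a throwaway file; a real client keeps its
# configuration at a client-specific path.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "mcp.json"
    path.write_text('{"mcpServers": {"other": {"url": "http://localhost:9000/mcp"}}}',
                    encoding="utf-8")
    merged = add_arthas_server(path)
    print(json.dumps(merged, indent=2))
```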

Prompt Best Practices

Use clear patterns when speaking to the agent:

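CPU analysis: "Show me what the 3 threads with the highest CPU usage are doing"
Method tracing: "Monitor the XxxService.xxxMethod method and show me its arguments and return value"
Class search: "Find the UserService class for me"
Bean lookup: "Look up xxxBean in the Spring container"
Health check: "Run a JVM health check"

Advanced example:
User: Monitor the OrderService.createOrder method, but only for requests whose orderId contains "TEST"; record the arguments, return value, and execution time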

Important Notes

Use cautiously in production; verify in a test environment first.

Avoid high‑frequency sampling; the default -n 1 is recommended.

Watch output may contain sensitive data; apply redaction as needed.

Remember to stop or exit Arthas after diagnostics are complete.
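For the redaction note, a simple pattern-based filter can mask fields before output is shown or logged. The key list and the sample text are illustrative assumptions: the sample only imitates the `name=@Type[value]` style of watch output, and real data may need stricter rules.

```python
import re

# Illustrative key fragments; extend to match the fields your services use.
SENSITIVE = ("password", "token", "secret", "phone")

FIELD = re.compile(
    r"(\w*(?:" + "|".join(SENSITIVE) + r")\w*)="   # field name containing a sensitive fragment
    r"(?:@\w+\[[^\]]*\]|[^,\s\]]+)",               # @Type[value] or a bare value
    re.IGNORECASE,
)


def redact(text: str) -> str:
    """Replace the value of every sensitive-looking field with ***."""
    return FIELD.sub(lambda m: f"{m.group(1)}=***", text)


# Hypothetical sample imitating watch output for an order object.
sample = """@Order[
    orderId=@String[TEST-1001],
    phone=@String[13800000000],
    userPassword=@String[hunter2],
]"""

cleaned = redact(sample)
print(cleaned)
```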

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
