What Exactly Does Claude Code Send When You Type “Hello”?

The article walks through configuring a custom model in Claude Code, installing the claude‑tap plugin, launching the tool, sending the message “Hello”, and then dissecting the resulting request to reveal token counts, latency, tool list, system prompts, message payload, and a lingering cache issue.


Setting a custom model in Claude Code

The official Claude Opus 4.7 series is often inaccessible, so this walkthrough uses the open-source model inclusionai/ling-2.6-flash:free instead. The model was announced by Ant Group, is free on OpenRouter until 30 April, and has been validated in frameworks such as Nanobot, Kilo Code, and autonovel for tasks like web-page generation, document drafting, chat extraction, and long-form novel writing. Model URL: https://modelscope.cn/models/inclusionAI/Ling-2.6-flash.

Installing the claude-tap plugin locally

The plugin source is hosted at https://github.com/liaohch3/claude-tap and requires a Python environment. Installation uses the uv tool, which manages dependencies and virtual environments:

uv add claude-tap

Configuration details

The settings.json file has the ANTHROPIC_BASE_URL entry temporarily removed; it will be supplied via command‑line proxy later. Two terminal windows are opened with ghostty:
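The removed entry would have looked roughly like this (the env block is Claude Code's documented mechanism for environment settings in settings.json; the value shown is an assumption based on the proxy target used below):

```json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://openrouter.ai/api"
  }
}
```

Dropping it from settings.json means Claude Code has no upstream of its own, so all traffic goes through whatever base URL the proxy launch supplies.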

claude-tap --tap-no-launch --tap-target https://openrouter.ai/api --tap-port 8080 --tap-live

and

claude-tap --tap-live --tap-target https://openrouter.ai/api -- --dangerously-skip-permissions --model "inclusionai/ling-2.6-flash:free"

The first command starts the proxy service; the second launches Claude Code with the custom model.
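claude-tap's internals aren't shown in the article; a minimal sketch of the pass-through idea is below. The tap listens locally (port 8080 here), records each request, and replays it against the `--tap-target` base URL. The `forward_url` helper name is illustrative, not taken from the plugin's source:

```python
from urllib.parse import urlsplit, urlunsplit

def forward_url(target: str, path: str, query: str = "") -> str:
    """Map an incoming request path onto the upstream target.

    A tap proxy records the request it receives locally, then forwards
    it unchanged to the configured target base URL.
    """
    scheme, netloc, base_path, _, _ = urlsplit(target)
    full_path = base_path.rstrip("/") + "/" + path.lstrip("/")
    return urlunsplit((scheme, netloc, full_path, query, ""))

# A request Claude Code sends to the local tap at /v1/messages
# is replayed against the configured target:
print(forward_url("https://openrouter.ai/api", "/v1/messages"))
# https://openrouter.ai/api/v1/messages
```

Because the second command launches Claude Code through the tap, every Messages API call is captured before being forwarded, which is what makes the Live viewer below possible.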

Launching Claude Code and sending “你好” (“Hello”)

After the proxy is running, Claude Code is started and the user types the message “你好” (“Hello”). The browser automatically opens a Live viewer page where the request can be inspected.

Inspecting the plugin request information

Input tokens (including the message “你好”): 23,849

Request latency: 3.4 seconds

Tools listed on the right side: local operations, workflow management, external access, interaction settings

The system prompt is split into four parts (shown in the original screenshots).

The “Message” field contains the user’s “你好” plus additional metadata shown in the screenshot.
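The on-the-wire shape is an Anthropic-style Messages API body. A hedged reconstruction of the request is sketched below; the field names follow the public Messages API, but every concrete value (tool name, prompt text, max_tokens) is an illustrative placeholder, not what the screenshots captured:

```python
import json

# Sketch of an Anthropic-style Messages API request body.
# Field names follow the public Messages API; values are placeholders.
request_body = {
    "model": "inclusionai/ling-2.6-flash:free",
    "max_tokens": 4096,
    # The system prompt arrives as structured blocks (the article notes
    # it is split into four parts in the Live viewer).
    "system": [
        {"type": "text", "text": "<system prompt part 1>"},
        {"type": "text", "text": "<system prompt part 2>"},
    ],
    # Tool definitions account for much of the ~24k input tokens.
    "tools": [
        {
            "name": "Read",  # illustrative name, not from the capture
            "description": "Read a file from the local filesystem.",
            "input_schema": {"type": "object", "properties": {}},
        }
    ],
    "messages": [
        {"role": "user", "content": "你好"},
    ],
}

print(json.dumps(request_body, ensure_ascii=False, indent=2))
```

Seen this way, the ~24k input tokens for a two-character greeting stop being surprising: the system prompt blocks and the tool schemas dominate the payload, not the user message.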

The response is the raw model API output; SSE/JSON handling is not examined further
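Although the stream itself isn't decoded in the article, responses from streaming chat APIs are typically server-sent events. A minimal parser sketch, assuming the standard `data: {...}` framing with a `[DONE]` terminator (the sample stream is invented, not a captured response):

```python
import json

def parse_sse(raw: str):
    """Yield decoded JSON payloads from a server-sent-events stream.

    Assumes the common `data: {...}` framing used by streaming chat
    APIs, where a literal `[DONE]` payload marks the end of the stream.
    """
    for line in raw.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        yield json.loads(payload)

# Illustrative stream, not a captured response:
stream = 'data: {"type": "message_start"}\n\ndata: [DONE]\n'
print(list(parse_sse(stream)))  # [{'type': 'message_start'}]
```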

Conclusion and open issue

A caching problem was observed: although input and output token counts align, two cache‑related fields remain empty. The issue appears tied to certain configuration settings and requires further investigation.
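In Anthropic's usage reporting, prompt caching surfaces as `cache_creation_input_tokens` and `cache_read_input_tokens` alongside the ordinary token counts; a small check along these lines makes the empty-field symptom concrete (the sample values are invented, not taken from the capture):

```python
def cache_status(usage: dict) -> str:
    """Summarize prompt-cache activity from a Messages API usage object."""
    created = usage.get("cache_creation_input_tokens") or 0
    read = usage.get("cache_read_input_tokens") or 0
    if created == 0 and read == 0:
        return "no caching observed"
    return f"cache write: {created} tokens, cache read: {read} tokens"

# The symptom described above: token counts present, cache fields empty.
usage = {"input_tokens": 23849, "output_tokens": 120}
print(cache_status(usage))  # no caching observed
```

If the cache fields stay at zero across repeated requests with an identical system prompt and tool list, the prompt-cache configuration (rather than the payload) is the likely culprit, which matches the article's suspicion.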

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: AI model, Claude Code, token usage, claude-tap, Ling-2.6-flash, request inspection
Written by Code Mala Tang

Read source code together, write articles together, and enjoy spicy hot pot together.