Master AI Coding: From Token Mechanics to Practical Best Practices

This comprehensive guide explains the underlying principles of AI coding assistants—including token calculation, tool calling, codebase indexing, and Merkle tree synchronization—while offering actionable best‑practice recommendations for prompt engineering, incremental development, documentation, security compliance, and effective team adoption.

DaTaobao Tech

Introduction

This article systematically shares how to use AI programming tools efficiently, covering underlying mechanisms (such as token calculation, tool calls, codebase indexing, and Merkle trees), methods to improve conversation quality (rules, progressive development), practical scenarios (code search, diagram generation, problem diagnosis), and recommended coding best practices (documentation, naming, security compliance) for developers of all experience levels.

2.1 Token Calculation Mechanism

Token usage is calculated as follows: Initial Tokens = SystemPrompt + User Question + Rules + Conversation History. The user question includes the entered text plus any added context (images, project directory, files). Each tool invocation then appends its result to the context, so Total Tokens = Initial Input + All Tool Call Results.

Initial Token composition:
SystemPrompt + User Question + Rules + Conversation History

2.1.2 Example

Scenario: The user pastes a code snippet and an image, asking "What is wrong with this function?" The AI calls tools to analyze the code.

Initial Tokens = 500 (SystemPrompt) + 200 (User Question) + 800 (Rules) + 300 (Conversation History) = 1800 tokens
Tool Call Results = 2000 (File Content) + 1500 (Search Results) + 300 (Lint Results) = 3800 tokens
Total Tokens = 5600 tokens
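The arithmetic above can be sketched as a small helper. The token counts are the illustrative figures from the example, not real tokenizer output; actual counts come from the provider's tokenizer:

```python
# Illustrative token accounting using the example figures above.
# Real token counts come from the model provider's tokenizer.

def initial_tokens(system_prompt, question, rules, history):
    """Initial input = system prompt + user question + rules + history."""
    return system_prompt + question + rules + history

def total_tokens(initial, tool_results):
    """Total = initial input + the result of every tool call."""
    return initial + sum(tool_results)

initial = initial_tokens(system_prompt=500, question=200, rules=800, history=300)
total = total_tokens(initial, tool_results=[2000, 1500, 300])
print(initial, total)  # 1800 5600
```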

2.2 Tool Calls

AI assistants use various tools to interact with the codebase:

read_file: Reads file content, with optional line ranges.

codebase_search: Performs semantic search across the repository.

edit_file: Proposes precise edits, using placeholder comments for unchanged code.

list_dir, grep_search, file_search, delete_file, run_terminal_cmd, web_search, diff_history: Additional utilities for file management, shell access, and web queries.
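Tools like these are typically exposed to the model as JSON schemas, and the assistant maps the model's tool-call requests onto real implementations. The sketch below is hypothetical: the schema shape mirrors common function-calling APIs, and the dispatch table is an assumption about how such a loop might be wired, not any product's actual code:

```python
# Hypothetical sketch: declaring a read_file tool schema and dispatching
# a tool call to a local implementation. The schema format mirrors common
# function-calling APIs; the exact wiring in real assistants may differ.

READ_FILE_TOOL = {
    "name": "read_file",
    "description": "Read file content, optionally limited to a line range.",
    "parameters": {
        "type": "object",
        "properties": {
            "path": {"type": "string"},
            "start_line": {"type": "integer"},
            "end_line": {"type": "integer"},
        },
        "required": ["path"],
    },
}

def read_file(path, start_line=None, end_line=None):
    """Return the file's text, sliced to the 1-based line range if given."""
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()
    lo = (start_line - 1) if start_line else 0
    hi = end_line if end_line else len(lines)
    return "".join(lines[lo:hi])

# Dispatch table mapping tool names (as the model emits them) to functions.
TOOLS = {"read_file": read_file}

def call_tool(name, arguments):
    return TOOLS[name](**arguments)
```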

Excerpt from such a system prompt:

You are a powerful agentic AI coding assistant, powered by [some model]...

2.3 Codebase Retrieval

Cursor indexes the entire codebase by chunking files into meaningful fragments (functions, classes) and converting them into vector embeddings stored in a vector database. Queries are also vectorized, enabling fast similarity search to retrieve relevant code snippets.

Steps:
1. Sync workspace files to the Cursor server.
2. Split files into logical chunks.
3. Convert chunks to vectors.
4. Store vectors in a database.
5. Perform similarity search on query vectors.
6. Return matching file paths and line numbers.
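Steps 3 through 6 can be sketched in miniature. Here a toy word-frequency "embedding" stands in for a real embedding model, and a plain list stands in for the vector database; the file paths are invented for illustration:

```python
# Minimal sketch of steps 3-6: embed chunks, store vectors, and retrieve
# by cosine similarity. A toy bag-of-words vector replaces a real
# embedding model, and a list replaces the vector database.
import math
from collections import Counter

def embed(text):
    """Toy embedding: a word-frequency vector (real systems use a model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

index = []  # step 4: the "vector database" of (path, line, vector)

def index_chunk(path, line, text):
    index.append((path, line, embed(text)))  # steps 2-3: chunk and embed

def search(query, top_k=3):
    """Steps 5-6: embed the query, rank by similarity, return locations."""
    qv = embed(query)
    ranked = sorted(index, key=lambda e: cosine(qv, e[2]), reverse=True)
    return [(path, line) for path, line, _ in ranked[:top_k]]

index_chunk("auth/login.py", 10, "def login(user, password): verify credentials")
index_chunk("billing/invoice.py", 42, "def render_invoice(order): format totals")
print(search("how are user credentials verified")[0])  # ('auth/login.py', 10)
```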

Merkle Tree

Merkle trees provide efficient verification and incremental synchronization by hashing file contents. Changes are detected by comparing hashes, allowing only modified files to be uploaded.

Merkle Tree benefits:
- O(log n) verification.
- Detects data tampering.
- Supports incremental sync.
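The incremental-sync idea can be shown with content hashes: compare the root hash first, and only when it differs descend to find the changed leaves. This sketch flattens the tree to two levels (file hashes and a single root) to keep it short; real Merkle trees hash pairs of children recursively:

```python
# Sketch of incremental sync via content hashes: compare the root first,
# then diff leaves only when the roots differ. A flat two-level "tree"
# (file hashes -> root) keeps the illustration short.
import hashlib

def file_hash(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()

def root_hash(file_hashes: dict) -> str:
    """Hash of all leaf hashes in a stable order (the Merkle root)."""
    joined = "".join(h for _, h in sorted(file_hashes.items()))
    return hashlib.sha256(joined.encode()).hexdigest()

def changed_files(old: dict, new: dict) -> list:
    """If the roots match, nothing needs uploading; otherwise diff leaves."""
    if root_hash(old) == root_hash(new):
        return []
    return sorted(p for p in new if old.get(p) != new[p])

old = {"a.py": file_hash(b"print(1)"), "b.py": file_hash(b"print(2)")}
new = dict(old, **{"b.py": file_hash(b"print(3)")})
print(changed_files(old, new))  # ['b.py']
```

Because only the differing leaves are uploaded, an edit to one file costs one hash comparison at the root plus the upload of that single file.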

2.4 Prompt for the LLM

The prompt includes system instructions, proactive behavior guidelines, tool usage policies, and task management rules. It emphasizes concise responses, security best practices, and adherence to project-specific conventions.

You are Claude Code, Anthropic's official CLI for Claude. Follow the instructions below and use the available tools to assist the user.

2.5 Claude Code CLI Basics

Because Claude is unavailable in some regions, compatible alternatives such as Qwen3‑Coder can be used instead. Two environment variables redirect the CLI to the alternative provider's endpoint and API key:

export ANTHROPIC_BASE_URL=https://dashscope.aliyuncs.com/api/v2/apps/claude-code-proxy
export ANTHROPIC_AUTH_TOKEN="your_key"

2.6 Improving Conversation Quality

Key actions:

Clear problem description: Specify function, file, and module names to help the model retrieve relevant code.

Manage context length: Monitor token usage (e.g., 18% of the window) and use larger context models when needed.

Use revert or new conversations: Start fresh dialogs for unrelated topics.

Provide diverse information: Include code, images, web links, and repository history.

3 Practical Applications

3.1 Quick Project Familiarization & Natural Language Code Search

Examples of queries:

Explain each module and show dependency graph.

Identify where to place new functionality.

Search for specific feature implementation and provide code snippet.

3.2 Diagram Generation (PlantUML / Mermaid)

Provide UI flow screenshots and requirements; the AI generates accurate diagrams.


3.3 Problem Diagnosis

When unfamiliar with a repository, the AI can read relevant files, analyze code, and suggest fixes.


3.4 Adding Web Information to Context

Use @Web or paste links to let the model summarize external articles.


4 Recommendations

4.1 Rule Creation and Optimization

Define project‑specific rules (technology stack, coding standards) in a .cursor directory. Types include Always, Auto Attached, Agent Requested, and Manual.

Project rule examples:
Always: "Use Java 8; Maven multi‑module project."
Auto Attached: included when files match *.java.
Manual: inserted explicitly with @ruleName.
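As an illustration, a project rule lives in a file under .cursor/rules/. The frontmatter fields below (description, globs, alwaysApply) follow Cursor's rule-file format, but this particular file and its contents are hypothetical:

```
---
description: Java coding standards for this service
globs: ["**/*.java"]
alwaysApply: false
---
Use Java 8 language features only; the build is a Maven multi-module project.
Follow the existing package layout when adding new classes.
```

With globs set and alwaysApply false, the rule behaves as Auto Attached: it is pulled into context only when a matching file is part of the conversation.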

4.2 Documentation

Maintain essential docs in the repository root:

README.md: Overview, features, quick start.

CHANGELOG.md: Release notes.

ARCHITECTURE.md: Architecture diagram and module breakdown.

4.3 Comments and Naming

Provide clear method and parameter documentation.

Add usage scenarios and inline comments.

Use unambiguous names.

Mark AI‑generated files with @author AI Assistant when >80% content is AI‑written.

4.4 Security and Compliance

Follow company policies for AI usage, ensuring data privacy and compliance.

4.5 Promotion

Share internal repositories with clearly defined project rules, user rules, and documentation to enable consistent AI assistance across teams.

