Should You Build the Agent Framework First, Then Fine‑Tune System Prompts?

The article explains what a System Prompt is, how it differs from User Prompts, its role in LLM APIs, caching benefits, common pitfalls, and best‑practice designs across Claude Code, Cursor, Codex CLI, and Gemini CLI, ending with testing and version‑control recommendations.

AI Engineer Programming

May 12, 2026

Why build the agent framework first?

AI-assisted coding makes building AI agents increasingly easy: with a vibe-coding workflow you pick a model, attach a few tools, write a bit of glue code, and an agent runs. Many developers instinctively set up the framework first and tune the prompts gradually afterwards, often using AI to generate those prompts as well.

What a System Prompt actually does

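When calling an LLM API, the payload is a list of structured messages, each with a role field. The System Prompt is the first message (role "system"), so the model reads it before any other token. A chat template then serializes the messages into a single token sequence. Two common template families:

Llama 2 style:
<s>[INST] <<SYS>>
You are xxxx, a helpful assistant
<</SYS>>
User question [/INST]

ChatML style:
<|im_start|>system
...
<|im_end|>
<|im_start|>user
...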
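
To make the link between the role-tagged message list and the serialized form concrete, here is a minimal sketch that renders messages in the ChatML layout shown above. The function name `render_chatml` and the sample messages are hypothetical; real inference stacks apply the template shipped with the model's tokenizer rather than hand-written code like this.

```python
from typing import Dict, List


def render_chatml(messages: List[Dict[str, str]]) -> str:
    """Serialize role-tagged messages into a ChatML-style string."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    # Leave an open assistant turn so the model continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "User question"},
]
prompt = render_chatml(messages)
print(prompt)
```

Because the system message is first in the list, its tokens land at the very start of the serialized prompt, which is what makes it a stable prefix.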

In the Transformer architecture each new token attends to the whole context, so the System Prompt is naturally part of the attention window. However, position alone does not grant extra importance: attention weights are determined by content relevance, and rotary position embeddings (RoPE), which encode relative position, give no special privilege to early tokens. Long-context models also tend to lose information located in the middle of the sequence.

At the engineering level the inference engine can keep the KV-Cache for the System Prompt and reuse it; when the context window is trimmed, this prefix is kept while the history that follows it may be discarded. During instruction fine-tuning the model learns to treat the System Prompt as a persistent meta-instruction.

KV Cache and Prompt Caching

Modern LLM inference engines (vLLM, SGLang, TensorRT‑LLM) cache the key‑value tensors of the attention layers. If multiple requests share the same prefix, the cached computation can be reused, making the System Prompt a natural, stable cacheable prefix.
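
A toy model can illustrate why a shared prefix is reusable. The sketch below is not how vLLM, SGLang, or TensorRT-LLM are implemented; it only assumes that a cache entry is keyed by everything that precedes and includes a fixed-size token block. `ToyPrefixCache`, `BLOCK_SIZE`, and the whitespace "tokenizer" are all hypothetical.

```python
import hashlib
from typing import List, Set, Tuple

BLOCK_SIZE = 4  # tokens per cache block (illustrative value)


class ToyPrefixCache:
    """Counts how many full token blocks are reused across requests."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def process(self, tokens: List[str]) -> Tuple[int, int]:
        """Return (reused_blocks, computed_blocks) for one request."""
        reused = computed = 0
        running = hashlib.sha256()
        full = len(tokens) - len(tokens) % BLOCK_SIZE  # ignore a partial tail
        for start in range(0, full, BLOCK_SIZE):
            block = tokens[start:start + BLOCK_SIZE]
            # The key covers the whole prefix up to and including this block.
            running.update(("\x00".join(block) + "\x01").encode("utf-8"))
            key = running.hexdigest()
            if key in self._seen:
                reused += 1
            else:
                self._seen.add(key)
                computed += 1
        return reused, computed


system_tokens = "You are a careful coding agent editing files".split()
cache = ToyPrefixCache()
first = cache.process(system_tokens + "Fix the failing test".split())
second = cache.process(system_tokens + "Rename this helper function".split())
print(first, second)  # (0, 3) (2, 1)
```

The second request reuses the two blocks that hold the system prompt and only computes the block containing the new user text. Changing a single token early in the prefix would change every later key and erase that benefit.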

System Prompt vs. User Prompt

Think of the System Prompt as a constitution and the User Prompt as ordinary law: the constitution exists first and cannot be overridden, while laws must comply with it. The System Prompt stays unchanged throughout a session (unless explicitly updated); the User Prompt grows with each turn, and its rules only apply to the current round.

Caching strategy

Because the System Prompt is identical for every request, inference frameworks can compute its KV-Cache once and reuse it across calls. The conversation accumulated so far forms the longest common prefix with the next request, so it can be cached as well. When the context has to be compressed, the System Prompt is typically preserved and only the middle of the history is trimmed.
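
The trimming policy described here can be sketched in a few lines. This is a simplified illustration with a hypothetical `trim_history` helper: it keeps system messages and the most recent turns and drops everything in between, whereas production agents usually summarize the dropped span instead of discarding it.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_history(messages: List[Message], keep_recent: int) -> List[Message]:
    """Keep system messages plus the last `keep_recent` non-system messages."""
    system = [m for m in messages if m["role"] == "system"]
    history = [m for m in messages if m["role"] != "system"]
    dropped = max(len(history) - keep_recent, 0)
    # Slicing from `dropped` also behaves correctly when keep_recent is 0.
    return system + history[dropped:]


conversation: List[Message] = [{"role": "system", "content": "You are a coding agent."}]
for turn in range(1, 4):
    conversation.append({"role": "user", "content": f"question {turn}"})
    conversation.append({"role": "assistant", "content": f"answer {turn}"})

trimmed = trim_history(conversation, keep_recent=2)
print([m["content"] for m in trimmed])
```

Note that trimming changes the tokens that follow the system prompt, so only the system prefix stays cache-friendly after a compression step.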

Not just a configuration item

The System Prompt helps set the ceiling on what an agent can achieve. Without constraints, mainstream models exhibit systematic tendencies such as emitting code comments as output, defaulting to serial tool calls, producing verbose replies, or over-explaining instead of executing. These tendencies stem from training-data distributions. Leading coding agents (Claude Code, Cursor, Codex CLI, Gemini CLI) devote substantial System Prompt space to countering them.

Common anti‑patterns

Hard‑coding dynamic session state, timestamps, or file paths into the System Prompt, causing cache invalidation.

Over‑constraining with hundreds of fine‑grained rules, which can confuse the model and increase conflict probability.

Conflicting instructions from different sources (e.g., System Prompt says “confirm on ambiguity” while User Prompt says “execute directly”).

Using a single System Prompt for all scenarios and models.

Best‑practice separation

System Prompt should contain role definition, global rules, tool constraints, boundaries, and persistent intent.

User Prompt should carry the specific task, current context, dynamic parameters, and user preferences.
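
A minimal sketch of this separation, assuming a generic role-tagged message format: the system prompt is a constant, and everything that varies per request (working directory, time, task) goes into the user message. The names `SYSTEM_PROMPT`, `build_messages`, and `fingerprint` are hypothetical.

```python
import hashlib
from datetime import datetime, timezone
from typing import Dict, List

# Static: role, global rules, boundaries. No timestamps or paths here.
SYSTEM_PROMPT = (
    "You are a coding agent. "
    "Ask for confirmation before destructive operations."
)


def build_messages(task: str, cwd: str, now: datetime) -> List[Dict[str, str]]:
    """Put per-request state in the user message, not in the system prompt."""
    context = f"Working directory: {cwd}\nCurrent time: {now.isoformat()}"
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"{context}\n\nTask: {task}"},
    ]


def fingerprint(text: str) -> str:
    """Short content hash, handy for checking that a prefix did not change."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]


first = build_messages(
    "Fix the login bug", "/repo/app",
    datetime(2026, 5, 12, 9, 0, tzinfo=timezone.utc),
)
second = build_messages(
    "Add a unit test", "/repo/app/tests",
    datetime(2026, 5, 12, 9, 5, tzinfo=timezone.utc),
)
print(fingerprint(first[0]["content"]) == fingerprint(second[0]["content"]))  # True
```

If the timestamp were interpolated into `SYSTEM_PROMPT` instead, the two fingerprints would differ and the cached prefix could not be shared between the requests.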

System Prompt as a dynamically assembled system

In practice the System Prompt of a production agent can consist of more than a hundred independent components that are conditionally loaded based on the runtime environment.

Resident components: role definition, system rules, task execution specs, tool usage specs – always present.

Conditional components: Git snapshot (only in a repository), language settings (only when a preference is configured), MCP server commands (re-computed each round), user-defined output style, etc.

Version variants: the same component may have different versions for internal vs. external users or for CLI vs. UI modes.

Designing a System Prompt is therefore akin to writing software: it requires version management, conditional logic, and ongoing maintenance.
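
One way to picture "prompt as software" is a registry of components, each with a condition that decides whether it is loaded. The sketch below is an assumption-laden illustration, not the structure of any particular agent: `Component`, `COMPONENTS`, `assemble`, and the environment keys `git_branch` and `language` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Env = Dict[str, str]


def always(env: Env) -> bool:
    """Condition used by resident components."""
    return True


@dataclass(frozen=True)
class Component:
    name: str
    render: Callable[[Env], str]
    enabled: Callable[[Env], bool] = always


COMPONENTS: List[Component] = [
    # Resident components: always present.
    Component("role", lambda env: "You are a coding agent."),
    Component("tool_rules", lambda env: "Prefer dedicated tools over shell commands."),
    # Conditional components: loaded only when the environment calls for them.
    Component(
        "git_snapshot",
        lambda env: f"Current branch: {env['git_branch']}",
        enabled=lambda env: "git_branch" in env,
    ),
    Component(
        "language",
        lambda env: f"Respond in {env['language']}.",
        enabled=lambda env: "language" in env,
    ),
]


def assemble(env: Env) -> str:
    """Join every enabled component, in registry order."""
    return "\n\n".join(c.render(env) for c in COMPONENTS if c.enabled(env))


print(assemble({"git_branch": "main"}))
```

Keeping resident components first and conditional ones last means the stable part of the prompt stays a common prefix across environments, which fits the caching concerns discussed earlier.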

Claude Code – modular dynamic assembly

Claude Code treats each instruction as an independent software component, controlled by conditional logic. It distinguishes resident components ("task execution spec", "tool usage spec") from conditional ones (Git snapshot, language settings). The prompt includes explicit cache-boundary markers so that stable content can be reused while volatile parts stay outside the cache. The SDK offers three customization options: append to add text to the base prompt, preset: "claude_code" to select a preset, and settingSources: ["project"] to declare explicitly which settings sources are loaded.
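
How Claude Code places its boundaries internally is not shown here. As an analogous public mechanism, the Anthropic Messages API accepts `system` as a list of text blocks, and a `cache_control` entry of type `ephemeral` marks the end of a cacheable prefix. The sketch below only builds that structure without calling any API; the helper name `build_system_blocks` and the sample texts are hypothetical.

```python
from typing import Any, Dict, List


def build_system_blocks(stable: str, volatile: str) -> List[Dict[str, Any]]:
    """Put the cache breakpoint after the stable text, before the volatile text."""
    blocks: List[Dict[str, Any]] = [
        {
            "type": "text",
            "text": stable,
            # Everything up to and including this block is eligible for caching.
            "cache_control": {"type": "ephemeral"},
        }
    ]
    if volatile:
        # Volatile content follows the breakpoint, so it never invalidates the prefix.
        blocks.append({"type": "text", "text": volatile})
    return blocks


system_blocks = build_system_blocks(
    stable="You are a coding agent. Follow the tool usage rules.",
    volatile="Git status: 2 files modified on branch main.",
)
print(len(system_blocks))
```

The resulting list would be passed as the `system` field of a request; the ordering matters more than the exact texts, since only content before the breakpoint benefits from reuse.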

Cursor – model‑aware multi‑version prompts

Cursor maintains separate instruction sets for each supported model, because different models may react very differently to the same prompt (e.g., a model trained on shell workflows prefers grep over a dedicated search tool). Project rules are stored under .cursor/rules/, with each markdown file representing a rule set. A globs front-matter field limits a rule's scope to specific file patterns, which gives finer granularity than a single System Prompt file.
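
Pattern-scoped rules can be illustrated with a small matcher. This is a sketch under stated assumptions, not Cursor's implementation: the `Rule` type, its `patterns` field, and the sample rules are hypothetical, and Python's `fnmatchcase` is used as a stand-in for whatever glob engine a real tool applies.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    patterns: Tuple[str, ...]  # an empty tuple means "always applies"
    body: str


RULES = [
    Rule("general", (), "Keep functions small."),
    Rule("frontend", ("web/*.tsx", "web/*.ts"), "Use functional React components."),
    Rule("migrations", ("db/migrations/*.sql",), "Never edit an applied migration."),
]


def rules_for(path: str, rules: List[Rule]) -> List[str]:
    """Return the names of the rules whose patterns match `path`."""
    # Note: in fnmatch-style matching, "*" also matches "/" characters.
    return [
        rule.name
        for rule in rules
        if not rule.patterns or any(fnmatchcase(path, p) for p in rule.patterns)
    ]


print(rules_for("web/components/Button.tsx", RULES))  # ['general', 'frontend']
print(rules_for("README.md", RULES))  # ['general']
```

Only the rule bodies that match the files in play need to be added to the context, so unrelated instructions do not compete for the model's attention.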

Codex CLI – hierarchical inheritance

Codex CLI extracts persistent directives from the System Prompt into an AGENTS.md hierarchy:

~/.codex/AGENTS.md          # global defaults, inherited by all projects
├── repo-root/AGENTS.md    # project‑level overrides
│   └── services/payments/AGENTS.override.md  # sub‑directory highest priority

At runtime the files are merged from the root down to the current directory, with files nearer the working directory taking precedence. Project-level AGENTS.md files live in the Git repository alongside the code, which gives them a clear change history, changelogs, and peer review.
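
The root-to-current-directory merge can be sketched as a pure function over an in-memory file map. This is an illustration rather than Codex CLI's code: the function name `collect_instructions` and the sample contents are hypothetical, and it assumes that an AGENTS.override.md replaces the AGENTS.md in the same directory.

```python
from pathlib import PurePosixPath
from typing import Dict, List


def collect_instructions(
    files: Dict[str, str], global_file: str, root: str, cwd: str
) -> List[str]:
    """Return instruction texts ordered from global, to root, down to cwd."""
    root_path = PurePosixPath(root)
    # relative_to raises ValueError if cwd is not inside root.
    relative = PurePosixPath(cwd).relative_to(root_path)
    directories = [root_path]
    for part in relative.parts:
        directories.append(directories[-1] / part)

    collected = []
    if global_file in files:
        collected.append(files[global_file])
    for directory in directories:
        # Assumption: at most one file per directory, override first.
        for name in ("AGENTS.override.md", "AGENTS.md"):
            text = files.get(str(directory / name))
            if text is not None:
                collected.append(text)
                break
    return collected


files = {
    "/home/dev/.codex/AGENTS.md": "Use conventional commits.",
    "/repo/AGENTS.md": "Run the linter before committing.",
    "/repo/services/payments/AGENTS.md": "Shadowed by the override file.",
    "/repo/services/payments/AGENTS.override.md": "Never log card numbers.",
}
merged = collect_instructions(
    files, "/home/dev/.codex/AGENTS.md", "/repo", "/repo/services/payments"
)
print("\n".join(merged))
```

Because later entries come from directories closer to the working directory, concatenating the list in order lets the most specific guidance appear last.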

Gemini CLI – fully replaceable + dynamic variable injection

Gemini CLI exposes the environment variable GEMINI_SYSTEM_MD; users can point it at a markdown file that fully replaces the default System Prompt. The file may contain placeholders such as ${AgentSkills}, ${SubAgents}, and ${AvailableTools}, which are substituted at runtime, making the prompt auditable and forkable.
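
The placeholder mechanism can be imitated with Python's standard `string.Template`, which happens to use the same `${name}` syntax. This is only an imitation of the idea, not Gemini CLI's substitution code; the helper `render_system_prompt`, the template text, and the tool and skill names are hypothetical.

```python
from string import Template
from typing import Dict


def render_system_prompt(template_text: str, values: Dict[str, str]) -> str:
    """Fill ${...} placeholders; unknown placeholders are left untouched."""
    # safe_substitute does not raise on missing keys, unlike substitute.
    return Template(template_text).safe_substitute(values)


template_text = (
    "# Tools\n${AvailableTools}\n\n"
    "# Skills\n${AgentSkills}\n\n"
    "# Sub-agents\n${SubAgents}\n"
)
rendered = render_system_prompt(
    template_text,
    {
        "AvailableTools": "- read_file\n- run_shell",
        "AgentSkills": "- code-review",
    },
)
print(rendered)
```

Leaving an unfilled placeholder visible in the output, as happens with `${SubAgents}` here, makes a missing value easy to spot when auditing the final prompt.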

Conclusion

Before starting agent development, clarify the core task, boundaries, desired communication style, and high‑risk operations that need confirmation. Decide which elements belong in the static System Prompt (role, behavior rules, tool constraints, communication style, permission boundaries) and which should be injected dynamically (working directory, platform info, Git status, server lists, user preferences, session‑specific state).

Adopt version‑controlled, layered management of System Prompts, record changelogs, and subject every change to review. Test changes with regression suites, A/B comparisons, and model‑migration checks to ensure that updates do not introduce regressions or unexpected path changes. Remember that not all models support a separate System Prompt; some may merge it with the User Prompt or reject it entirely.
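
For models that do not accept a separate system role, one common fallback is to fold the system text into the first user message. The sketch below shows that adaptation under the assumption of a generic role-tagged message list; `adapt_messages` and the `supports_system` flag are hypothetical names, and a migration check could run the same regression suite against both shapes.

```python
from typing import Dict, List

Message = Dict[str, str]


def adapt_messages(messages: List[Message], supports_system: bool) -> List[Message]:
    """Return messages unchanged, or with system text merged into the first user turn."""
    if supports_system:
        return [dict(m) for m in messages]
    system_text = "\n\n".join(m["content"] for m in messages if m["role"] == "system")
    rest = [dict(m) for m in messages if m["role"] != "system"]  # copies, no mutation
    if not system_text:
        return rest
    if rest and rest[0]["role"] == "user":
        rest[0]["content"] = f"{system_text}\n\n{rest[0]['content']}"
    else:
        rest.insert(0, {"role": "user", "content": system_text})
    return rest


original = [
    {"role": "system", "content": "You are a coding agent."},
    {"role": "user", "content": "Fix the failing test."},
]
adapted = adapt_messages(original, supports_system=False)
print(adapted)
```

The merged form loses the role distinction, so instructions that relied on system-level priority should be re-tested after such a migration.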

References

https://www.dbreunig.com/2026/04/04/how-claude-code-builds-a-system-prompt.html

https://www.dbreunig.com/2026/02/10/system-prompts-define-the-agent-as-much-as-the-model.html

https://github.com/Piebald-AI/claude-code-system-prompts

https://platform.claude.com/docs/en/agent-sdk/modifying-system-prompts

https://cursor.com/blog/agent-best-practices

https://developers.openai.com/cookbook/examples/gpt-5/codex_prompting_guide

https://developers.openai.com/codex/guides/agents-md

https://geminicli.com/docs/cli/system-prompt/
