The Complete 2026 Guide to Codex Best Practices
This guide covers Codex best practices for 2026: AGENTS.md configuration, staged workflows, debugging tactics, context management, prompt engineering, sub‑agent usage, security safeguards, common pitfalls, typical scenarios, installation steps, and a comparison of Codex's web, CLI, and IDE forms. It draws on official docs and community discussions.
AGENTS.md Configuration Principles
Core principle: keep short and precise
Aim for under 100 lines; treat 300 lines as a hard cap.
The system prompt already consumes a large share of the context, so AGENTS.md should contain only information Codex would otherwise miss.
Do not write anything Codex can infer from the code (e.g., that export default function marks a React component; Codex already knows this).
If many rules are needed, split into sub‑directory AGENTS.md files and load by scope.
Tag or number key rules to avoid being ignored.
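The length guideline above is easy to enforce mechanically. The following is a minimal sketch, assuming the 100‑line target and 300‑line cap stated in this section; the function name and the "ok"/"warn"/"fail" labels are illustrative and not part of Codex.

```python
SOFT_LIMIT = 100  # target length from the guideline above
HARD_LIMIT = 300  # hard cap from the guideline above


def check_agents_md_length(text: str) -> str:
    """Classify an AGENTS.md body as 'ok', 'warn', or 'fail' by line count."""
    line_count = len(text.splitlines())
    if line_count > HARD_LIMIT:
        return "fail"
    if line_count > SOFT_LIMIT:
        return "warn"
    return "ok"
```

A check like this could run in CI or a pre‑commit hook, so that a "warn" result prompts splitting rules into sub‑directory AGENTS.md files before the cap is reached.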
Scope rules
Direct user command > deeper nested AGENTS.md > shallow AGENTS.md
Each AGENTS.md applies to the entire directory tree containing it.
Deeper nested files have higher priority.
Direct system/developer/user commands have the highest priority.
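The precedence among AGENTS.md files can be illustrated with a small sketch. It assumes the rule as stated above (every AGENTS.md in an ancestor directory applies, and deeper files win); the function name and the agents_dirs input are hypothetical, and this is not how Codex itself is implemented.

```python
from pathlib import PurePosixPath


def applicable_agents_files(target: str, agents_dirs: set[str]) -> list[str]:
    """Return AGENTS.md paths that apply to `target`, lowest priority first.

    `agents_dirs` holds repo-relative directories that contain an AGENTS.md
    ("." is the repo root). A deeper file comes later in the list, so its
    rules override those of a shallower one.
    """
    # Ancestor directories of the target, ordered from repo root downwards.
    ancestors = list(PurePosixPath(target).parents)[::-1]
    result = []
    for directory in ancestors:
        if str(directory) in agents_dirs:
            result.append(str(directory / "AGENTS.md"))
    return result
```

For a file at src/api/user.ts in a repo with AGENTS.md at the root and in src/api, the result lists the root file first and the src/api file last. A direct user instruction would still override both.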
Example AGENTS.md structure
# Project Name
## Workflow
- Run `npm test` after each code change
- Use Conventional Commits (feat:, fix:, refactor:, docs:)
- After completion, create PR with `gh pr create`
## Tech Stack
- Node.js 18+, Express 4.x, PostgreSQL 16
- Tests: Jest + React Testing Library
- Auth: JWT + bcrypt
## Code Guidelines
- Component files must not exceed 300 lines; split them if they do
- Prohibit `any` type
- All public APIs must have JSDoc comments
## Mandatory Checks
1. `npm run lint` – Linter fixes
2. `npm test` – Run relevant tests
3. `npm run typecheck` – Type checking
Programmatic checks
If AGENTS.md contains programmatic check commands, Codex must run all checks and verify they pass:
## Mandatory checks
1. `just fmt` – code formatting
2. `just fix -p <changed-crate>` – Linter fixes
3. `cargo test -p <changed-project>` – Run relevant tests
4. `just bazel-lock-check` – Dependency lock file check
Workflow Best Practices
Staged workflow
Understand codebase → modify
Plan first → implement
Generate → verify
Do not compress all steps into a single large prompt
Small tasks
3‑5 minute tasks can be expressed in one sentence.
Complex workflows are for multi‑file, multi‑step large tasks.
Simple renames can be a one‑liner.
Let Codex understand the codebase first
codex exec "Please explain the overall structure of the src/ directory, including:
1. Core modules
2. Dependency relationships
3. Where the entry file is"
Parallel task strategy
Assign independent tasks to different Codex instances.
Use background processing to keep workflow continuous.
Avoid inter‑task dependencies that cause blocking.
Community suggestion: assign well‑scoped tasks to multiple agents to run concurrently.
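One way to run independent tasks concurrently is to launch one non‑interactive codex exec process per task. The sketch below assumes the tasks really are independent and that the codex CLI is on the PATH; run_in_parallel and its parameters are illustrative names, and the runner is injectable so the scheduling logic can be exercised without calling Codex.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_codex_task(prompt: str) -> int:
    """Run one non-interactive Codex task and return its exit code."""
    return subprocess.run(["codex", "exec", prompt]).returncode


def run_in_parallel(prompts: list[str],
                    runner: Callable[[str], int] = run_codex_task,
                    max_workers: int = 4) -> dict[str, int]:
    """Run independent prompts concurrently; map each prompt to its exit code."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so codes line up with prompts.
        codes = list(pool.map(runner, prompts))
    return dict(zip(prompts, codes))
```

Because several instances would edit the same working tree, this only makes sense when the tasks touch disjoint files or each runs in its own checkout.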
Debugging and Troubleshooting
Paste bug, say “fix”
Paste error information and issue a single word: “fix”.
Do not guide how to fix; do not guess the cause or prescribe a solution.
Codex’s debugging ability is strong; the more you let it handle, the better.
Reported success rate for direct fixes: over 80%.
Two failures = restart
If two fix attempts on the same issue fail, start a new session and retry.
Context pollution reduces performance.
Require rewrite of mediocre solutions
When Codex returns a working but inelegant solution, do not patch.
Ask for an elegant implementation instead of a quick fix.
Context Management
Long‑session performance degradation
Codex Web cloud agents have limited context windows.
After long sessions, context is repeatedly compressed, reducing AI understanding.
Community proposal (#22642): auto_new_session_after_compactions = N – automatically refresh after N compressions.
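The proposed behavior amounts to a counter with a threshold. The following sketch only illustrates that logic; CompactionTracker is a hypothetical class, and auto_new_session_after_compactions is a community proposal rather than an existing Codex setting.

```python
class CompactionTracker:
    """Counts context compactions and signals when a fresh session is due."""

    def __init__(self, max_compactions: int) -> None:
        if max_compactions < 1:
            raise ValueError("max_compactions must be at least 1")
        self.max_compactions = max_compactions  # the N in the proposal
        self.count = 0

    def record_compaction(self) -> bool:
        """Record one compaction; return True when a new session should start."""
        self.count += 1
        return self.count >= self.max_compactions

    def reset(self) -> None:
        """Call after starting a new session."""
        self.count = 0
```

Until something like this exists in the tool, the same rule can be applied by hand: note each compaction and open a new session once the chosen limit is hit.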
Practical advice
Break complex tasks into phases, start a new session for each phase.
Write key decisions and context into AGENTS.md instead of relying on conversation memory.
New sessions carry a summary of key context, not the full long session.
Session refresh strategy
Session 1: Understand codebase structure → write AGENTS.md
Session 2: Implement feature based on AGENTS.md
Session 3: Independent verification
Prompt Engineering
Golden structure of effective prompts
[Task Type] Please fix/implement/refactor...
[Problem Description]
- Specific, clear problem description
- Current behavior vs expected behavior
[Constraints]
- Any limits or requirements
- Files/modules not to modify
[Reference Information]
- Relevant file paths
- Error messages or logs
- Reference code snippets
Prompt best practices
1. Provide concrete paths and line numbers
❌ "Fix auth module bug"
✅ "Fix Token verification logic in src/auth/login.py lines 45‑62"
2. State expected output format
"Generate a PR that includes:
- Code changes
- At least 3 unit tests
- Updated README documentation"
3. Specify test command to run
"After changes, run:
npm test -- --testPathPattern=auth
npm run lint"
4. Include error messages or logs
"Current error:
TypeError: Cannot read property 'token' of undefined
at validateToken (src/auth/middleware.js:23:15)
Please locate and fix."
Subagents
Enable subagents
Adding “use subagents” in the prompt makes Codex split tasks to multiple sub‑agents for parallel processing.
Suitable for code review and large‑scale refactoring.
Specialized sub‑agent > generic mega‑agent
Create function‑specific sub‑agents (e.g., “frontend component agent”).
Prefer these over generic agents (e.g., “QA agent”).
More specific function → more precise context → better results.
Independent context windows
Research, verification, review are isolated in separate contexts.
This avoids bias and keeps the main context free of pollution.
Multi‑agent collaboration
Community discussion (#22749) suggests the following for complex cross‑project tasks:
Each instance handles an independent sub‑task.
Coordinate via shared files like AGENTS.md.
Use CI/CD pipelines to chain outputs of multiple agents.
Skills and MCP Plugins
MCP server (Model Context Protocol)
MCP extends Codex capabilities, allowing external tool integration.
Configuration example:
{
  "mcpServers": {
    "xlsx-for-ai": {
      "command": "xlsx-for-ai-mcp"
    }
  }
}
Skill folder structure
skills/
  api-design/
    SKILL.md        # core rules and index
    references/     # corpora, references
    scripts/        # helper scripts
    examples/       # example code
The main file (SKILL.md) contains the core rules and an index.
Corpora and checklists go into references/.
Progressive disclosure: Codex reads sub‑directory content only when needed.
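Progressive disclosure is essentially lazy loading. The sketch below assumes the folder layout shown above; the Skill class is a hypothetical illustration of the idea, not Codex's actual loader.

```python
from pathlib import Path


class Skill:
    """Loads SKILL.md eagerly and everything else only on request."""

    def __init__(self, root: Path) -> None:
        self.root = root
        # The core rules and index are always read up front.
        self.core = (root / "SKILL.md").read_text(encoding="utf-8")
        self._cache: dict[str, str] = {}

    def reference(self, name: str) -> str:
        """Read references/<name> the first time it is asked for."""
        if name not in self._cache:
            path = self.root / "references" / name
            self._cache[name] = path.read_text(encoding="utf-8")
        return self._cache[name]
```

The practical consequence for authors is that SKILL.md should stay short and point to files under references/, since only the main file is paid for in context on every use.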
Gotchas section
Record failure patterns each time Codex errs; over time this becomes the highest signal‑to‑noise content.
# SKILL.md
## Core Rules
...
## Gotchas
### 2026-05-10: API pagination parameter missing
- **Problem**: Pagination omitted when generating API.
- **Effect**: Returns all data, causing performance issues.
- **Fix**: Add pagination rule to SKILL.md.
- **Prevention**: Add “pagination” check to checklist.
### 2026-05-12: Insufficient test coverage
- **Problem**: Only happy path tested, edge cases ignored.
- **Effect**: Tests pass but runtime errors occur.
- **Fix**: Add boundary tests.
- **Prevention**: Add “edge case coverage” to checklist.
Gotchas maintenance principles
Record every error immediately.
Include four elements: problem, manifestation, fix, prevention.
Weekly review to spot recurring patterns.
Turn frequent Gotchas into formal rules (≥3 occurrences).
Archive resolved items after 30 days of inactivity.
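The maintenance rules above can be expressed as a small data model. This is a sketch under the thresholds stated in this section (promotion at 3 occurrences, archiving after 30 days of inactivity); the Gotcha class and its field names are illustrative only.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

PROMOTE_AT = 3           # occurrences before a gotcha becomes a formal rule
ARCHIVE_AFTER_DAYS = 30  # days without recurrence before archiving


@dataclass
class Gotcha:
    # The four required elements of every entry.
    problem: str
    manifestation: str
    fix: str
    prevention: str
    occurrences: list[date] = field(default_factory=list)
    resolved: bool = False

    def should_promote(self) -> bool:
        """True once the gotcha has recurred often enough to become a rule."""
        return len(self.occurrences) >= PROMOTE_AT

    def should_archive(self, today: date) -> bool:
        """True for resolved entries with no occurrence in the last 30 days."""
        if not self.resolved or not self.occurrences:
            return False
        return today - max(self.occurrences) >= timedelta(days=ARCHIVE_AFTER_DAYS)
```

A weekly review then reduces to iterating over the entries and acting on the two predicates.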
Memory and Persistence
Codex memory mechanism
Codex has no long‑term memory across sessions; each task runs in an isolated sandbox. Memory‑like behavior can be achieved by:
Methods to implement memory
Method 1: AGENTS.md as project memory – write knowledge, decisions, conventions into AGENTS.md; Codex reads it each run.
Method 2: Code comments and documentation – keep clear comments, up‑to‑date README and architecture docs.
Method 3: Local automated memory (community solution) – “Sentinel‑AI” pattern where a local agent checks a playbook before escalating to Codex.
Reported benefits of Method 3:
≈80% of tasks avoid cloud API calls.
Cost drops from several dollars/month to $15/month.
Learning‑compound effect: longer runtime reduces cloud calls.
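The playbook‑first pattern described in Method 3 can be sketched as a cache in front of an escalation call. Everything here is hypothetical: the Playbook class, the idea of keying entries by a problem signature, and the escalate callback standing in for a call to Codex are assumptions made for illustration, not details from the community project.

```python
from typing import Callable


class Playbook:
    """Maps known problem signatures to locally stored answers."""

    def __init__(self) -> None:
        self._entries: dict[str, str] = {}
        self.local_hits = 0
        self.escalations = 0

    def resolve(self, signature: str, escalate: Callable[[str], str]) -> str:
        """Answer from the playbook if possible; otherwise escalate and learn."""
        if signature in self._entries:
            self.local_hits += 1
            return self._entries[signature]
        self.escalations += 1
        answer = escalate(signature)
        # Store the answer so the same problem is handled locally next time.
        self._entries[signature] = answer
        return answer
```

The compounding effect mentioned above shows up in the counters: as the playbook fills, local_hits grows relative to escalations.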
Structured memory file
# .codex/memory.md
memory:
  append_only: true          # only append, never overwrite
  format: structured_markdown
  max_size: 10KB             # auto‑clean oldest entries
  share_across_agents: true
Permissions and Security
Execution environment security
Codex Web tasks run in isolated cloud containers.
Internet access is disabled by default during task execution.
The agent can only access code and pre‑configured dependencies from the GitHub repo.
With the default settings it cannot reach external websites, APIs, or services.
Prompt injection risk
Community discussion (#6162) notes that hidden directives in files can cause unintended actions.
Defensive measures
Trust domain separation: treat user input (high trust), retrieved docs/web (low trust), tool results (medium trust), generated parameters (must verify) differently.
Strategy mapping: retrieved text can provide information but cannot authorize shell/file/network actions.
Manual review of all Codex‑generated code changes.
Least‑privilege: do not grant unnecessary permissions.
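Trust domain separation and strategy mapping can be written down as an explicit policy function. The sketch below follows the trust levels listed above. Two points are assumptions rather than statements from the source: tool results are treated as unable to authorize privileged actions, and generated parameters are allowed only after an explicit verification step. All names are illustrative.

```python
from enum import Enum


class Source(Enum):
    USER = "user"                # high trust
    TOOL_RESULT = "tool_result"  # medium trust
    RETRIEVED = "retrieved"      # low trust: docs, web pages, file contents
    GENERATED = "generated"      # model-generated parameters: must verify


# Actions that change state or reach outside the sandbox.
PRIVILEGED_ACTIONS = {"shell", "file_write", "network"}


def is_authorized(action: str, requested_by: Source, verified: bool = False) -> bool:
    """Decide whether content from `requested_by` may trigger `action`."""
    if action not in PRIVILEGED_ACTIONS:
        return True  # informational use is always allowed
    if requested_by is Source.USER:
        return True
    if requested_by is Source.GENERATED:
        return verified  # assumption: allowed only after explicit verification
    # Retrieved text and (by assumption) tool results can inform, not authorize.
    return False
```

Under this policy, a hidden directive in a retrieved file can never run a shell command on its own, which is the failure mode described in the prompt injection discussion.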
User responsibility checklist
✅ Manually review and verify all agent‑generated code.
✅ Never store secrets (keys, passwords, tokens) in the codebase.
✅ Regularly update security rules in AGENTS.md.
✅ Carefully inspect diffs before committing.
Common Pitfalls (Gotchas)
Pitfall 1: Giving up too early
Split task into smaller, isolated units.
Even if humans think tasks can be grouped, Codex needs separation.
Example: two similar tables → two PRs, each 10 minutes.
Pitfall 2: Context compression makes Codex “dumb”
Stage complex tasks, start new session per stage.
Write key decisions into AGENTS.md.
When needed, start a new session and run git reset --hard (this discards all uncommitted changes, so save anything worth keeping first).
Pitfall 3: Test issues
Adopt TDD: write tests first.
Thoroughly review generated tests.
Apply stricter review to test changes than to code changes.
Pitfall 4: Forgetting to compile
Specify compilation steps explicitly in AGENTS.md.
Force compilation before tests.
Mixed compiled/interpreted languages are especially error‑prone.
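Forcing compilation before tests means running the steps in a fixed order and stopping at the first failure. A minimal sketch follows, assuming each step is an argv list; run_steps is an illustrative helper, and the runner is injectable so the ordering logic can be checked without invoking real tools.

```python
import subprocess
from typing import Callable, Sequence


def run_command(argv: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(argv)).returncode


def run_steps(steps: Sequence[Sequence[str]],
              run: Callable[[Sequence[str]], int] = run_command) -> bool:
    """Run steps in order; stop and return False at the first non-zero exit."""
    for argv in steps:
        if run(argv) != 0:
            return False
    return True
```

For a Node project this might be called as run_steps([["npm", "run", "build"], ["npm", "test"]]), where the build script name is an assumption about the project. The same ordering should be spelled out in AGENTS.md so Codex follows it too.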
Pitfall 5: Working directory chaos
Run git status after each completion.
Manually perform Git operations (branch, commit, push).
Treat Codex as a tool that modifies files; do not rely on it to leave the Git state the way you want.
Pitfall 6: Rewrite without deletion
Review diff to confirm deletions.
Explicitly instruct “delete old implementation”.
Check file list to ensure cleanup.
Pitfall 7: Prompt injection
Apply trust domain separation.
Manual review of all changes.
Never store sensitive data in the repo.
Pitfall 8: Long‑session quality drop
Break complex tasks into phases, start new session per phase.
Write key context into AGENTS.md.
Typical Use Cases and Examples
Scenario 1: Daily bug fix
codex exec "Fix pagination bug in src/api/user.ts.
Current behavior: page 2 returns empty.
Expected: page 2 returns correct user list.
Reference: Issue #1234.
After change run: npm test -- user.test.ts"
Scenario 2: Bulk refactor
codex exec "Refactor all class components under src/components/ to function components.
Requirements:
1. Use React Hooks instead of lifecycle methods.
2. Preserve existing functionality.
3. Update related test files.
4. Run npm test to verify."
Scenario 3: Test coverage
codex exec "Add unit tests for all utility functions in src/utils/.
Requirements:
1. At least 2 test cases per function.
2. Cover normal and edge cases.
3. Test files named *.test.ts.
4. Verify with npm test."
Scenario 4: Codebase Q&A
codex exec "Explain the authentication flow in src/auth/ module.
Include:
1. User login process
2. Token generation and verification
3. Permission checking mechanism
4. Relevant configuration files"
Installation and Quick Start
Installation
# Option 1: npm install
npm install -g @openai/codex
# Option 2: Homebrew (macOS)
brew install --cask codex
# Launch interactive TUI
codex
# Non‑interactive one‑liner
codex exec "Fix login verification vulnerability in src/auth.py"
System requirements
macOS 12+ / Ubuntu 20.04+ / Windows 11 (WSL2)
4 GB RAM (8 GB recommended)
Git 2.23+ (recommended)
Login with ChatGPT account
codex
# Choose “Sign in with ChatGPT”
# Recommended to use Plus/Pro/Enterprise plan
Debugging and logs
# View detailed logs
tail -F ~/.codex/log/codex-tui.log
# Custom log level
RUST_LOG=codex_core=debug codex
# Custom log directory
codex -c log_dir=./.codex-log
Comparison of Codex Forms
Codex Web (cloud agent): runs in a cloud sandbox, pre‑loads your GitHub repo, supports parallel multi‑task processing. Entry: chatgpt.com/codex. Suitable when a cloud environment and parallelism are needed.
Codex CLI (terminal agent): open‑source, runs locally, lightweight, supports interactive TUI and non‑interactive exec mode. Install via npm i -g @openai/codex. Ideal for local development, real‑time interaction, CI/CD integration.
Codex IDE plugin: integrates with VS Code, Cursor, Windsurf, etc. Entry via the official IDE extension marketplace. Best for seamless in‑editor usage.
Core Principles Summary
Context is a precious resource
Keep it concise, compress promptly, reset when polluted.
Stage complex tasks, start new session per stage.
Write key decisions into AGENTS.md.
System constraints > prompt constraints
Use AGENTS.md configuration instead of “I want Codex to remember”.
Tag or number critical rules.
Programmatic checks must run and be verified.
Divide and conquer
Sub‑agents, staged workflow, spec‑implementation separation.
Specialized sub‑agents > generic mega‑agent.
Simple tasks get one‑liners; complex tasks need full workflow.
Avoid over‑engineering
3‑5 minute tasks: one sentence.
Complex workflow only for multi‑file, multi‑step large tasks.
Variable rename: one sentence.
Continuous improvement
Record every error as Gotchas.
Regularly review to spot recurring patterns.
Convert frequent Gotchas into formal rules.
Security first
Trust domain separation.
Manual review of all changes.
Least‑privilege permissions.
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
