The Complete 2026 Guide to Codex Best Practices

An exhaustive 2026 guide walks through Codex best‑practice configuration, staged workflows, debugging tactics, context management, prompt engineering, sub‑agent usage, security safeguards, common pitfalls, typical scenarios, installation steps, and a comparison of Codex’s web, CLI, and IDE forms, all backed by official docs and community insights.

Code Ape Tech Column

AGENTS.md Configuration Principles

Core principle: keep it short and precise

Aim for 100 lines or fewer, with a hard cap of 300 lines.

The system prompt already consumes much of the context, so AGENTS.md should contain only information Codex would otherwise miss.

Do not write content that can be inferred from the code (e.g., Codex already knows that export default function marks a React component).

If many rules are needed, split into sub‑directory AGENTS.md files and load by scope.

Tag or number key rules so they are less likely to be ignored.
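The line budget above is easy to check mechanically. A minimal Python sketch, assuming the 100‑line target and 300‑line cap recommended in this guide; the function name check_agents_md_length is hypothetical and is not part of Codex:

```python
def check_agents_md_length(text: str, target: int = 100, hard_cap: int = 300) -> str:
    """Classify an AGENTS.md body against this guide's line budget.

    The thresholds are recommendations from the guide, not limits
    enforced by Codex itself.
    """
    line_count = len(text.splitlines())
    if line_count > hard_cap:
        return "split"  # over the hard cap: move rules into sub-directory files
    if line_count > target:
        return "trim"   # over the target: prune rules Codex can infer from code
    return "ok"
```

A check like this could run in CI so the file does not grow unnoticed.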

Scope rules

Direct user command > deeper nested AGENTS.md > shallow AGENTS.md

Each AGENTS.md applies to the entire directory tree rooted at the folder that contains it.

Deeper nested files have higher priority.

Direct system/developer/user commands have the highest priority.
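The precedence order can be pictured as a walk from the repository root down to the file being edited. The following Python sketch illustrates that ordering only; it is not how Codex itself loads the files, and applicable_agents_files is a hypothetical helper name:

```python
from pathlib import Path


def applicable_agents_files(repo_root: Path, target: Path) -> list[Path]:
    """Return the AGENTS.md files that cover `target`, lowest priority first.

    The walk goes from the repo root down to the target's directory, so the
    last entry is the most deeply nested file and wins on conflicts.
    Direct user instructions still override everything returned here.
    """
    repo_root = repo_root.resolve()
    directory = target.resolve().parent
    directories = [repo_root]
    for part in directory.relative_to(repo_root).parts:
        directories.append(directories[-1] / part)
    return [d / "AGENTS.md" for d in directories if (d / "AGENTS.md").is_file()]
```

For a file under src/api/, a root AGENTS.md and a src/api/AGENTS.md would both be returned, with the nested one last.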

Example AGENTS.md structure

# Project Name

## Workflow
- Run `npm test` after each code change
- Use Conventional Commits (feat:, fix:, refactor:, docs:)
- After completion, create PR with `gh pr create`

## Tech Stack
- Node.js 18+, Express 4.x, PostgreSQL 16
- Tests: Jest + React Testing Library
- Auth: JWT + bcrypt

## Code Guidelines
- Keep component files under 300 lines; split any file that exceeds this
- Prohibit `any` type
- All public APIs must have JSDoc comments

## Mandatory Checks
1. `npm run lint` – Linter fixes
2. `npm test` – Run relevant tests
3. `npm run typecheck` – Type checking

Programmatic checks

If AGENTS.md contains programmatic check commands, Codex must run all checks and verify they pass:

## Mandatory checks
1. `just fmt` – code formatting
2. `just fix -p <changed-crate>` – Linter fixes
3. `cargo test -p <changed-project>` – Run relevant tests
4. `just bazel-lock-check` – Dependency lock file check
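The same "run everything, then verify" rule can be applied on the human side before accepting a change. A minimal sketch, assuming each check is an ordinary command that signals failure with a non‑zero exit code; run_checks and all_passed are hypothetical names:

```python
import subprocess


def run_checks(commands: list[list[str]]) -> list[tuple[str, bool]]:
    """Run every check command and report (command, passed) for each one.

    All commands run even if an earlier one fails, so the report is complete.
    A command that is not installed raises FileNotFoundError.
    """
    results = []
    for argv in commands:
        completed = subprocess.run(argv, capture_output=True, text=True)
        results.append((" ".join(argv), completed.returncode == 0))
    return results


def all_passed(results: list[tuple[str, bool]]) -> bool:
    """True only when every check exited with status 0."""
    return all(passed for _, passed in results)
```

For the npm project above, the call would look like run_checks([["npm", "run", "lint"], ["npm", "test"], ["npm", "run", "typecheck"]]).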

Workflow Best Practices

Staged workflow

Understand codebase → modify

Plan first → implement

Generate → verify

Do not compress all steps into a single large prompt

Small tasks

3‑5 minute tasks can be expressed in one sentence.

Complex workflows are for multi‑file, multi‑step large tasks.

Simple renames can be a one‑liner.

Let Codex understand the codebase first

codex exec "Please explain the overall structure of the src/ directory, including:
1. Core modules
2. Dependency relationships
3. Where the entry file is"

Parallel task strategy

Assign independent tasks to different Codex instances.

Use background processing to keep workflow continuous.

Avoid inter‑task dependencies that cause blocking.

Community suggestion: assign well‑scoped tasks to multiple agents to run concurrently.
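One way to picture the parallel strategy is a small dispatcher that fans independent prompts out to separate workers. This is a sketch under assumptions: run_one is a caller‑supplied function (hypothetical here) that sends one prompt to one Codex instance, for example by invoking codex exec in a subprocess.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_independent_tasks(
    prompts: list[str],
    run_one: Callable[[str], str],
    max_workers: int = 4,
) -> dict[str, str]:
    """Run independent prompts concurrently and map each prompt to its result.

    Only use this for tasks with no dependencies on each other; a task that
    needs another task's output would block or race.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_one, prompts))
    return dict(zip(prompts, results))
```

Keeping the runner injectable also makes the dispatcher easy to test without calling Codex at all.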

Debugging and Troubleshooting

Paste bug, say “fix”

Paste error information and issue a single word: “fix”.

Do not steer the fix: do not guess at the cause or prescribe a solution.

Codex’s debugging ability is strong; the more you let it handle, the better.

The author reports a success rate above 80% for this direct approach.

Two failures = restart

If two fix attempts on the same issue have failed, start a new session and retry.

Context pollution reduces performance.
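The two‑failure rule is simple enough to track explicitly. A minimal sketch; FixAttemptTracker is a hypothetical helper for your own tooling, not a Codex feature:

```python
from collections import Counter


class FixAttemptTracker:
    """Count failed fix attempts per issue and flag when to start a new session."""

    def __init__(self, max_failures: int = 2) -> None:
        self.max_failures = max_failures
        self._failures = Counter()

    def record_failure(self, issue_id: str) -> bool:
        """Record a failed fix; return True when a fresh session is advised."""
        self._failures[issue_id] += 1
        return self._failures[issue_id] >= self.max_failures
```

The first failure on an issue returns False; the second returns True, which is the signal to restart with a clean context.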

Require rewrite of mediocre solutions

When Codex returns a working but inelegant solution, do not patch.

Ask for an elegant implementation instead of a quick fix.

Context Management

Long‑session performance degradation

Codex Web cloud agents have limited context windows.

In long sessions the context is compacted repeatedly, and each compaction loses detail the model needs.

Community proposal (#22642): an auto_new_session_after_compactions = N setting that would automatically start a new session after N compactions.

Practical advice

Break complex tasks into phases, start a new session for each phase.

Write key decisions and context into AGENTS.md instead of relying on conversation memory.

Start each new session with a summary of the key context rather than the full history of the long session.

Session refresh strategy

Session 1: Understand codebase structure → write AGENTS.md
Session 2: Implement feature based on AGENTS.md
Session 3: Independent verification

Prompt Engineering

Golden structure of effective prompts

[Task Type] Please fix/implement/refactor...

[Problem Description]
- Specific, clear problem description
- Current behavior vs expected behavior

[Constraints]
- Any limits or requirements
- Files/modules not to modify

[Reference Information]
- Relevant file paths
- Error messages or logs
- Reference code snippets
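If prompts are assembled by scripts, the structure above can be generated from plain lists. A minimal sketch; build_prompt is a hypothetical helper, and the section titles simply mirror the template:

```python
def build_prompt(
    task: str,
    problem: list[str],
    constraints: list[str],
    references: list[str],
) -> str:
    """Assemble a prompt in the four-part structure shown above."""
    sections = [
        ("Problem Description", problem),
        ("Constraints", constraints),
        ("Reference Information", references),
    ]
    lines = [f"[Task Type] {task}"]
    for title, items in sections:
        if not items:
            continue  # omit empty sections rather than emit a bare heading
        lines.append("")
        lines.append(f"[{title}]")
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)
```

The returned string can be passed directly as the argument to codex exec.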

Prompt best practices

1. Provide concrete paths and line numbers

❌ "Fix auth module bug"
✅ "Fix Token verification logic in src/auth/login.py lines 45‑62"

2. State expected output format

"Generate a PR that includes:
- Code changes
- At least 3 unit tests
- Updated README documentation"

3. Specify test command to run

"After changes, run:
npm test -- --testPathPattern=auth
npm run lint"

4. Include error messages or logs

"Current error:
TypeError: Cannot read property 'token' of undefined
    at validateToken (src/auth/middleware.js:23:15)
Please locate and fix."

Subagents

Enable subagents

Adding “use subagents” in the prompt makes Codex split tasks to multiple sub‑agents for parallel processing.

Suitable for code review and large‑scale refactoring.

Specialized sub‑agent > generic mega‑agent

Create function‑specific sub‑agents (e.g., “frontend component agent”).

Prefer these over broad, generic agents (e.g., “QA agent”).

More specific function → more precise context → better results.

Independent context windows

Research, verification, review are isolated in separate contexts.

This separation avoids bias and keeps the main context free of pollution.

Multi‑agent collaboration

Community discussion (#22749) suggests the following for complex cross‑project tasks:

Each instance handles an independent sub‑task.

Coordinate via shared files such as AGENTS.md.

Use CI/CD pipelines to chain the outputs of multiple agents.

Skills and MCP Plugins

MCP server (Model Context Protocol)

MCP extends Codex capabilities, allowing external tool integration.

Configuration example (Codex CLI reads MCP servers from ~/.codex/config.toml):

[mcp_servers.xlsx-for-ai]
command = "xlsx-for-ai-mcp"

Skill folder structure

skills/
  api-design/
    SKILL.md   # core rules and index
    references/ # corpora, references
    scripts/    # helper scripts
    examples/   # example code

Main file contains core rules and index.

Corpora and checklists go into references/.

Progressive disclosure: Codex reads sub‑directory content only when needed.

Gotchas section

Record failure patterns each time Codex errs; over time this becomes the highest signal‑to‑noise content.

# SKILL.md

## Core Rules
...

## Gotchas

### 2026-05-10: API pagination parameter missing
- **Problem**: Pagination omitted when generating API.
- **Effect**: Returns all data, causing performance issues.
- **Fix**: Add pagination rule to SKILL.md.
- **Prevention**: Add “pagination” check to checklist.

### 2026-05-12: Insufficient test coverage
- **Problem**: Only happy path tested, edge cases ignored.
- **Effect**: Tests pass but runtime errors occur.
- **Fix**: Add boundary tests.
- **Prevention**: Add “edge case coverage” to checklist.

Gotchas maintenance principles

Record every error immediately.

Include four elements: problem, effect, fix, prevention.

Weekly review to spot recurring patterns.

Turn frequent Gotchas into formal rules (≥3 occurrences).

Archive resolved items after 30 days of inactivity.
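The promotion and archiving rules above can be automated during the weekly review. This is a sketch under assumptions: each occurrence is stored as a (pattern, date) pair, and "inactivity" is approximated as no new occurrence within 30 days, since the sketch has no way to know whether an item was resolved. The name review_gotchas is hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def review_gotchas(
    entries: list[tuple[str, date]],
    today: date,
    promote_at: int = 3,
    archive_after_days: int = 30,
) -> tuple[set[str], set[str]]:
    """Split gotcha patterns into (promote, archive) sets.

    A pattern seen `promote_at` times or more becomes a formal rule.
    A pattern whose latest occurrence is older than `archive_after_days`
    is archived, unless it is being promoted.
    """
    counts = Counter(pattern for pattern, _ in entries)
    last_seen: dict[str, date] = {}
    for pattern, recorded in entries:
        if pattern not in last_seen or recorded > last_seen[pattern]:
            last_seen[pattern] = recorded
    promote = {p for p, n in counts.items() if n >= promote_at}
    cutoff = today - timedelta(days=archive_after_days)
    archive = {p for p, seen in last_seen.items() if seen < cutoff and p not in promote}
    return promote, archive
```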

Memory and Persistence

Codex memory mechanism

Codex has no long‑term memory across sessions; each task runs in an isolated sandbox. Memory‑like behavior can be achieved by:

Methods to implement memory

Method 1: AGENTS.md as project memory – write knowledge, decisions, conventions into AGENTS.md; Codex reads it each run.

Method 2: Code comments and documentation – keep clear comments, up‑to‑date README and architecture docs.

Method 3: Local automated memory (community solution) – “Sentinel‑AI” pattern where a local agent checks a playbook before escalating to Codex.

Core benefits reported for this pattern:

≈80% of tasks avoid cloud API calls.

Reported cost drops to about $15/month.

Compounding effect: the longer the local agent runs, the more its playbook covers and the fewer cloud calls it needs.

Structured memory file

# .codex/memory.md
memory:
  append_only: true   # only append, never overwrite
  format: structured_markdown
  max_size: 10KB     # auto‑clean oldest entries
  share_across_agents: true
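The append‑only and size‑cap settings above describe a policy that some piece of tooling has to enforce. The sketch below assumes the file is maintained by your own script rather than by Codex, and that entries are separated by blank lines; append_memory is a hypothetical name:

```python
def append_memory(existing: str, entry: str, max_bytes: int = 10 * 1024) -> str:
    """Append one entry to a memory log, dropping the oldest entries if too large.

    Existing entries are never edited. They are only removed from the front
    once the UTF-8 size of the whole log exceeds `max_bytes`.
    """
    entries = [e for e in existing.split("\n\n") if e.strip()]
    entries.append(entry.strip())
    while len(entries) > 1 and len("\n\n".join(entries).encode("utf-8")) > max_bytes:
        entries.pop(0)  # evict the oldest entry first
    return "\n\n".join(entries)
```

The newest entry is always kept, even if it alone exceeds the cap, so a write is never silently lost.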

Permissions and Security

Execution environment security

Codex Web tasks run in isolated cloud containers.

Internet access is disabled by default during task execution.

The agent can only access code from the GitHub repo and pre‑configured dependencies.

With internet access off, it cannot reach external websites, APIs, or services.

Prompt injection risk

Community discussion (#6162) notes that hidden directives in files can cause unintended actions.

Defensive measures

Trust domain separation: treat user input (high trust), retrieved docs/web (low trust), tool results (medium trust), and generated parameters (must verify) differently.

Strategy mapping: retrieved text can provide information but cannot authorize shell/file/network actions.

Manual review of all Codex‑generated code changes.

Least‑privilege: do not grant unnecessary permissions.
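The trust‑domain idea can be expressed as a small policy function in whatever harness wraps the agent. This is an illustrative sketch only; the Source enum, the action names, and is_authorized are assumptions, not a Codex API:

```python
from enum import Enum


class Source(Enum):
    USER = "user"            # direct user instruction: high trust
    TOOL_RESULT = "tool"     # output of a tool call: medium trust
    RETRIEVED = "retrieved"  # fetched docs or web pages: low trust


# Actions that change state or reach outside the sandbox.
PRIVILEGED_ACTIONS = {"shell", "file_write", "network"}


def is_authorized(action: str, requested_by: Source) -> bool:
    """Allow privileged actions only when a direct user instruction asks for them.

    Retrieved text and tool output may inform the plan, but they cannot
    authorize shell, file, or network actions on their own.
    """
    if action not in PRIVILEGED_ACTIONS:
        return True
    return requested_by is Source.USER
```

Under this policy a hidden directive inside a retrieved document can still be read, but it cannot trigger a shell command by itself.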

User responsibility checklist

✅ Manually review and verify all agent‑generated code.

✅ Never store secrets (keys, passwords, tokens) in the codebase.

✅ Regularly update security rules in AGENTS.md.

✅ Carefully inspect diffs before committing.
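Part of the diff inspection can be assisted by a crude secret scan over added lines. The patterns below are illustrative assumptions and far from complete; a dedicated secret scanner is the better choice in practice, and find_secret_lines is a hypothetical name:

```python
import re

# Illustrative patterns only; real scanners ship far larger rule sets.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    re.compile(
        r"\b(?:api[_-]?key|secret|password|token)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
]


def find_secret_lines(diff_text: str) -> list[int]:
    """Return 1-based line numbers of added diff lines that look like secrets."""
    hits = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        if not line.startswith("+") or line.startswith("+++"):
            continue  # only inspect added lines, skip file headers
        if any(p.search(line) for p in SECRET_PATTERNS):
            hits.append(number)
    return hits
```

A scan like this supplements manual review; it does not replace reading the diff.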

Common Pitfalls (Gotchas)

Pitfall 1: Giving up too early

Before abandoning a task, split it into smaller, isolated units.

Even when a human would naturally group the work, Codex does better with separate tasks.

Example: changes to two similar tables → two PRs of about 10 minutes each.

Pitfall 2: Context compression makes Codex “dumb”

Stage complex tasks, start new session per stage.

Write key decisions into AGENTS.md.

When a session has gone off track, discard its uncommitted changes with git reset --hard and start a new session.

Pitfall 3: Test issues

Adopt TDD: write tests first.

Thoroughly review generated tests.

Apply stricter review to test changes than to code changes.

Pitfall 4: Forgetting to compile

Specify compilation steps explicitly in AGENTS.md.

Force compilation before tests.

Mixed compiled/interpreted languages are especially error‑prone.

Pitfall 5: Working directory chaos

Run git status after each completion.

Manually perform Git operations (branch, commit, push).

Do not rely on Codex to manage Git state; keep branch, commit, and push decisions in your own hands.

Pitfall 6: Rewrite without deletion

Review diff to confirm deletions.

Explicitly instruct “delete old implementation”.

Check file list to ensure cleanup.
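Confirming deletions can be done against the output of git diff --name-status, where each line is a status letter, a tab, and a path. A minimal sketch; the helper names and the example paths in the usage note are hypothetical:

```python
def deleted_paths(name_status: str) -> list[str]:
    """Extract deleted file paths from `git diff --name-status` output.

    Each line is a status letter, a tab, then the path; `D` marks a deletion.
    """
    deleted = []
    for line in name_status.splitlines():
        status, _, path = line.partition("\t")
        if status == "D" and path:
            deleted.append(path)
    return deleted


def missing_deletions(name_status: str, expected: list[str]) -> list[str]:
    """Return the expected deletions that the diff does not contain."""
    deleted = set(deleted_paths(name_status))
    return [path for path in expected if path not in deleted]
```

If a rewrite was supposed to remove src/auth/old_login.py, a non‑empty result from missing_deletions means the old implementation is still in the tree.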

Pitfall 7: Prompt injection

Apply trust domain separation.

Manual review of all changes.

Never store sensitive data in the repo.

Pitfall 8: Long‑session quality drop

Break complex tasks into phases, start new session per phase.

Write key context into AGENTS.md.

Typical Use Cases and Examples

Scenario 1: Daily bug fix

codex exec "Fix pagination bug in src/api/user.ts.
Current behavior: page 2 returns empty.
Expected: page 2 returns correct user list.
Reference: Issue #1234.
After change run: npm test -- user.test.ts"

Scenario 2: Bulk refactor

codex exec "Refactor all class components under src/components/ to function components.
Requirements:
1. Use React Hooks instead of lifecycle methods.
2. Preserve existing functionality.
3. Update related test files.
4. Run npm test to verify."

Scenario 3: Test coverage

codex exec "Add unit tests for all utility functions in src/utils/.
Requirements:
1. At least 2 test cases per function.
2. Cover normal and edge cases.
3. Test files named *.test.ts.
4. Verify with npm test."

Scenario 4: Codebase Q&A

codex exec "Explain the authentication flow in src/auth/ module.
Include:
1. User login process
2. Token generation and verification
3. Permission checking mechanism
4. Relevant configuration files"

Installation and Quick Start

Installation

# Option 1: npm install
npm install -g @openai/codex

# Option 2: Homebrew (macOS)
brew install --cask codex

# Launch interactive TUI
codex

# Non‑interactive one‑liner
codex exec "Fix login verification vulnerability in src/auth.py"

System requirements

macOS 12+ / Ubuntu 20.04+ / Windows 11 (WSL2)

4 GB RAM (8 GB recommended)

Git 2.23+ (recommended)

Login with ChatGPT account

codex
# Choose “Sign in with ChatGPT”
# Recommended to use Plus/Pro/Enterprise plan

Debugging and logs

# View detailed logs
tail -F ~/.codex/log/codex-tui.log

# Custom log level
RUST_LOG=codex_core=debug codex

# Custom log directory
codex -c log_dir=./.codex-log

Comparison of Codex Forms

Codex Web (cloud agent): runs in a cloud sandbox, pre‑loads your GitHub repo, supports parallel multi‑task processing. Entry: chatgpt.com/codex. Suitable when a cloud environment and parallelism are needed.

Codex CLI (terminal agent): open‑source, runs locally, lightweight, supports interactive TUI and non‑interactive exec mode. Install via npm i -g @openai/codex. Ideal for local development, real‑time interaction, CI/CD integration.

Codex IDE plugin: integrates with VS Code, Cursor, Windsurf, etc. Entry via the official IDE extension marketplace. Best for seamless in‑editor usage.

Core Principles Summary

Context is a precious resource

Keep it concise, compress promptly, reset when polluted.

Stage complex tasks, start new session per stage.

Write key decisions into AGENTS.md.

System constraints > prompt constraints

Put rules in AGENTS.md instead of asking Codex to remember them in conversation.

Tag or number critical rules.

Programmatic checks must run and be verified.

Divide and conquer

Sub‑agents, staged workflow, spec‑implementation separation.

Specialized sub‑agents > generic mega‑agent.

Simple tasks get one‑liners; complex tasks need full workflow.

Avoid over‑engineering

3‑5 minute tasks: one sentence.

Complex workflow only for multi‑file, multi‑step large tasks.

Variable rename: one sentence.

Continuous improvement

Record every error as Gotchas.

Regularly review to spot recurring patterns.

Convert frequent Gotchas into formal rules.

Security first

Trust domain separation.

Manual review of all changes.

Least‑privilege permissions.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Code Ape Tech Column

Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
