Why CE’s Agent Design Treats Expert Prompts as Decision Modules, Not Personas

Many teams instinctively create multiple expert personas for their AI agents. CE instead builds agents as well-defined judgment modules with clear input and output boundaries, explicit non-responsibilities, confidence calibration, and systematic orchestration, yielding a more reliable and maintainable review pipeline.


1. CE’s agents are judgment modules, not personas

Most teams start AI‑agent design by "creating a few experts". CE’s repository, however, defines agents as stable decision modules rather than role‑playing characters.

Key structural elements of a basic review agent

Frontmatter defines name, description, model, and tools.

The body specifies the exact thing the agent hunts for.

Explicit sections for confidence calibration and a "What you don't flag" list.

Output must follow a fixed JSON schema.

This structure shows that CE views an agent as a deterministic judgment component, not a simulated expert.
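As a concrete sketch, a reviewer agent file in this style might look like the following. All field values and section wording here are illustrative assumptions, not copied from CE's repository:

```markdown
---
name: correctness-reviewer
description: Flags logical correctness bugs in code diffs
model: sonnet
tools: Read, Grep
---

You hunt for exactly one thing: logical correctness bugs
(off-by-one errors, null/undefined propagation, broken state transitions).

## What you don't flag
- Style preferences
- Missing optimizations
- Naming opinions

## Confidence calibration
- high: provable from the diff alone
- medium: likely, but depends on call sites outside the diff
- low: plausible concern worth a human look

## Output
Return JSON with keys: findings, residual_risks, testing_gaps.
```

The frontmatter is machine-readable configuration; the body is the judgment spec; the fixed JSON output is what makes the agent composable.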

Four essential characteristics of CE’s agents

Clear input boundaries: each reviewer focuses on a narrow problem set (e.g., logical errors, boundary conditions, state-transition bugs, error-propagation failures).

Clear output boundaries: agents return structured JSON fields such as findings, residual_risks, and testing_gaps so the result can be consumed by an upstream orchestration pipeline.

Explicit "non-responsibility" list: the "What you don't flag" section enumerates what the agent deliberately ignores (style preferences, missing optimizations, naming opinions, unnecessary defensive suggestions).

Confidence calibration: outputs are classified as high, medium, or low confidence, so the system can route items it can resolve automatically separately from those that need human judgment.

Why confidence matters

Without calibrated confidence, the system cannot distinguish automatically resolvable findings from cases that require manual review, and the downstream routing logic breaks.
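That routing step can be sketched in a few lines. The field names below (`confidence`, `id`) are assumptions mirroring the article's description, not CE's actual schema:

```python
def route_findings(findings):
    """Split reviewer findings by calibrated confidence:
    'high' is eligible for automatic handling, everything else
    goes to a human review queue."""
    auto, manual = [], []
    for finding in findings:
        if finding["confidence"] == "high":
            auto.append(finding)
        else:
            manual.append(finding)
    return auto, manual

findings = [
    {"id": "F1", "confidence": "high", "summary": "off-by-one in loop bound"},
    {"id": "F2", "confidence": "medium", "summary": "possible race on cache write"},
]
auto, manual = route_findings(findings)
```

The point is not the triviality of the split but that it is only possible because every agent is contractually required to emit a confidence level.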

2. Splitting reviewers by judgment dimension instead of persona

CE’s agents/review/ directory contains multiple specialized reviewers:

correctness-reviewer: flags off-by-one errors, null/undefined propagation, race conditions, incorrect state transitions, and broken error propagation.

testing-reviewer: checks coverage of new branches, flags tests that merely assert nothing throws, detects over-mocking and missing error paths, and ensures test additions match behavior changes.

maintainability-reviewer: warns about premature abstraction, unnecessary indirection, dead code, tight coupling, and obscured intent, focusing on future maintenance cost.

adversarial-reviewer: actively constructs failure scenarios (assumption violations, composition failures, cascade constructions, abuse cases) and varies depth (quick, standard, deep) based on diff size and risk signals.

These reviewers are independent lenses; mixing them would cause interference and dilute signal quality.

3. Orchestration of agents

Agents are not scattered files; they are wired into the ce:review orchestration skill:

Always‑on reviewers

correctness

testing

maintainability

project‑standards

agent‑native‑reviewer

learnings‑researcher

Every diff is checked along these dimensions: logical correctness, test health, maintainability trend, project-standard compliance, agent-native accessibility, and relevant historical learnings.

Cross‑cutting conditional reviewers

security‑reviewer

performance‑reviewer

api‑contract‑reviewer

data‑migrations‑reviewer

reliability‑reviewer

adversarial‑reviewer

These are added when the diff touches a specific risk domain.
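A minimal sketch of that conditional selection follows. The trigger rules and path patterns are invented for illustration; CE's actual matching logic is not shown in the article:

```python
ALWAYS_ON = [
    "correctness", "testing", "maintainability",
    "project-standards", "agent-native-reviewer", "learnings-researcher",
]

# Hypothetical risk-domain triggers keyed on changed file paths.
CONDITIONAL = {
    "security-reviewer": lambda paths: any("auth" in p for p in paths),
    "data-migrations-reviewer": lambda paths: any(p.startswith("db/migrate") for p in paths),
    "api-contract-reviewer": lambda paths: any(p.endswith(".openapi.yaml") for p in paths),
}

def select_reviewers(changed_paths):
    """Always-on set plus any conditional reviewer whose risk domain the diff touches."""
    extra = [name for name, hits in CONDITIONAL.items() if hits(changed_paths)]
    return ALWAYS_ON + extra

selected = select_reviewers(["app/auth/login.rb", "db/migrate/2024_add_users.rb"])
```

Here a diff touching an auth file and a migration pulls in the security and data-migrations reviewers, but not the API-contract one.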

Stack‑specific conditional reviewers

dhh‑rails‑reviewer

kieran‑rails‑reviewer

kieran‑python‑reviewer

kieran‑typescript‑reviewer

julik‑frontend‑races‑reviewer

These handle language or framework‑specific checks.

4. Research and document‑review agents

Not all agents are code reviewers. Example:

plugins/compound-engineering/agents/research/repo-research-analyst.md

builds context by scanning a repository's technology, architecture, patterns, documentation, issue conventions, and templates. It supports scoped invocations (technology, architecture, patterns, conventions, issues) so callers can request only the slice of context they need.

Document‑review agents (e.g., product-lens-reviewer, scope-guardian-reviewer, coherence-reviewer, feasibility-reviewer) focus on plan and documentation quality, asking questions like “right problem?”, “actual outcome?”, “what if we did nothing?”, and “what already exists?”. They illustrate that bad solutions and bad code are distinct failure modes.

5. Design principles distilled from CE

One judgment per agent: separate logical correctness, test quality, maintainability, security, product alignment, etc., to avoid interference.

Explicitly state what the agent does NOT handle: the "What you don't flag" list acts as a noise-reduction mechanism.

Always perform confidence calibration: without confidence the system cannot route findings correctly.

Output for the system, not just for humans: agents emit structured JSON that downstream skills can merge, deduplicate, and route.

Separate orchestration from expertise: skills decide when, whom, and how many agents to invoke, while agents focus solely on judgment.

CE’s ce-review skill embodies this separation, invoking the appropriate reviewers based on diff characteristics.
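The orchestration/expertise split can be sketched as follows. The function and field names are hypothetical, not CE's actual API; the stub invoker stands in for a real model call:

```python
def run_review(diff, reviewers, invoke):
    """The skill's job: invoke each selected reviewer on the diff,
    then merge their structured outputs, deduplicating findings by id.
    Each reviewer only ever sees its own judgment dimension."""
    merged, seen = [], set()
    for reviewer in reviewers:
        for finding in invoke(reviewer, diff)["findings"]:
            if finding["id"] not in seen:
                seen.add(finding["id"])
                merged.append(finding)
    return merged

# Stub invoker: two reviewers report one unique finding each,
# plus one overlapping finding that should survive only once.
def fake_invoke(reviewer, diff):
    return {"findings": [{"id": f"{reviewer}-1"}, {"id": "shared-dup"}]}

merged = run_review("some diff", ["correctness", "testing"], fake_invoke)
```

Because agents emit structured JSON rather than free-form prose, the skill can do this merging mechanically, without re-interpreting anyone's judgment.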

Conclusion

Instead of asking "how many expert personas do we need", CE first enumerates the distinct judgment tasks required by the system. By decomposing agents along judgment dimensions, defining clear boundaries, calibrating confidence, and wiring them through a disciplined orchestration layer, the review pipeline becomes predictable, extensible, and less noisy.

[Figure: CE agent design illustration]
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: AI agents, Prompt Engineering, orchestration, code review automation, confidence calibration, decision modules
Written by o-ai.tech