Mastering AI Agent Tool Management: OpenManus, Gemini CLI & Shopify Sidekick

This article explains how AI agents work, examines OpenManus’s comprehensive tool framework, reviews Gemini CLI’s minimalist tool scheduling and error handling, and discusses Shopify Sidekick’s scaling challenges and Just‑in‑Time instruction strategy, offering practical guidance for building robust, production‑ready agentic systems.


In simple terms, an AI Agent is a system that can perceive its environment, plan, and act to achieve a specific goal.

From an architectural perspective, it consists of perception modules (state and intent contexts), a reasoning module (the LLM), a memory module, and an action module.
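In code, these four modules can be sketched as a minimal loop. All class and method names below are illustrative, not from any particular framework:

```python
# Minimal sketch of the four-module agent architecture: perceive,
# reason, act, with a memory of past observations.
class Agent:
    def __init__(self, reason, act):
        self.memory = []          # memory module: accumulated observations
        self.reason = reason      # reasoning module (stand-in for an LLM)
        self.act = act            # action module: executes the chosen action

    def step(self, observation):
        self.memory.append(observation)    # perception: record new state
        action = self.reason(self.memory)  # plan the next action in context
        return self.act(action)            # act toward the goal

# Toy reasoning and action functions stand in for the real modules.
agent = Agent(reason=lambda mem: f"handle:{mem[-1]}",
              act=lambda a: a.upper())
result = agent.step("order_query")
# result == "HANDLE:ORDER_QUERY"
```

The value of the split is that each module can be swapped independently: a different LLM, a different memory store, a different action executor.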

1. OpenManus tool management and scheduling

OpenManus places heavy emphasis on tools. The core classes are:

1.1 BaseTool

from abc import ABC, abstractmethod
from typing import Any, Dict

class BaseTool(ABC):
    name: str
    description: str
    parameters: Dict[str, Any]

    @abstractmethod
    async def execute(self, tool_input: Dict[str, Any]) -> ToolResult:
        ...

These fields cover the essential elements of tool management.
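A concrete subclass shows how little is needed to define a tool. The following self-contained sketch simplifies ToolResult to a plain dataclass, and the EchoTool is a hypothetical example rather than an actual OpenManus tool:

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict

@dataclass
class ToolResult:
    """Simplified stand-in for OpenManus's ToolResult."""
    output: str

class BaseTool(ABC):
    name: str
    description: str
    parameters: Dict[str, Any]

    @abstractmethod
    async def execute(self, tool_input: Dict[str, Any]) -> ToolResult:
        ...

class EchoTool(BaseTool):
    # name/description/parameters feed the LLM's tool schema.
    name = "echo"
    description = "Return the input text unchanged."
    parameters = {"type": "object",
                  "properties": {"text": {"type": "string"}},
                  "required": ["text"]}

    async def execute(self, tool_input: Dict[str, Any]) -> ToolResult:
        return ToolResult(output=tool_input["text"])

result = asyncio.run(EchoTool().execute({"text": "hello"}))
# result.output == "hello"
```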

1.2 ToolCollection

class ToolCollection:
    def __init__(self):
        self.tool_map: Dict[str, BaseTool] = {}

    def add_tool(self, tool: BaseTool) -> None:
        if tool.name in self.tool_map:
            logger.warning(f"Tool {tool.name} already exists, overwriting")
        self.tool_map[tool.name] = tool

The collection provides O(1) lookup, a unified execute interface, and consistent error handling via ToolFailure.
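A simplified, self-contained sketch of how the unified execute interface and ToolFailure wrapping might fit together (the method bodies are illustrative, not OpenManus's actual implementation):

```python
import asyncio
from typing import Dict

class ToolFailure(Exception):
    """Uniform error wrapper, mirroring OpenManus's ToolFailure idea."""

class ToolCollection:
    def __init__(self):
        self.tool_map: Dict[str, object] = {}   # name -> tool, O(1) lookup

    def add_tool(self, tool) -> None:
        self.tool_map[tool.name] = tool

    async def execute(self, name: str, tool_input: dict):
        tool = self.tool_map.get(name)
        if tool is None:
            # Unknown tools and runtime errors surface the same way.
            raise ToolFailure(f"Tool {name} not found")
        try:
            return await tool.execute(tool_input)
        except Exception as exc:
            raise ToolFailure(f"Tool {name} failed: {exc}") from exc

class Upper:
    """Minimal illustrative tool."""
    name = "upper"
    async def execute(self, tool_input):
        return tool_input["text"].upper()

tools = ToolCollection()
tools.add_tool(Upper())
result = asyncio.run(tools.execute("upper", {"text": "ok"}))
# result == "OK"
```

Because every failure is funneled through one exception type, the agent loop needs only a single recovery path.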

1.3 Think‑Act loop

async def step(self) -> None:
    await self.think()   # decide which tool to use
    await self.act()     # invoke the tool and collect results

The loop lives in ReActAgent, separating reasoning and execution for better control and debugging.

1.4 Tool selection

async def think(self) -> None:
    response = await self.llm.ask_tool(
        messages=self.memory.get_messages(),
        system_prompts=self.system_prompts,
        available_tools=self.available_tools.get_tool_schemas(),
        tool_choices=self.tool_choices,
    )

The LLM receives the conversation history, system prompts, available tool schemas, and a choice mode, enabling flexible yet controllable selection.
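The schemas passed as available tools are typically in the function-calling format popularized by the OpenAI API. One illustrative entry (the web_search tool itself is hypothetical) might look like:

```python
import json

# One entry of what get_tool_schemas() might return, in the common
# function-calling format: name, description, and a JSON Schema for
# the parameters.
schema = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for a query.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}
print(json.dumps(schema, indent=2))
```

The description and parameter schema are the only signal the LLM has when choosing a tool, which is why careful wording here matters as much as the implementation.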

1.5 Batch execution and result handling

async def act(self) -> None:
    if self.tool_calls:
        for tool_call in self.tool_calls:
            observation = await self.execute_tool(tool_call)
            self.memory.add_message(ToolMessage(observation, tool_call.id))

Each tool’s result is stored in memory for the next reasoning step.

1.6 Special‑tool handling

def handle_special_tool(self, tool_call: ToolCall) -> str:
    if tool_call.name == "Terminate":
        self.should_stop = True
        return "Task completed successfully."
    return f"Unknown special tool: {tool_call.name}"

The Terminate tool provides a graceful shutdown; other special tools can be added similarly.

2. Gemini CLI tool management and scheduling

Gemini CLI simply hands all registered tools to the LLM and lets the model decide which to call.

async setTools(): Promise<void> {
    const toolRegistry = this.config.getToolRegistry();
    const toolDeclarations = toolRegistry.getFunctionDeclarations();
    const tools: Tool[] = [{ functionDeclarations: toolDeclarations }];
    this.getChat().setTools(tools);
}

The design trusts the LLM’s judgment, but as the tool count grows it risks a tool “explosion” that overwhelms the model’s context.

2.1 Three‑layer tool discovery

Core built‑in tools

File operations : ls, read‑file, write‑file, edit

Code search : grep, ripgrep, glob

System interaction : shell

Network requests : web‑fetch, web‑search

Memory management : memory

Command‑line discovery

async discoverAndRegisterToolsFromCommand(): Promise<void> {
    const command = this.config.getConfigOptions().toolDiscoveryCommand;
    if (!command) return;
    const result = await exec(command);
    const functions = JSON.parse(result.stdout);
    for (const func of functions) {
        this.registerTool(new DiscoveredTool(func));
    }
}

Any executable that outputs a JSON declaration can be turned into a tool.
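A discovery executable can be as small as the following hypothetical Python script; pointing toolDiscoveryCommand at it would register a lookup_order tool (both the script and the tool name are illustrative):

```python
# Hypothetical discovery executable for Gemini CLI's toolDiscoveryCommand.
# Any program that prints function declarations as JSON to stdout works.
import json

def declarations():
    return [{
        "name": "lookup_order",
        "description": "Look up an order by ID.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    }]

print(json.dumps(declarations()))
```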

MCP server integration

stdio, SSE, HTTP, WebSocket

MCP (Model Context Protocol) lets the CLI connect to external services such as databases or specialized APIs, keeping tool namespaces isolated.

2.2 Scheduler state machine

validating → scheduled → awaiting_approval → executing → success/error/cancelled

Parameters are validated before execution, and different approval modes (DEFAULT, AUTO_EDIT, YOLO) control user confirmation.
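The pipeline can be modeled as an explicit transition table. The sketch below is a simplified Python analog, not Gemini CLI's TypeScript implementation; note that AUTO_EDIT and YOLO modes would skip the approval state:

```python
from enum import Enum, auto

class ToolCallState(Enum):
    VALIDATING = auto()
    SCHEDULED = auto()
    AWAITING_APPROVAL = auto()
    EXECUTING = auto()
    SUCCESS = auto()
    ERROR = auto()
    CANCELLED = auto()

# Allowed transitions, mirroring the pipeline above. Terminal states
# (SUCCESS/ERROR/CANCELLED) have no outgoing edges.
TRANSITIONS = {
    ToolCallState.VALIDATING: {ToolCallState.SCHEDULED, ToolCallState.ERROR},
    ToolCallState.SCHEDULED: {ToolCallState.AWAITING_APPROVAL,
                              ToolCallState.EXECUTING,
                              ToolCallState.CANCELLED},
    ToolCallState.AWAITING_APPROVAL: {ToolCallState.EXECUTING,
                                      ToolCallState.CANCELLED},
    ToolCallState.EXECUTING: {ToolCallState.SUCCESS, ToolCallState.ERROR,
                              ToolCallState.CANCELLED},
}

def advance(state, new_state):
    # Reject illegal jumps, e.g. executing a call that already finished.
    if new_state not in TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition {state} -> {new_state}")
    return new_state
```

Making illegal transitions impossible at the type level is what keeps concurrent tool calls from ending up in inconsistent states.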

2.3 Error handling

The system defines ToolErrorType enums (e.g., INVALID_TOOL_PARAMS, UNKNOWN, UNHANDLED_EXCEPTION, TOOL_NOT_REGISTERED, EXECUTION_FAILED) and specific file‑system and shell errors. Errors are wrapped in a standard createErrorResponse structure.
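A Python analog of that pattern, using the enum values listed above with an illustrative response shape (Gemini CLI's actual createErrorResponse fields may differ):

```python
from enum import Enum

class ToolErrorType(Enum):
    INVALID_TOOL_PARAMS = "invalid_tool_params"
    TOOL_NOT_REGISTERED = "tool_not_registered"
    EXECUTION_FAILED = "execution_failed"
    UNHANDLED_EXCEPTION = "unhandled_exception"
    UNKNOWN = "unknown"

def create_error_response(call_id, error_type, message):
    # One standard shape for every failure, so downstream code (and the
    # LLM) never has to parse ad-hoc error strings. Field names here are
    # illustrative, not Gemini CLI's schema.
    return {"callId": call_id,
            "error": {"type": error_type.value, "message": message}}

resp = create_error_response("call-1", ToolErrorType.INVALID_TOOL_PARAMS,
                             "missing required parameter: path")
```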

2.4 Timeout and cancellation

Both scheduling and execution respect an AbortSignal. Cancellation removes pending requests from the queue and aborts running processes, updating the tool call status to “cancelled”.
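Python's asyncio offers an analogous cancellation mechanism. A minimal sketch of a tool call that reports a “cancelled” status instead of propagating the error (this shows the pattern, not Gemini CLI's code):

```python
import asyncio

async def run_tool():
    try:
        await asyncio.sleep(60)        # stand-in for a long-running tool
        return "success"
    except asyncio.CancelledError:
        return "cancelled"             # report status instead of re-raising

async def main():
    task = asyncio.create_task(run_tool())
    await asyncio.sleep(0)             # let the tool start running
    task.cancel()                      # analogous to firing an AbortSignal
    return await task

status = asyncio.run(main())
# status == "cancelled"
```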

2.5 Output truncation

if (typeof content === 'string' && toolName === ShellTool.Name && this.config.getEnableToolOutputTruncation()) {
    const truncatedResult = await truncateAndSaveToFile(content, callId, ...);
    content = truncatedResult.content;
    outputFile = truncatedResult.outputFile;
}

This prevents excessive memory usage by cutting large outputs.
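The idea behind truncateAndSaveToFile can be sketched in a few lines of Python: keep a short head inline and spill the full output to a file. The helper name, size limit, and path scheme below are all illustrative:

```python
import tempfile
from pathlib import Path

def truncate_and_save(content: str, call_id: str, limit: int = 200):
    # Short outputs pass through untouched; long outputs are written to
    # disk and replaced by a truncated head plus a pointer to the file.
    if len(content) <= limit:
        return content, None
    path = Path(tempfile.gettempdir()) / f"tool-output-{call_id}.txt"
    path.write_text(content)
    truncated = content[:limit] + f"\n[truncated; full output: {path}]"
    return truncated, str(path)

short, f1 = truncate_and_save("ok", "c1")
long_out, f2 = truncate_and_save("x" * 1000, "c2")
```

The model's context stays small, while the agent (or the user) can still read the complete output from the saved file if needed.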

3. Shopify Sidekick tool‑management strategy

Shopify observed that system complexity grows non-linearly with tool count. They identify three phases:

0‑20 tools : honeymoon period, clear responsibilities.

20‑50 tools : chaotic, overlapping behavior.

50+ tools : collapse, “Death by a Thousand Instructions”.

3.1 Just‑in‑Time (JIT) instructions

# Traditional: all instructions in the system prompt
system_prompt = """
You are an AI assistant...
(500 lines of rules)
"""

# JIT: provide instructions only when needed
def get_tool_instructions(tool_name, context):
    if tool_name == "customer_query":
        if context.is_filtering:
            return "Use customer_tags for filtering..."
        else:
            return "Query basic customer info..."
    return ""  # no extra instructions for this tool

JIT keeps the prompt concise, improves caching, allows per‑tool A/B testing, and makes maintenance easier.
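Putting it together, a per-turn prompt assembler might combine a short static base prompt with only the active tools' instructions. This is a hypothetical, self-contained sketch of the pattern, not Sidekick's implementation:

```python
# Per-turn JIT prompt assembly: base prompt + instructions for the
# tools actually in play, nothing else.
BASE_PROMPT = "You are an AI assistant for merchants."

def get_tool_instructions(tool_name, is_filtering):
    # Simplified from the example above: context reduced to one flag.
    if tool_name == "customer_query":
        return ("Use customer_tags for filtering..." if is_filtering
                else "Query basic customer info...")
    return ""

def build_system_prompt(active_tools, is_filtering):
    extras = [get_tool_instructions(t, is_filtering) for t in active_tools]
    return "\n".join([BASE_PROMPT] + [e for e in extras if e])

prompt = build_system_prompt(["customer_query"], is_filtering=True)
# prompt contains only the filtering-specific instruction
```

Because the base prompt never changes, it caches well, and each tool's instruction block can be versioned and A/B tested independently.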

4. Summary

OpenManus showcases elegant architecture, Gemini CLI demonstrates a minimalist approach, and Shopify Sidekick highlights real‑world scaling challenges. Building production‑grade agents requires careful tool discovery, robust scheduling, precise error handling, and strategies like JIT instructions to keep the system maintainable and reliable.

Tags: AI agents, LLM, MCP, ReAct, Error Handling, Just-in-Time, Tool Management
Written by Architecture and Beyond

Focused on AIGC SaaS technical architecture and tech team management, sharing insights on architecture, development efficiency, team leadership, startup technology choices, large‑scale website design, and high‑performance, highly‑available, scalable solutions.
