How Do AI Agents Know When to Stop? Strategies and Real-World Implementations

This article explores the essential stop‑condition designs for AI agents, detailing hard limits, task‑completion checks, explicit termination tools, loop detection, error accumulation, and user interruption, and then examines concrete implementations in OpenManus and Gemini CLI with code examples and multi‑layer safeguards.


In simple terms, an AI Agent operates as a large loop that repeatedly obtains context, calls an LLM, and invokes tools.
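That loop can be sketched as follows. This is a minimal illustration, not any framework's actual code; `call_llm`, `run_tool`, and the `Action` type are assumed names:

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    args: str = ""

def run_agent(call_llm, run_tool, task, max_steps=30):
    """Illustrative agent loop: gather context, ask the LLM, run tools, repeat."""
    context = [task]
    for _ in range(max_steps):             # hard limit as a safety net
        action = call_llm(context)         # the LLM decides the next action
        if action.name == "terminate":     # explicit stop signal
            return f"done: {action.args}"
        context.append(run_tool(action))   # feed the tool result back as context
    return "stopped: reached max steps"    # safety-net termination
```

Every stop strategy below is ultimately a rule for breaking out of this loop.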

Common Stop Strategies

AI Agent stop strategies fall into several categories:

1. Hard Limits

Maximum step count (e.g., 30 iterations)

Execution time limit (e.g., 5 minutes)

API call count limit (e.g., 100 calls)

API token usage limit

These limits are straightforward but can lead to poor user experience when tasks are cut off prematurely.
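These budgets are often tracked together, so the loop can report which limit fired. A hedged sketch (class and limit values are illustrative, not from either project):

```python
import time

class Budget:
    """Combined hard limits: step count, wall-clock time, and token spend."""
    def __init__(self, max_steps=30, max_seconds=300, max_tokens=100_000):
        self.max_steps, self.max_seconds, self.max_tokens = max_steps, max_seconds, max_tokens
        self.steps = self.tokens = 0
        self.start = time.monotonic()

    def charge(self, tokens_used):
        """Record one loop iteration and its token cost."""
        self.steps += 1
        self.tokens += tokens_used

    def exceeded(self):
        """Return the first violated limit, or None if the agent may continue."""
        if self.steps >= self.max_steps:
            return "max steps"
        if time.monotonic() - self.start >= self.max_seconds:
            return "timeout"
        if self.tokens >= self.max_tokens:
            return "token budget"
        return None
```

Naming the violated limit in the stop message helps users understand why a task was cut short.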

2. Task Completion Detection

# Ask the LLM after each loop iteration
response = llm.ask("Is the task completed? Answer yes or no.")
if response.strip().lower() == "yes":
    stop()

3. Explicit Stop Signal

tools = [
    "search",
    "calculate",
    "terminate"  # dedicated stop tool
]

When the agent calls the terminate tool, it stops. The prompt must teach the LLM when to invoke this tool.
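A minimal sketch of how a tool-dispatch loop might honor that signal (the dispatch shape and argument format are illustrative assumptions):

```python
def dispatch(tool_calls, tools):
    """Run each requested tool; a 'terminate' call ends the loop immediately."""
    results, finished = [], False
    for call in tool_calls:
        results.append(tools[call["name"]](**call.get("args", {})))
        if call["name"] == "terminate":   # the dedicated stop tool wins over everything
            finished = True
            break
    return results, finished
```

The key design point: termination is just another tool call, so the LLM chooses when to stop using the same mechanism it uses for everything else.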

4. Loop Detection

Repeated calls to the same tool

Repeated action sequences (A→B→A→B…)

Highly similar output content
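The second pattern (repeated action sequences) can be detected with a simple window check. A sketch under the assumption that actions are comparable values:

```python
def detect_ab_loop(history, pattern_len=2, repeats=3):
    """Detect a repeating action pattern such as A→B→A→B→A→B in recent history."""
    window = pattern_len * repeats
    if len(history) < window:
        return False
    recent = history[-window:]
    pattern = recent[:pattern_len]
    # Every position in the window must match the candidate pattern cyclically
    return all(recent[i] == pattern[i % pattern_len] for i in range(window))
```

Real systems (as the Gemini CLI section below shows) layer cheaper checks like this under heavier ones such as content hashing and LLM-based analysis.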

5. Error Accumulation

# Reset consecutive_errors to 0 after any successful step
if consecutive_errors > 3:
    stop("Too many consecutive failures")

6. User Interruption

Allow the user to abort the agent at any time.
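A common implementation is a shared cancellation flag that the loop checks on every iteration. This sketch uses Python's `threading.Event`; the names are illustrative:

```python
import threading
import time

stop_event = threading.Event()  # set from a UI handler, signal handler, or Ctrl-C

def agent_loop(max_steps=100):
    """Each iteration checks the flag, so a user abort takes effect promptly."""
    for step in range(max_steps):
        if stop_event.is_set():
            return f"aborted by user at step {step}"
        time.sleep(0.01)  # stand-in for one LLM call + tool invocation
    return "completed"
```

Gemini CLI achieves the same effect with JavaScript's `AbortSignal`, shown later in this article.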

OpenManus Stop Logic

OpenManus implements a multi‑layer protection scheme centered on a dedicated terminate tool.

Core: terminate tool

Each agent receives a terminate tool:

class Terminate(BaseTool):
    name: str = "terminate"
    description: str = "When the request is satisfied or cannot proceed, terminate the interaction."

    async def execute(self, status: str) -> str:
        # The framework watches for this tool by name; invoking it
        # moves the agent's state machine to FINISHED.
        return f"Interaction completed, status: {status}"

The prompt tells the LLM to call terminate once the task is done.

State‑machine management

class AgentState(Enum):
    IDLE = "idle"
    RUNNING = "running"
    FINISHED = "finished"
    ERROR = "error"

When the terminate tool is invoked, the state transitions to FINISHED and a log message is emitted.
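One way to wire that together, sketched here rather than quoted from OpenManus (the hook name `handle_tool_result` is an assumption):

```python
from enum import Enum

class AgentState(Enum):
    IDLE = "idle"
    RUNNING = "running"
    FINISHED = "finished"
    ERROR = "error"

class Agent:
    """Sketch: the tool-execution hook moves the state machine to FINISHED."""
    def __init__(self):
        self.state = AgentState.IDLE

    def handle_tool_result(self, tool_name, result):
        if tool_name == "terminate":
            self.state = AgentState.FINISHED  # the main loop exits on this state
            print(f"Special tool 'terminate' finished the agent: {result}")
        return result
```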

Step limit

# ToolCallAgent: 30 steps
# SWEAgent: 20 steps
while self.current_step < self.max_steps and self.state != AgentState.FINISHED:
    self.current_step += 1
    await self.step()
if self.current_step >= self.max_steps:
    results.append(f"Reached max steps ({self.max_steps})")

This acts as a safety net to prevent infinite execution.

Stuck detection

def is_stuck(self) -> bool:
    recent_messages = self.get_recent_assistant_messages()
    if len(recent_messages) < 2:
        return False  # too little history to call it a loop
    # All recent assistant messages identical → the agent is repeating itself
    return len(set(recent_messages)) == 1

Gemini CLI Stop Logic

Gemini CLI uses a more elaborate approach.

Sub‑agent stop logic

1. Max turns (MAX_TURNS)

if (this.runConfig.max_turns && turnCounter >= this.runConfig.max_turns) {
    this.output.terminate_reason = SubagentTerminateMode.MAX_TURNS;
    break;
}

2. Execution timeout

let durationMin = (Date.now() - startTime) / (1000 * 60);
if (durationMin >= this.runConfig.max_time_minutes) {
    this.output.terminate_reason = SubagentTerminateMode.TIMEOUT;
    break;
}

The timeout is checked both before and after each LLM call.

3. User abort (AbortSignal)

if (abortController.signal.aborted) return;

4. Error exception

catch (error) {
    console.error('Error during subagent execution:', error);
    this.output.terminate_reason = SubagentTerminateMode.ERROR;
    throw error;
}

5. Goal completion

Two cases are handled:

No predefined output requirements – if the LLM stops calling tools, the agent finishes.

Predefined output variables – the system checks whether all declared variables have been emitted; if not, a nudge message is sent to the LLM to emit the missing values.

// Nudge example
if (remainingVars.length > 0) {
    const nudgeMessage = `You have stopped calling tools but have not emitted the following required variables: ${remainingVars.join(', ')}. Please use the 'self.emitvalue' tool now.`;
    currentMessages = [{ role: 'user', parts: [{ text: nudgeMessage }] }];
}

Three‑layer loop detection

First layer: tool‑call repetition

private checkToolCallLoop(toolCall): boolean {
    const key = this.getToolCallKey(toolCall);
    if (this.lastToolCallKey === key) {
        this.toolCallRepetitionCount++;
    } else {
        this.lastToolCallKey = key;
        this.toolCallRepetitionCount = 1;
    }
    return this.toolCallRepetitionCount >= TOOL_CALL_LOOP_THRESHOLD; // 5
}

Second layer: content‑loop ("chanting") detection

private checkContentLoop(content: string): boolean {
    if (this.inCodeBlock) return false;
    this.streamContentHistory += content;
    this.truncateAndUpdate();
    return this.analyzeContentChunksForLoop();
}

private analyzeContentChunksForLoop(): boolean {
    while (this.hasMoreChunksToProcess()) {
        const currentChunk = this.streamContentHistory.substring(this.lastContentIndex, this.lastContentIndex + CONTENT_CHUNK_SIZE);
        const chunkHash = createHash('sha256').update(currentChunk).digest('hex');
        if (this.isLoopDetectedForChunk(currentChunk, chunkHash)) return true;
        this.lastContentIndex++;
    }
    return false;
}

private isLoopDetectedForChunk(chunk: string, hash: string): boolean {
    const existing = this.contentStats.get(hash);
    if (!existing) { this.contentStats.set(hash, [this.lastContentIndex]); return false; }
    if (!this.isActualContentMatch(chunk, existing[0])) return false;
    existing.push(this.lastContentIndex);
    if (existing.length < CONTENT_LOOP_THRESHOLD) return false; // need 10 repeats
    const recent = existing.slice(-CONTENT_LOOP_THRESHOLD);
    const totalDist = recent[recent.length - 1] - recent[0];
    const avgDist = totalDist / (CONTENT_LOOP_THRESHOLD - 1);
    const maxDist = CONTENT_CHUNK_SIZE * 1.5; // 75 chars
    return avgDist <= maxDist;
}

Third layer: LLM‑based loop detection

private async checkForLoopWithLLM(signal: AbortSignal) {
    const recentHistory = this.config.getGeminiClient().getHistory().slice(-LLM_LOOP_CHECK_HISTORY_COUNT);
    const trimmed = this.trimRecentHistory(recentHistory);
    const result = await this.config.getBaseLlmClient().generateJson({
        contents: [...trimmed, { role: 'user', parts: [{ text: taskPrompt }] }],
        schema: { type: 'object', properties: { reasoning: { type: 'string' }, confidence: { type: 'number' } } },
        model: DEFAULT_GEMINI_FLASH_MODEL,
        systemInstruction: LOOP_DETECTION_SYSTEM_PROMPT
    });
    if (result['confidence'] > 0.9) {
        console.warn(result['reasoning']);
        return true;
    }
    return false;
}

async turnStarted(signal: AbortSignal) {
    this.turnsInCurrentPrompt++;
    if (this.turnsInCurrentPrompt >= LLM_CHECK_AFTER_TURNS &&
        this.turnsInCurrentPrompt - this.lastCheckTurn >= this.llmCheckInterval) {
        this.lastCheckTurn = this.turnsInCurrentPrompt;
        return await this.checkForLoopWithLLM(signal);
    }
}

The LLM check starts after 30 turns and runs every few turns (default interval 3), with the interval dynamically adjusted based on confidence.
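The cadence can be sketched in Python as follows. The constants mirror the description above, but the linear adjustment rule is an illustrative assumption, not Gemini CLI's exact formula:

```python
LLM_CHECK_AFTER_TURNS = 30      # don't run the expensive LLM check early
MIN_INTERVAL, MAX_INTERVAL = 3, 15  # illustrative bounds on the check interval

def next_check_interval(confidence):
    """Higher loop confidence → re-check sooner; low confidence → back off.
    The linear interpolation here is an assumption for illustration."""
    return round(MIN_INTERVAL + (MAX_INTERVAL - MIN_INTERVAL) * (1 - confidence))

def should_check(turn, last_check_turn, interval):
    """Gate the LLM check: enough total turns, and enough since the last check."""
    return turn >= LLM_CHECK_AFTER_TURNS and turn - last_check_turn >= interval
```

The economics are the point: a cheap hash check runs on every chunk, while the costly LLM check runs only every few turns, and more often only when a loop looks likely.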

Loop type enumeration

enum LoopType {
    CONSECUTIVE_IDENTICAL_TOOL_CALLS, // repeated tool calls
    CHANTING_IDENTICAL_SENTENCES,      // repeated output
    LLM_DETECTED_LOOP                 // LLM‑detected logical loop
}

Combining these mechanisms provides a robust multi‑layer defense against infinite execution while still allowing the agent to finish its intended work.

Conclusion

Stopping strategies are a critical yet often overlooked aspect of AI Agent design. A single mechanism is rarely sufficient; practical systems blend hard limits, LLM‑driven completion checks, explicit termination tools, loop detection, error thresholds, and user abort signals to achieve a balance between task fulfillment and resource safety.

OpenManus opts for a simple design: a terminate tool guided by the LLM, a state machine, and step limits as a fallback. Gemini CLI adopts a more sophisticated scheme with a declarative output system, Nudge reminders, and three layers of loop detection (tool‑call repetition, content‑hash analysis, and LLM‑based reasoning).

In practice, engineers should layer multiple stop conditions, provide clear feedback for each termination reason, and ensure reliable resource cleanup.

Tags: LLM, OpenManus, AI Agent, Loop Detection, Gemini CLI, stop strategy, Tool Invocation
Written by

Architecture and Beyond

Focused on AIGC SaaS technical architecture and tech team management, sharing insights on architecture, development efficiency, team leadership, startup technology choices, large‑scale website design, and high‑performance, highly‑available, scalable solutions.
