How Do AI Agents Know When to Stop? Strategies and Real-World Implementations
This article explores the essential stop‑condition designs for AI agents, detailing hard limits, task‑completion checks, explicit termination tools, loop detection, error accumulation, and user interruption, and then examines concrete implementations in OpenManus and Gemini CLI with code examples and multi‑layer safeguards.
In simple terms, an AI Agent operates as a large loop that repeatedly obtains context, calls an LLM, and invokes tools.
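That loop can be sketched in a few lines of Python; `build_context`, `call_llm`, and `run_tool` are illustrative placeholders, not any real framework's API:

```python
def build_context(history):
    # Illustrative: real systems assemble messages, memory, and tool specs.
    return history

def call_llm(context):
    # Stub standing in for a model call: asks to terminate after two results.
    if len(context) >= 3:
        return {"tool": "terminate", "args": "done"}
    return {"tool": "search", "args": "query"}

def run_tool(action):
    return f"result of {action['tool']}"

def run_agent(task, max_steps=30):
    """Minimal agent loop: obtain context, call the LLM, invoke a tool."""
    history = [task]
    for _ in range(max_steps):
        context = build_context(history)   # 1. obtain context
        action = call_llm(context)         # 2. call the LLM
        if action["tool"] == "terminate":  # explicit stop signal
            return action["args"]
        history.append(run_tool(action))   # 3. invoke the chosen tool
    return "stopped: max steps reached"
```

Everything in the rest of this article is about deciding when that `for` loop should end.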
Common Stop Strategies
AI Agent stop strategies fall into several categories:
1. Hard Limits
Maximum step count (e.g., 30 iterations)
Execution time limit (e.g., 5 minutes)
API call count limit (e.g., 100 calls)
API token usage limit
These limits are straightforward but can lead to poor user experience when tasks are cut off prematurely.
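A minimal sketch of enforcing all four limits together, assuming a budget object that the loop charges once per iteration (class name and numbers are illustrative):

```python
import time

class BudgetExceeded(Exception):
    pass

class Budget:
    """Tracks the hard limits listed above; all defaults are examples."""
    def __init__(self, max_steps=30, max_seconds=300,
                 max_calls=100, max_tokens=100_000):
        self.max_steps, self.max_seconds = max_steps, max_seconds
        self.max_calls, self.max_tokens = max_calls, max_tokens
        self.steps = self.calls = self.tokens = 0
        self.start = time.monotonic()

    def charge(self, calls=0, tokens=0):
        """Call once per loop iteration; raises when any limit is exceeded."""
        self.steps += 1
        self.calls += calls
        self.tokens += tokens
        if self.steps > self.max_steps:
            raise BudgetExceeded("max steps")
        if time.monotonic() - self.start > self.max_seconds:
            raise BudgetExceeded("time limit")
        if self.calls > self.max_calls:
            raise BudgetExceeded("API call limit")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded("token limit")
```

Raising an exception (rather than returning a flag) forces the agent loop to surface the termination reason to the user instead of silently truncating the task.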
2. Task Completion Detection
# Ask the LLM after each loop iteration
response = llm.ask("Is the task completed?")
if response == "yes":
    stop()
3. Explicit Stop Signal
tools = [
    "search",
    "calculate",
    "terminate",  # dedicated stop tool
]
When the agent calls the terminate tool, it stops. The prompt must teach the LLM when to invoke this tool.
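Wiring such a stop tool into the dispatch path can be sketched as follows; the registry and helper are hypothetical, not code from OpenManus or Gemini CLI:

```python
# Illustrative tool registry; both entries are placeholder implementations.
TOOLS = {
    "search": lambda query: f"results for {query}",
    "calculate": lambda expr: str(eval(expr)),  # demo only; never eval untrusted input
}

def handle_tool_call(name, args):
    """Returns (should_stop, observation). 'terminate' has no real tool body:
    it only flips the stop flag that the outer agent loop checks."""
    if name == "terminate":
        return True, f"terminated with status: {args.get('status', 'success')}"
    return False, TOOLS[name](**args)
```

The advantage over "ask the LLM if it's done" is that termination becomes an ordinary tool call, so it flows through the same function-calling machinery the model already uses.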
4. Loop Detection
Repeated calls to the same tool
Repeated action sequences (A→B→A→B…)
Highly similar output content
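The first two patterns reduce to "a short cycle repeats at the tail of the action history," which can be checked directly; repeated calls to the same tool are simply the period-1 case. An illustrative sketch, not code from either project:

```python
def has_repeating_pattern(actions, max_period=3, min_repeats=2):
    """True if the tail of `actions` repeats a short cycle, e.g. A,B,A,B,A,B."""
    for period in range(1, max_period + 1):
        # Require the cycle plus min_repeats extra full repetitions.
        needed = period * (min_repeats + 1)
        if len(actions) < needed:
            continue
        tail = actions[-needed:]
        cycle = tail[-period:]
        if all(tail[i] == cycle[i % period] for i in range(needed)):
            return True
    return False
```

Detecting "highly similar" (rather than identical) output needs fuzzier comparison, such as hashing fixed-size content chunks; Gemini CLI's content-loop detector below takes exactly that approach.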
5. Error Accumulation
if consecutive_errors > 3:
    stop("Too many consecutive failures")
6. User Interruption
Allow the user to abort the agent at any time.
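A common implementation is a cancellation flag checked once per iteration; async frameworks use task cancellation or an abort signal (as Gemini CLI does below) for the same purpose. A minimal threading-based sketch:

```python
import threading

# Set from a signal handler, a UI button, or another thread.
stop_requested = threading.Event()

def do_one_step():
    pass  # placeholder for one iteration: context -> LLM -> tool

def agent_loop(max_steps=100):
    for step in range(max_steps):
        if stop_requested.is_set():  # checked once per iteration
            return f"aborted by user at step {step}"
        do_one_step()
    return "finished"
```

The flag is only observed between steps, so an abort takes effect after the current tool call completes; truly preemptive cancellation requires cooperative checks inside long-running tools as well.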
OpenManus Stop Logic
OpenManus implements a multi‑layer protection scheme centered on a dedicated terminate tool.
Core: terminate tool
Each agent receives a terminate tool:
class Terminate(BaseTool):
    name: str = "terminate"
    description: str = "When the request is satisfied or cannot proceed, terminate the interaction."

    async def execute(self, status: str) -> str:
        return f"Interaction completed, status: {status}"
The prompt tells the LLM to call terminate once the task is done.
State‑machine management
class AgentState(Enum):
    IDLE = "idle"
    RUNNING = "running"
    FINISHED = "finished"
    ERROR = "error"
When the terminate tool is invoked, the state transitions to FINISHED and a log message is emitted.
Step limit
# ToolCallAgent: 30 steps; SWEAgent: 20 steps
while self.current_step < self.max_steps and self.state != AgentState.FINISHED:
    self.current_step += 1
    await self.step()
if self.current_step >= self.max_steps:
    results.append(f"Reached max steps ({self.max_steps})")
This acts as a safety net to prevent infinite execution.
Stuck detection
def is_stuck(self) -> bool:
    recent_messages = self.get_recent_assistant_messages()
    if len(set(recent_messages)) == 1:
        return True
    return False
Gemini CLI Stop Logic
Gemini CLI uses a more elaborate approach.
Sub‑agent stop logic
1. Max turns (MAX_TURNS)
if (this.runConfig.max_turns && turnCounter >= this.runConfig.max_turns) {
  this.output.terminate_reason = SubagentTerminateMode.MAX_TURNS;
  break;
}
2. Execution timeout
let durationMin = (Date.now() - startTime) / (1000 * 60);
if (durationMin >= this.runConfig.max_time_minutes) {
  this.output.terminate_reason = SubagentTerminateMode.TIMEOUT;
  break;
}
The timeout is checked both before and after each LLM call.
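The before-and-after pattern matters because a single generation can be slow enough to consume the entire remaining budget. A Python sketch of the same idea (not the actual Gemini CLI code):

```python
import time

class TimeBudgetExceeded(Exception):
    pass

def run_turn(call_llm, start, max_minutes):
    """Check the wall-clock budget on both sides of a potentially slow
    model call. Illustrative sketch of the pattern described above."""
    def check():
        if (time.monotonic() - start) / 60 >= max_minutes:
            raise TimeBudgetExceeded("execution time limit reached")
    check()               # before the call: don't start a doomed turn
    result = call_llm()
    check()               # after: a long generation may have eaten the budget
    return result
```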
3. User abort (AbortSignal)
if (abortController.signal.aborted) return;
4. Error exception
catch (error) {
  console.error('Error during subagent execution:', error);
  this.output.terminate_reason = SubagentTerminateMode.ERROR;
  throw error;
}
5. Goal completion
Two cases are handled:
No predefined output requirements – if the LLM stops calling tools, the agent finishes.
Predefined output variables – the system checks whether all declared variables have been emitted; if not, a nudge message is sent to the LLM to emit the missing values.
// Nudge example
if (remainingVars.length > 0) {
  const nudgeMessage = `You have stopped calling tools but have not emitted the following required variables: ${remainingVars.join(', ')}. Please use the 'self.emitvalue' tool now.`;
  currentMessages = [{ role: 'user', parts: [{ text: nudgeMessage }] }];
}
Three-layer loop detection
First layer: tool‑call repetition
private checkToolCallLoop(toolCall): boolean {
  const key = this.getToolCallKey(toolCall);
  if (this.lastToolCallKey === key) {
    this.toolCallRepetitionCount++;
  } else {
    this.lastToolCallKey = key;
    this.toolCallRepetitionCount = 1;
  }
  return this.toolCallRepetitionCount >= TOOL_CALL_LOOP_THRESHOLD; // 5
}
Second layer: content-loop ("chanting") detection
private checkContentLoop(content: string): boolean {
  if (this.inCodeBlock) return false; // code legitimately repeats; skip it
  this.streamContentHistory += content;
  this.truncateAndUpdate();
  return this.analyzeContentChunksForLoop();
}

private analyzeContentChunksForLoop(): boolean {
  // Slide a fixed-size window over the streamed text, hashing each chunk.
  while (this.hasMoreChunksToProcess()) {
    const currentChunk = this.streamContentHistory.substring(
      this.lastContentIndex,
      this.lastContentIndex + CONTENT_CHUNK_SIZE,
    );
    const chunkHash = createHash('sha256').update(currentChunk).digest('hex');
    if (this.isLoopDetectedForChunk(currentChunk, chunkHash)) return true;
    this.lastContentIndex++;
  }
  return false;
}

private isLoopDetectedForChunk(chunk: string, hash: string): boolean {
  const existing = this.contentStats.get(hash);
  if (!existing) {
    this.contentStats.set(hash, [this.lastContentIndex]);
    return false;
  }
  // Guard against hash collisions by comparing the actual text.
  if (!this.isActualContentMatch(chunk, existing[0])) return false;
  existing.push(this.lastContentIndex);
  if (existing.length < CONTENT_LOOP_THRESHOLD) return false; // need 10 repeats
  // Repeats must also be close together: a phrase recurring pages apart is fine,
  // the same phrase every few characters is chanting.
  const recent = existing.slice(-CONTENT_LOOP_THRESHOLD);
  const totalDist = recent[recent.length - 1] - recent[0];
  const avgDist = totalDist / (CONTENT_LOOP_THRESHOLD - 1);
  const maxDist = CONTENT_CHUNK_SIZE * 1.5; // 75 chars
  return avgDist <= maxDist;
}
Third layer: LLM-based loop detection
private async checkForLoopWithLLM(signal: AbortSignal) {
  const recentHistory = this.config
    .getGeminiClient()
    .getHistory()
    .slice(-LLM_LOOP_CHECK_HISTORY_COUNT);
  const trimmed = this.trimRecentHistory(recentHistory);
  const result = await this.config.getBaseLlmClient().generateJson({
    contents: [...trimmed, { role: 'user', parts: [{ text: taskPrompt }] }],
    schema: {
      type: 'object',
      properties: {
        reasoning: { type: 'string' },
        confidence: { type: 'number' },
      },
    },
    model: DEFAULT_GEMINI_FLASH_MODEL,
    systemInstruction: LOOP_DETECTION_SYSTEM_PROMPT,
  });
  if (result['confidence'] > 0.9) {
    console.warn(result['reasoning']);
    return true;
  }
  return false;
}

async turnStarted(signal: AbortSignal) {
  this.turnsInCurrentPrompt++;
  if (
    this.turnsInCurrentPrompt >= LLM_CHECK_AFTER_TURNS &&
    this.turnsInCurrentPrompt - this.lastCheckTurn >= this.llmCheckInterval
  ) {
    this.lastCheckTurn = this.turnsInCurrentPrompt;
    return await this.checkForLoopWithLLM(signal);
  }
}
The LLM check starts after 30 turns and runs every few turns (default interval 3), with the interval dynamically adjusted based on confidence.
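The adaptive interval can be sketched as a small scheduler. The 30-turn threshold and minimum interval of 3 come from the description above; the maximum interval and the linear confidence-to-interval mapping are illustrative assumptions, not Gemini CLI's exact formula:

```python
LLM_CHECK_AFTER_TURNS = 30  # don't spend expensive checks early in a run
MIN_INTERVAL = 3            # default interval between checks
MAX_INTERVAL = 15           # assumed upper bound for this sketch

class LoopCheckScheduler:
    """Illustrative: check more often when the detector reports high loop
    confidence, less often when confidence is low."""
    def __init__(self):
        self.last_check_turn = 0
        self.interval = MIN_INTERVAL

    def should_check(self, turn):
        return (turn >= LLM_CHECK_AFTER_TURNS
                and turn - self.last_check_turn >= self.interval)

    def record_result(self, turn, confidence):
        self.last_check_turn = turn
        # Map confidence in [0, 1] to an interval in [MIN, MAX]:
        # 1.0 -> check every MIN_INTERVAL turns, 0.0 -> every MAX_INTERVAL.
        self.interval = round(
            MAX_INTERVAL - confidence * (MAX_INTERVAL - MIN_INTERVAL))
```

This keeps the expensive LLM-based detector cheap in the common case while reacting quickly once a loop starts to look likely.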
Loop type enumeration
enum LoopType {
  CONSECUTIVE_IDENTICAL_TOOL_CALLS, // repeated tool calls
  CHANTING_IDENTICAL_SENTENCES,     // repeated output
  LLM_DETECTED_LOOP                 // LLM-detected logical loop
}
Combining these mechanisms provides a robust multi-layer defense against infinite execution while still allowing the agent to finish its intended work.
Conclusion
Stopping strategies are a critical yet often overlooked aspect of AI Agent design. A single mechanism is rarely sufficient; practical systems blend hard limits, LLM‑driven completion checks, explicit termination tools, loop detection, error thresholds, and user abort signals to achieve a balance between task fulfillment and resource safety.
OpenManus opts for a simple design: a terminate tool guided by the LLM, a state machine, and step limits as a fallback. Gemini CLI adopts a more sophisticated scheme with a declarative output system, Nudge reminders, and three layers of loop detection (tool‑call repetition, content‑hash analysis, and LLM‑based reasoning).
In practice, engineers should layer multiple stop conditions, provide clear feedback for each termination reason, and ensure reliable resource cleanup.
Architecture and Beyond
Focused on AIGC SaaS technical architecture and tech team management, sharing insights on architecture, development efficiency, team leadership, startup technology choices, large‑scale website design, and high‑performance, highly‑available, scalable solutions.