Tagged articles

self‑critique

3 articles · Page 1 of 1

Jun 6, 2026 · Artificial Intelligence

When a Tool Call Ends: Observation Layer as the Quality‑Check Station for AI Agents

The article explains why AI agents often produce unreliable answers after a tool call finishes, identifies the missing observation layer as the quality‑check station, and details how to format, classify, grade errors, perform self‑critique, enforce a Definition of Done, and prevent loops or premature termination.

AI agents · Error Handling · definition of done

22 min read

PaperAgent

Jan 20, 2026 · Artificial Intelligence

How Intrinsic Self‑Critique Boosts LLM Planning Accuracy to 89%

Google DeepMind's new "Intrinsic Self‑Critique" method lets large language models iteratively self‑evaluate and rewrite their plans, raising Blocksworld planning accuracy from 49.8% to 89.3% and setting new records across multiple planning benchmarks.

AI research · LLM · intrinsic evaluation

5 min read

Architect

Feb 20, 2025 · Artificial Intelligence

Why Long CoT and In‑Context RL Are the Next Frontier for LLMs

The article analyses recent breakthroughs such as OpenAI's o1, Long CoT, and test‑time search, arguing that enabling LLMs to perform self‑critique and reinforcement learning with long output sequences is essential for future AI performance, while warning against overly structured workflows.

AI research · In‑Context RL · LLM

12 min read

