PaperAgent
Jan 20, 2026 · Artificial Intelligence

How Intrinsic Self‑Critique Boosts LLM Planning Accuracy to 89%

Google DeepMind's new "Intrinsic Self‑Critique" method lets large language models iteratively self‑evaluate and rewrite their plans, raising Blocksworld planning accuracy from 49.8% to 89.3% and setting new records across multiple planning benchmarks.

AI research · LLM · Planning
5 min read
Architect
Feb 20, 2025 · Artificial Intelligence

Why Long CoT and In‑Context RL Are the Next Frontier for LLMs

The article analyses recent breakthroughs such as OpenAI's o1, Long CoT, and test‑time search, arguing that enabling LLMs to perform self‑critique and reinforcement learning with long output sequences is essential for future AI performance, while warning against overly structured workflows.

AI research · In‑Context RL · LLM
12 min read