Tagged articles

MCTS

7 articles

May 23, 2026 · Artificial Intelligence

Why Can’t LLMs Directly Copy AlphaGo’s MCTS Success?

The article analyzes why large language models cannot simply adopt AlphaGo’s Monte‑Carlo Tree Search, highlighting credit‑assignment difficulties, gradient‑variance explosion in multi‑step RL, and how AlphaGo’s tight integration of value and policy networks amortizes search in a way LLMs cannot replicate.

AlphaGo · Credit Assignment · LLM

0 likes · 6 min read

Baobao Algorithm Notes

Oct 11, 2024 · Artificial Intelligence

How Does OpenAI’s o1 Achieve Self‑Correction? A Deep Dive into MCTS and SCoRe

This article examines the self‑correction capability of OpenAI’s o1 model, linking it to test‑time scaling, MCTS‑style reasoning, and DeepMind’s SCoRe reinforcement‑learning framework. It walks through step‑by‑step demos, hypothesizes about internal judgment mechanisms, and proposes training pipelines that combine self‑generated data with post‑training RL.

LLM reasoning · MCTS · OpenAI

0 likes · 12 min read

Baobao Algorithm Notes

Oct 10, 2024 · Artificial Intelligence

How MCTS Powers Inference in OpenAI’s o1: A Deep Dive with rStar

This article explains how the inference component of OpenAI’s o1 model can be implemented using Monte‑Carlo Tree Search, detailing the action space, rollout process, UCT scoring, and best‑path selection, with a concrete walkthrough of Microsoft’s open‑source rStar code.

Large Language Models · MCTS · OpenAI o1

0 likes · 26 min read

Model Perspective

Jul 31, 2024 · Artificial Intelligence

How Monte Carlo Tree Search Powers AlphaGo and Beyond: A Deep Dive

Monte Carlo Tree Search (MCTS) is a statistical heuristic search algorithm that builds a search tree through repeated selection, expansion, simulation, and backpropagation. It enabled breakthroughs such as AlphaGo’s victory and has found applications in robotics, autonomous driving, finance, and bioinformatics.

AI Applications · AlphaGo · MCTS

0 likes · 7 min read

Baobao Algorithm Notes

Jul 9, 2024 · Artificial Intelligence

Why Step-Level DPO Is Revolutionizing LLM Math Reasoning

This article reviews recent step‑level DPO research, compares it with instance‑level DPO, explains the underlying Monte Carlo Tree Search formulation, and presents the author’s own replication experiments that demonstrate consistent performance gains across multiple LLM sizes on GSM8K and MATH benchmarks.

AI research · LLM alignment · MCTS

0 likes · 10 min read

Tencent Cloud Developer

Jun 27, 2018 · Artificial Intelligence

Search and Optimization Algorithms in Game AI

Game AI relies on a variety of search techniques, ranging from uninformed breadth‑first and depth‑first methods to heuristic‑driven A*, minimax with alpha‑beta pruning, and Monte Carlo Tree Search. It also draws on optimization approaches such as hill climbing, simulated annealing, genetic algorithms, evolution strategies, multi‑objective evolutionary algorithms, and neuroevolutionary methods like NEAT to generate intelligent, balanced, and adaptable game behavior.

A* algorithm · MCTS · MiniMax

0 likes · 20 min read

Architect

Mar 10, 2016 · Artificial Intelligence

Monte Carlo Tree Search (MCTS): Principles, Algorithms, Advantages, and Applications

This article explains Monte Carlo Tree Search (MCTS), covering its role in AlphaGo, fundamental algorithm steps, node‑selection strategies such as UCB, strengths and weaknesses, enhancements, historical background, and recent research developments in artificial intelligence.

Artificial Intelligence · MCTS · Monte Carlo Tree Search

0 likes · 12 min read