Tagged articles

Value Function

3 articles · Page 1 of 1

Oct 22, 2025 · Artificial Intelligence

Demystifying Large‑Model Reinforcement Learning: From MDP Basics to Bellman and Advantage Functions

This article provides a comprehensive introduction to reinforcement learning for large language models, covering the Markov Decision Process formulation, the four core elements of RL, state‑value and action‑value functions, Bellman equations, and the advantage function that underpins modern policy‑gradient algorithms.

AI fundamentals · Bellman equation · Large Language Model

13 min read

AI Algorithm Path

May 21, 2025 · Artificial Intelligence

Understanding Monte Carlo Algorithms for Reinforcement Learning with a Blackjack Case Study

This article explains Monte Carlo methods for reinforcement learning, compares model‑free and model‑based approaches, details V‑ and Q‑function estimation using a Blackjack example, and discusses exploration‑exploitation trade‑offs and practical advantages of MC algorithms.

Blackjack · Model-free · Monte Carlo

13 min read

AI Algorithm Path

May 18, 2025 · Artificial Intelligence

Reinforcement Learning Tutorial Part 1: Core Concepts Explained

This article introduces the fundamental concepts of reinforcement learning, covering the agent‑environment interaction, key terminology, reward structures, task types, policies, value functions, the Bellman equations, and how optimal strategies are derived and approximated in practice.

Bellman equation · Markov Decision Process · Optimal Policy

13 min read
