Tagged articles
9 articles
DeepHub IMBA
May 8, 2026 · Artificial Intelligence

Building a Custom 8×8 GridWorld with Q‑Learning in Gymnasium

This tutorial walks through creating a custom 8×8 GridWorld environment in Gymnasium, implementing a Q‑Learning agent that learns to navigate from the top‑left corner to the bottom‑right goal while avoiding walls, and visualizing training curves, learned policies, and a performance comparison with a random agent.

GridWorld · Gymnasium · Python
0 likes · 10 min read
Didi Tech
Aug 28, 2025 · Artificial Intelligence

Why Temporal Difference Beats Monte Carlo: Mastering the Bellman Equation

Explore how the Bellman equation underpins reinforcement learning, comparing Dynamic Programming, Monte Carlo, and Temporal‑Difference methods, and discover why TD’s low‑variance, online updates make it a powerful bridge between model‑based planning and sample‑based learning.

Bellman equation · Monte Carlo · Q-Learning
0 likes · 21 min read
AI Algorithm Path
May 24, 2025 · Artificial Intelligence

How N-step Temporal-Difference Methods Extend TD Learning in Reinforcement AI

This tutorial explains how n-step temporal‑difference (TD) algorithms generalize the one‑step TD and Monte‑Carlo methods, presents the n‑step return update rule, walks through a three‑step TD example, shows how Sarsa and Q‑learning can be extended, and discusses how to choose the optimal n value for a given problem.

Monte Carlo · Q-Learning · algorithm analysis
0 likes · 9 min read
AI Algorithm Path
May 23, 2025 · Artificial Intelligence

Understanding Temporal‑Difference Algorithms in Reinforcement Learning

This tutorial explains temporal‑difference (TD) learning, compares it with dynamic programming and Monte‑Carlo methods, walks through concrete soccer‑match examples, contrasts one‑step TD with constant‑α Monte‑Carlo updates, discusses convergence and bias, and introduces popular TD variants such as Sarsa, Q‑learning, Expected Sarsa, and double learning.

Monte Carlo · Q-Learning · TD learning
0 likes · 18 min read
Model Perspective
Dec 28, 2022 · Artificial Intelligence

What Is Reinforcement Learning? Core Concepts and Key Algorithms Explained

This article introduces reinforcement learning, compares it with supervised and unsupervised learning, explains its components and Markov Decision Processes, and reviews fundamental model‑free and model‑based algorithms, including Q‑Learning, SARSA, and TD learning, as well as exploration strategies.

Markov Decision Process · Q-Learning · sarsa
0 likes · 16 min read
GuanYuan Data Tech Team
Jul 28, 2022 · Artificial Intelligence

Unlocking Reinforcement Learning: Core Concepts, Algorithms, and Real‑World Applications

This article introduces reinforcement learning by defining agents, environments, rewards, and policies, explains key concepts such as Markov Decision Processes and Bellman equations, and surveys major algorithms—including dynamic programming, Monte‑Carlo, TD learning, policy gradients, Q‑learning, DQN, and evolution strategies—while highlighting practical challenges and notable case studies like AlphaGo Zero.

Deep Learning · Evolution Strategies · MDP
0 likes · 27 min read
DataFunTalk
Nov 12, 2020 · Artificial Intelligence

Reinforcement Learning for Recommendation System Mixing: Concepts, Practice, and Evaluation

This article explains how reinforcement learning, with its focus on maximizing long‑term reward, can improve recommendation system mixing by covering basic RL concepts, differences from supervised learning, multi‑armed bandit approaches, practical OpenAI Gym experiments, new AUC metrics, online gains, and advanced model optimizations.

OpenAI Gym · Q-Learning · Recommendation Systems
0 likes · 10 min read
Aotu Lab
Jul 22, 2020 · Frontend Development

How Q‑Learning Can Power Smart UI Testing and Scalable Pop‑ups with Puppeteer

This article explains how reinforcement learning (specifically Q‑learning) can generate mock interface data for regression testing, how Puppeteer automates UI interactions, and how a DSL‑plus‑runtime approach enables scalable pop‑up components, reducing testing costs in complex e‑commerce interactions.

Automation · Puppeteer · Q-Learning
0 likes · 8 min read
Hulu Beijing
Dec 6, 2017 · Artificial Intelligence

How Deep Reinforcement Learning Powers Video Game AI: From Q‑Learning to Atari Mastery

This article explains how deep reinforcement learning, built upon traditional Q‑learning and enhanced with techniques like experience replay, enables agents to play Atari video games directly from raw pixel inputs, illustrating the key differences, processing steps, and the significance of this breakthrough in AI.

Atari · Q-Learning · deep Q‑learning
0 likes · 5 min read