Tagged articles

OpenAI Gym

5 articles

Nov 23, 2025 · Artificial Intelligence

Can a Drone Learn to Land Itself? A Deep Reinforcement Learning Walkthrough

This article covers the fundamentals of reinforcement learning, builds a custom drone‑landing simulation, defines the state and action spaces, and designs the reward functions. It then implements a neural‑network policy with Bernoulli action sampling and trains it using REINFORCE with a baseline, pointing out common pitfalls such as reward hacking along the way.

OpenAI Gym · Python · drone landing

22 min read

Code DAO

Dec 3, 2021 · Artificial Intelligence

Understanding Actor‑Critic and A2C: From Policy Gradients to REINFORCE in RL

This article derives the policy‑gradient objective for discrete actions, implements the Monte‑Carlo REINFORCE algorithm in PyTorch, explains the actor‑critic framework, introduces Advantage Actor‑Critic (A2C) versus A3C, and demonstrates their performance on the OpenAI Gym CartPole‑v0 environment.

A2C · OpenAI Gym · Python

13 min read

Code DAO

Nov 28, 2021 · Artificial Intelligence

Adapting Soft Actor‑Critic for Discrete Action Spaces in Deep Reinforcement Learning

This article explains how to modify the Soft Actor‑Critic (SAC) algorithm—originally designed for continuous actions—to work with discrete action environments, presents the required changes to the actor and critic loss functions, provides a full PyTorch implementation, and evaluates the method on the CartPole‑v1 benchmark.

CartPole · Discrete Actions · Entropy Regularization

20 min read

DataFunTalk

Nov 12, 2020 · Artificial Intelligence

Reinforcement Learning for Recommendation System Mixing: Concepts, Practice, and Evaluation

This article explains how reinforcement learning, which optimizes for long‑term reward, can improve recommendation system mixing. It covers basic RL concepts, how RL differs from supervised learning, multi‑armed bandit approaches, practical OpenAI Gym experiments, new AUC metrics, online gains, and advanced model optimizations.

Artificial Intelligence · OpenAI Gym · Q-Learning

10 min read

Alibaba Cloud Developer

Mar 22, 2017 · Artificial Intelligence

Unlocking StarCraft AI Research with Gym StarCraft: A Python-Friendly RL Platform

StarCraft, a classic real‑time strategy game, has become a premier testbed for deep reinforcement learning and AI research, and Alibaba’s open‑source Gym StarCraft platform now bridges Python, TensorFlow, Keras and OpenAI Gym to simplify multi‑agent, macro‑strategy development and fair benchmarking.

Alibaba · OpenAI Gym · Python

3 min read
