Tagged articles
9 articles
Machine Heart
Apr 10, 2026 · Artificial Intelligence

AdaGen: Enabling Adaptive, Data‑Driven Strategies for Image Generation Models

AdaGen replaces the handcrafted static schedules in multi‑step image generators with a universal, learnable policy network trained via reinforcement learning. It combines an MDP formulation, adversarial rewards, and action smoothing, and reports consistent quality and efficiency gains across diffusion, autoregressive, masked, and flow models while adding negligible overhead.

MDP · action smoothing · adaptive policy
11 min read
Data Party THU
Oct 22, 2025 · Artificial Intelligence

Demystifying Large‑Model Reinforcement Learning: From MDP Basics to Bellman and Advantage Functions

This article provides a comprehensive introduction to reinforcement learning for large language models, covering the Markov Decision Process formulation, the four core elements of RL, state‑value and action‑value functions, Bellman equations, and the advantage function that underpins modern policy‑gradient algorithms.

AI fundamentals · Bellman equation · MDP
13 min read
Alimama Tech
Sep 7, 2022 · Artificial Intelligence

Curriculum-Guided Bayesian Reinforcement Learning for ROI-Constrained Real-Time Bidding

The paper presents a Curriculum‑Guided Bayesian Reinforcement Learning (CBRL) framework that models ROI‑constrained real‑time bidding as a partially observable constrained MDP, using hard‑margin indicator rewards and a curriculum of relaxed proxy problems to achieve fast, constraint‑satisfying, Bayes‑optimal policies that outperform existing methods on large‑scale industrial data.

Bayesian RL · MDP · ROI constraint
15 min read
DaTaobao Tech
Aug 18, 2022 · Artificial Intelligence

Introduction to Deep Reinforcement Learning: Theory, Algorithms, and Applications

This article introduces deep reinforcement learning by explaining its Markov decision process foundations, then categorizes the main algorithm families (value‑based methods like DQN, policy‑based approaches such as PG and DPG, and actor‑critic techniques including A3C, PPO, and DDPG), detailing their architectures, training procedures, and key advantages.

DQN · MDP · actor-critic
14 min read
GuanYuan Data Tech Team
Jul 28, 2022 · Artificial Intelligence

Unlocking Reinforcement Learning: Core Concepts, Algorithms, and Real‑World Applications

This article introduces reinforcement learning by defining agents, environments, rewards, and policies, explains key concepts such as Markov Decision Processes and Bellman equations, and surveys major algorithms—including dynamic programming, Monte‑Carlo, TD learning, policy gradients, Q‑learning, DQN, and evolution strategies—while highlighting practical challenges and notable case studies like AlphaGo Zero.

Deep Learning · Evolution Strategies · MDP
27 min read
360 Quality & Efficiency
Feb 14, 2020 · Artificial Intelligence

Applying Reinforcement Learning to UI Traversal for Automated Testing

The article explores how reinforcement learning can be used to create a test robot that performs UI traversal, discussing the challenges of full automation, defining the MDP components, feature extraction methods, reward design, and suitable RL algorithms to improve testing coverage and efficiency.

Automated Testing · MDP · Software Testing
8 min read
Hulu Beijing
Dec 5, 2017 · Artificial Intelligence

What Is Reinforcement Learning? Core Concepts Explained

This article introduces the fundamental concepts of reinforcement learning, describing its origins, key components such as agents, environments, states, actions, and rewards, explaining the Markov decision process framework, and highlighting common algorithms like Q‑learning, policy gradients, and actor‑critic methods.

AI · Algorithms · MDP
4 min read
Alibaba Cloud Developer
Feb 16, 2017 · Artificial Intelligence

How Reinforcement Learning Transforms E‑Commerce Search and Recommendation at Scale

This article explores how Alibaba's Taobao applies reinforcement learning, Markov decision processes, and reward shaping to improve large‑scale product search ranking and recommendation. It covers problem modeling, algorithm designs such as tabular Q‑learning and DDPG, experimental results, and advanced recommendation models like GBDT‑FTRL and Wide & Deep.

Deep Learning · MDP · Recommendation Systems
21 min read