DataFun Decision Intelligence Summit – Reinforcement Learning Forum Overview
The DataFun Decision Intelligence Summit brings together researchers and industry practitioners to present recent reinforcement learning algorithms, safety considerations, and distributional methods, along with real‑world applications such as vehicle routing, recommender systems, and power‑grid scheduling. The talks also cover future directions and concrete takeaways for the audience.
Reinforcement learning (RL) is a machine‑learning paradigm where agents learn optimal behaviors through interaction with environments, yet it still faces challenges in scalability, reward design, safety, robustness, and generalization.
Future RL research aims to explore higher‑level agent designs such as modular, hierarchical, and meta‑learning approaches, while expanding into domains like the internet, education, entertainment, and energy.
The DataFun Decision Intelligence Summit’s RL Forum features five keynote talks that showcase frontier RL algorithms and their deployment in areas such as operations optimization, recommender systems, power‑grid scheduling, distributional RL, and safety‑aware RL.
Speaker: Qin Zhiwei (Lyft, Principal Scientist) – Ph.D. in Operations Research from Columbia University; has worked at Lyft and Didi and in AI research, focusing on RL for intelligent transportation, supply‑chain optimization, and online marketing.
Speaker: Xiaocheng Tang (Meta AI, Research Scientist) – Former senior staff researcher at Didi AI Labs; presents “GreedRL – DRL Solver”, covering the solver's development history, application scenarios, architecture, and solutions for large‑scale vehicle‑routing, online allocation, and 3D packing problems. Attendees will learn how to solve large‑scale vehicle‑routing problems (VRP) in real time, apply deep reinforcement learning (DRL) to online allocation, and use DRL for intelligent decision making.
Speaker: Zhu Zhiqing (Meta AI, Applied RL Lead) – Leads Meta’s applied RL team; presents “Reinforcement Learning for Recommender Systems”. The talk discusses integrating RL with auction‑based recommendation, framing exploration as a bandit problem, and leveraging deep exploration for sequential decision making. Attendees will learn deployment strategies, how to scale exploration, and how to incorporate deep exploration into recommender pipelines.
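To make "framing exploration as a bandit problem" concrete, here is a minimal sketch that uses UCB1, a standard bandit algorithm, to choose among a few items with unknown click rates. It is an illustration only, not the method from the talk; the function name `ucb1`, the simulated click rates, and the round count are all assumptions made for the example.

```python
import math
import random


def ucb1(click_rates, rounds, seed=0):
    """Run UCB1 on a simulated Bernoulli bandit and return pulls per arm.

    click_rates are hypothetical per-item click probabilities, used only
    to simulate feedback; the algorithm itself never reads them directly.
    """
    rng = random.Random(seed)
    n_arms = len(click_rates)
    pulls = [0] * n_arms
    reward_sum = [0.0] * n_arms

    for t in range(1, rounds + 1):
        if t <= n_arms:
            # Show each item once so every estimate is defined.
            arm = t - 1
        else:
            # Pick the item with the highest optimistic estimate:
            # empirical mean plus an exploration bonus that shrinks
            # as the item is shown more often.
            arm = max(
                range(n_arms),
                key=lambda a: reward_sum[a] / pulls[a]
                + math.sqrt(2.0 * math.log(t) / pulls[a]),
            )
        reward = 1.0 if rng.random() < click_rates[arm] else 0.0
        pulls[arm] += 1
        reward_sum[arm] += reward
    return pulls


pulls = ucb1([0.1, 0.2, 0.6], rounds=5000)
print(pulls)  # the third item should receive most of the impressions
```

The exploration bonus is what separates this from a purely greedy recommender: rarely shown items keep a large bonus, so they still get occasional impressions until the evidence against them is strong.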
Speaker: Yang Chao (Alibaba DAMO Academy, Senior Algorithm Expert) – Expert in deep learning, RL, and numerical computing; presents “Safe Reinforcement Learning and Its Application to Power‑Grid Scheduling”. The session introduces SafeRL concepts, methods, and their use in power‑grid dispatch, along with a dual‑decision engine platform. Audience gains insight into safety‑aware RL and its practical deployment in energy systems.
Speaker: Zhou Fan (Shanghai University of Finance and Economics, Associate Professor) – Researches RL, deep learning, and causal inference; presents “Recent Advances in Distributional Reinforcement Learning”. The talk examines the quantile crossing problem and discusses ways to improve exploration efficiency and reduce variance in Q‑function estimation. Participants will come away with an understanding of the main challenges in distributional RL and approaches to solving them.
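Quantile crossing, mentioned in the abstract above, occurs when a model estimates several quantiles of the return distribution independently and a higher quantile level ends up with a lower value than a lower one, which no valid distribution allows. The sketch below detects crossing and shows one generic remedy: keep the first output as the lowest quantile and add strictly positive increments on top of it. This is an assumed illustration, not necessarily the approach presented in the talk; `raw_head` and the function names are hypothetical.

```python
import math


def has_crossing(quantiles):
    """Return True if any quantile estimate is below its predecessor."""
    return any(b < a for a, b in zip(quantiles, quantiles[1:]))


def softplus(x):
    """Numerically stable log(1 + exp(x)); the result is never negative."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def monotone_quantiles(raw):
    """Map unconstrained outputs to non-decreasing quantile estimates.

    raw[0] is used as the lowest quantile; every later output is passed
    through softplus and added as an increment, so the sequence cannot
    decrease.
    """
    out = [raw[0]]
    for r in raw[1:]:
        out.append(out[-1] + softplus(r))
    return out


# Hypothetical network outputs for quantile levels 0.2, 0.4, 0.6, 0.8.
raw_head = [1.0, 0.4, 1.3, 0.9]

print(has_crossing(raw_head))                      # True
print(has_crossing(monotone_quantiles(raw_head)))  # False
```

Building monotonicity into the parameterization changes what the raw outputs mean (increments rather than quantile values), so the training loss has to be applied to the transformed values rather than to the raw outputs.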
Speaker: Yuyan Wang (Stanford Graduate School of Business, Assistant Professor of Marketing) – Works on long‑term user experience in recommender systems; presents “Surrogate for Long‑Term User Experience in Recommender Systems”. The talk proposes surrogate short‑term signals predictive of long‑term visits, validates them via large‑scale experiments, and demonstrates RL‑based improvements. Attendees will learn why short‑term metrics can be misleading and how to design effective surrogates for long‑term outcomes.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
