Tagged articles

reinforcement learning

743 articles · Page 8 of 8
Didi Tech
Sep 13, 2019 · Artificial Intelligence

Understanding Didi's Ride‑Hailing Dispatch Algorithms: Challenges and Strategies

Didi’s ride‑hailing dispatch system has progressed from a simple greedy, first‑come‑first‑served matcher to sophisticated batch, chain, and predictive algorithms that use deep‑learning demand forecasts and reinforcement‑learning optimization to assign drivers under complex business rules, boosting response rates and serving over 30 million daily requests.

AI · Optimization · Ride Hailing
0 likes · 17 min read
Alibaba Cloud Developer
Aug 28, 2019 · Artificial Intelligence

Exact‑K Recommendation: Graph Attention Networks and RL from Demonstrations Explained

This article introduces the Exact‑K recommendation problem, highlights its differences from traditional Top‑K approaches, and presents a novel solution combining Graph Attention Networks (GAttN) with Reinforcement Learning from Demonstrations (RLfD), backed by extensive experiments showing superior performance on real-world datasets.

exact-k · graph attention networks · machine learning
0 likes · 14 min read
Tencent Cloud Developer
Aug 14, 2019 · Artificial Intelligence

From Atari to AI: The Evolution of Video Games and Artificial Intelligence

From Steve Jobs’s early work at Atari to modern DeepMind breakthroughs, the article traces how video games have grown into a multibillion‑dollar industry that serves as a testbed for AI research, while highlighting current AI techniques for smarter agents, procedural content generation, and the collaborative challenges shaping the future of game development.

Game Development · Monte Carlo Tree Search · artificial-intelligence
0 likes · 25 min read
DataFunTalk
Jul 31, 2019 · Artificial Intelligence

Key Characteristics and Practical Improvements of Recommendation Technologies

This article discusses the fundamental traits of recommendation technologies, compares UserCF and ItemCF models, explains matrix factorization and FM, explores negative sampling, CTR/CVR modeling, ensemble methods, and practical considerations such as reinforcement learning and exploration strategies for improving recommendation performance in real-world systems.

matrix factorization · reinforcement learning
0 likes · 11 min read
AntTech
Jul 21, 2019 · Artificial Intelligence

Alipay’s SIGIR 2019 Papers: Reinforcement Learning for User Intent Prediction and Unsupervised QUEST for Complex Question Answering

At SIGIR 2019 in Paris, Alipay presented two AI research papers—one applying reinforcement learning to predict user intent in customer‑service bots and another introducing the unsupervised QUEST method that builds noisy quasi‑knowledge graphs for answering complex multi‑document questions.

AI · Information Retrieval · Knowledge Graph
0 likes · 5 min read
iQIYI Technical Product Team
Jul 12, 2019 · Artificial Intelligence

Real-Time Evaluation System for Adaptive Bitrate (ABR) Algorithms and Controlled Bitrate Distribution

RESA is a real‑time evaluation platform that continuously tests multiple Adaptive Bitrate (ABR) algorithms on live user traffic, introduces a multi‑user QoE metric derived from viewing behavior, reveals trade‑offs between clarity and bandwidth, and proposes the RL‑based ABSbc algorithm to steer bitrate distribution and balance user experience with network cost.

ABR · Bandwidth Control · Evaluation
0 likes · 23 min read
Alibaba Cloud Developer
Jun 27, 2019 · Artificial Intelligence

Generating Personalized E‑commerce Review Replies with Product Information

This paper presents a sequence‑to‑sequence model that fuses product‑detail tables with customer comments, using gated multimodal attention, copy mechanisms and reinforcement learning to automatically produce high‑quality, context‑aware replies for e‑commerce platforms, and validates the approach with extensive experiments on a large Taobao dataset.

Sequence-to-Sequence · copy mechanism · e‑commerce
0 likes · 21 min read
Ctrip Technology
Jun 19, 2019 · Artificial Intelligence

Applying Reinforcement Learning to Hotel Ranking at Ctrip: Challenges, Solutions, and Preliminary Results

This article examines the limitations of traditional learning‑to‑rank for Ctrip hotel sorting, introduces reinforcement learning as a remedy, outlines three progressive implementation plans (A, B, C) with algorithm choices and engineering trade‑offs, and presents early experimental findings that demonstrate RL's potential to improve conversion rates.

Ctrip · RL · Ranking
0 likes · 15 min read
AntTech
Jun 10, 2019 · Artificial Intelligence

Generative Adversarial User Model for Reinforcement Learning‑Based Recommendation Systems

This article presents a model‑based reinforcement learning framework for recommendation systems that uses a generative adversarial user model to simultaneously learn user behavior dynamics and reward functions, enabling efficient Cascading‑DQN policy learning and achieving superior long‑term user rewards and click‑through rates in experiments.

Cascading DQN · Generative Adversarial Networks · User Modeling
0 likes · 9 min read
Alibaba Cloud Developer
Apr 1, 2019 · Fundamentals

Must-Read Technical Books Recommended by Alibaba Experts

Alibaba’s senior engineers share their curated list of essential technical books—from software testing and design patterns to AI, machine learning, reinforcement learning, Rust programming, and database architecture—explaining why each title is valuable for developers seeking deeper knowledge and practical insights.

AI · database systems · design patterns
0 likes · 9 min read
DataFunTalk
Mar 8, 2019 · Artificial Intelligence

Alibaba's Intelligent Service Bot (Ali Xiaomì): Platform Overview, Intent Recognition, Machine Reading Comprehension, Multi‑turn Recommendation, and Transfer Learning

The article presents an in‑depth overview of Alibaba's intelligent service bot Ali Xiaomì, covering its platform evolution, core NLP techniques such as intent recognition and machine reading comprehension, multi‑turn recommendation strategies, transfer‑learning approaches across domains and languages, and future technical challenges.

AI · machine reading comprehension · natural language processing
0 likes · 11 min read
Tencent Cloud Developer
Jan 17, 2019 · Artificial Intelligence

Deep Learning for Big Data Recommendation Systems: Tencent's Industrial Practice

Tencent’s industrial practice shows how a large‑scale offline‑nearline‑online “Shield” recommendation architecture, powered by the DeepR framework built on RCaffe, uses deep semantic embeddings, massive neural networks and reinforcement‑learning decisions to handle billions of daily requests, demonstrating that data richness and engineering capability, not model depth alone, drive performance in big‑data recommendation systems.

Big Data · Neural Network · RCaffe
0 likes · 13 min read
Alibaba Cloud Developer
Jan 15, 2019 · Artificial Intelligence

How Alibaba Engineers Boost SEO with Reinforcement Learning and Attention Models

This article details Alibaba.com engineers' application of reinforcement learning, attention mechanisms, and weakly supervised techniques to extract product summaries, improve content quality, and significantly raise SEO rankings, supported by offline experiments, online A/B testing, and future research directions.

Alibaba · SEO · attention model
0 likes · 16 min read
DataFunTalk
Jan 9, 2019 · Artificial Intelligence

Reinforcement Learning in Natural Language Processing: Concepts, Challenges, and Applications

This article introduces reinforcement learning fundamentals, contrasts it with supervised learning, and explores its challenges and advantages in natural language processing, including applications such as text classification, relation extraction from noisy data, and weakly supervised topic segmentation, while summarizing key insights and experimental results.

Text Classification · Weak Supervision · natural language processing
0 likes · 11 min read
Alibaba Cloud Developer
Nov 20, 2018 · Artificial Intelligence

How Reinforcement Learning Powers Interactive Search in E‑Commerce

This article explains how reinforcement learning can be modeled and deployed to enable intelligent, interactive product search on e‑commerce platforms, detailing problem definition, system architecture, training methodology, online results, and future research directions.

deep learning · dialogue system · e-commerce
0 likes · 17 min read
iQIYI Technical Product Team
Nov 16, 2018 · Artificial Intelligence

How Reinforcement Learning Transforms Adaptive Bitrate Streaming

This article explains the principles of adaptive bitrate streaming, compares traditional ABR algorithms with a reinforcement‑learning‑based approach, describes its system architecture and training process, and presents QoS evaluation results that show RL‑driven streaming can improve video quality and smoothness.

ABR algorithms · AI · QoS evaluation
0 likes · 8 min read
Alibaba Cloud Developer
Nov 16, 2018 · Artificial Intelligence

How Alibaba’s Search Engine Evolved Over a Decade of Double‑11: From Offline Models to Real‑Time AI

This article traces the ten‑year evolution of Alibaba’s e‑commerce search system, detailing four major stages—from the early Pora streaming engine to dual‑link real‑time architectures, the integration of deep and reinforcement learning, and the shift to large‑scale online deep learning—while highlighting the technical drivers and future AI‑enabled search vision.

Search · e-commerce · machine learning
0 likes · 16 min read
Meituan Technology Team
Nov 15, 2018 · Artificial Intelligence

Reinforcement Learning for Meituan's "Guess You Like" Recommendation Ranking

Meituan enhanced its homepage “Guess You Like” recommendation slot by modeling user‑item interactions as a Markov Decision Process and applying an improved DDPG reinforcement‑learning agent that adjusts the ranking trade‑off parameter, uses advantage‑based Q decomposition, shares actor‑critic weights, and runs in a real‑time TensorFlow pipeline, delivering consistent lifts in click‑through, dwell time, and depth.

DDPG · MDP Modeling · TensorFlow
0 likes · 21 min read
Tencent Cloud Developer
Oct 18, 2018 · Artificial Intelligence

10 Machine Learning Algorithms You Should Know to Become a Data Scientist

This article outlines the essential role of a data scientist and introduces ten fundamental machine‑learning algorithms—including PCA/SVD, OLS and polynomial regression, regularized linear models, K‑Means, logistic regression, SVM, feed‑forward, convolutional and recurrent neural networks, CRFs, ensemble trees, and reinforcement‑learning methods—while linking to popular Python libraries and tutorials.

Decision Trees · PCA · algorithms
0 likes · 10 min read
Sohu Tech Products
Oct 10, 2018 · Artificial Intelligence

Optimizing News Recall with DDPG Reinforcement Learning and Transformer Architecture

This article explains how reinforcement learning, specifically the DDPG algorithm combined with Transformer-based networks, is applied to improve large‑scale news recall systems, detailing the business scenario, algorithm selection, model architecture, speed optimizations, training challenges, and observed online performance gains.

AI · DDPG · Online Advertising
0 likes · 13 min read
DataFunTalk
Sep 27, 2018 · Artificial Intelligence

Applying Machine Learning in Shumei's Business: Supervised, Unsupervised, and Reinforcement Learning Cases

The article presents a comprehensive overview of how Shumei Technology leverages machine learning—including supervised, unsupervised, and reinforcement learning methods—across its credit scoring, fraud detection, advertising, and audio content moderation services, highlighting practical challenges, model fusion techniques, and future research directions.

Model Fusion · reinforcement learning
0 likes · 12 min read
JD Tech
Sep 12, 2018 · Artificial Intelligence

JD Autonomous Delivery Robots: Technologies, Patents, and Future Challenges

The article details JD's third‑generation autonomous delivery robots, covering their multi‑sensor fusion localization, deep‑learning perception, reinforcement‑learning motion control, extensive patent portfolio, and upcoming technical hurdles such as high‑precision mapping and lidar cost, while also inviting public voting for patent awards.

AI navigation · JD Logistics · autonomous robots
0 likes · 8 min read
Alibaba Cloud Developer
Jul 13, 2018 · Artificial Intelligence

How We Won OpenAI’s Retro Contest: Joint PPO Mastery on Sonic Games

This article analyzes OpenAI’s Retro Contest on Sonic the Hedgehog, explains why reinforcement learning generalization is crucial for AGI, and details the winning team’s joint PPO pipeline, engineering optimizations, training strategies, and final performance compared to human baselines.

OpenAI Retro Contest · RL generalization · Sonic game
0 likes · 21 min read
Alibaba Cloud Developer
Apr 23, 2018 · Fundamentals

Top Technical Books Recommended by Alibaba Experts for World Book Day

On World Book Day, nine Alibaba technology veterans share a curated list of essential technical books—covering software testing, design patterns, AI, machine learning, reinforcement learning, Rust, and database architecture—offering concise reasons why each title is valuable for developers and engineers.

Database Architecture · Rust programming · design patterns
0 likes · 10 min read
Tencent Cloud Developer
Mar 15, 2018 · Artificial Intelligence

Learning Long-Horizon Surgical Robot Tasks via Transition State Clustering, SWIRL, and DDCO

The article surveys three recent approaches—Transition State Clustering, Sequential Windowed Inverse Reinforcement Learning, and Deep Discovery of Continuous Options—that automatically segment long‑horizon surgical‑robot demonstrations into sub‑tasks, learn hierarchical policies from limited data, and achieve markedly higher success rates on da Vinci cutting, tension, and needle‑picking tasks.

hierarchical learning · imitation learning · reinforcement learning
0 likes · 18 min read
Alibaba Cloud Developer
Feb 5, 2018 · Artificial Intelligence

How Alibaba’s AliMe Evolved in 2017: AI Architecture, Algorithms, and Real‑World Impact

In 2017 Alibaba's AliMe chatbot platform expanded from a single‑company solution to a multilingual, multi‑channel AI service, introducing platform‑level SaaS/PaaS capabilities, a seven‑layer front‑end architecture, modular back‑end design, advanced intent recognition, knowledge‑graph‑driven product management, reinforcement‑learning‑based recommendation, and machine‑reading comprehension for enterprise and consumer use cases.

AI platform · Alibaba · Chatbot
0 likes · 23 min read
Hulu Beijing
Dec 6, 2017 · Artificial Intelligence

How Deep Reinforcement Learning Powers Video Game AI: From Q‑Learning to Atari Mastery

This article explains how deep reinforcement learning, built upon traditional Q‑learning and enhanced with techniques like experience replay, enables agents to play Atari video games directly from raw pixel inputs, illustrating the key differences, processing steps, and the significance of this breakthrough in AI.

Atari · Q-Learning · deep Q‑learning
0 likes · 5 min read
Hulu Beijing
Dec 5, 2017 · Artificial Intelligence

What Is Reinforcement Learning? Core Concepts Explained

This article introduces the fundamental concepts of reinforcement learning, describing its origins, key components such as agents, environments, states, actions, and rewards, explaining the Markov decision process framework, and highlighting common algorithms like Q‑learning, policy gradients, and actor‑critic methods.

AI · MDP · algorithms
0 likes · 4 min read
Ctrip Technology
Oct 19, 2017 · Artificial Intelligence

Intelligent Human‑Computer Interaction: Technical Practices of Alibaba’s “Ali Xiaomi” Chatbot

This article presents a comprehensive overview of Alibaba’s intelligent chatbot “Ali Xiaomi”, covering industry context, e‑commerce deployment, NLU architecture, intent‑matching layers, deep‑learning‑based intent classification, reinforcement‑learning‑driven recommendation, knowledge‑graph‑enhanced services, and hybrid retrieval‑generation dialogue models, with future outlooks for AI‑driven interaction.

Knowledge Graph · Natural Language Understanding · deep learning
0 likes · 18 min read
ITPUB
Sep 14, 2017 · Artificial Intelligence

How Salesforce’s Seq2SQL Turns Natural Language into SQL with Reinforcement Learning

Salesforce’s recent research introduces Seq2SQL, a reinforcement‑learning‑driven sequence‑to‑sequence model that translates natural‑language questions into SQL queries so that users need not learn SQL, together with WikiSQL, a large dataset of crowdsourced NL‑SQL pairs for training and evaluation.

AI · SQL Generation · Seq2SQL
0 likes · 6 min read
AntTech
Aug 4, 2017 · Artificial Intelligence

Key Insights from Ant Financial VP Dr. Qi Yuan’s Talk on the Development and Application of Financial Intelligence at CCAI 2017

The article summarizes Dr. Qi Yuan’s presentation at CCAI 2017, detailing Ant Financial’s AI‑driven solutions for financial services—including risk control, intelligent assistants, large‑scale machine learning, reinforcement‑learning marketing, a model‑service platform, and a computer‑vision damage‑assessment system—while highlighting technical challenges, platform architecture, and the company’s open‑tech philosophy.

FinTech · artificial-intelligence · reinforcement learning
0 likes · 16 min read
Alibaba Cloud Developer
Jul 13, 2017 · Artificial Intelligence

How STARK VRP Cuts Chinese Logistics Costs with AI‑Powered Routing

This article explains how Alibaba's Cainiao network built the STARK VRP engine—an AI‑driven, distributed vehicle‑routing solver that supports dozens of VRP variants, leverages metaheuristics, parallel island models, and deep reinforcement learning to dramatically reduce fleet size and travel distance in Chinese logistics.

AI · Logistics Optimization · Metaheuristics
0 likes · 8 min read
21CTO
Jun 29, 2017 · Artificial Intelligence

Why Machine Learning Mirrors Human Learning: From Features to Reinforcement

The article explores how machine learning models emulate human learning by converting diverse real‑world descriptions into numerical features, illustrating concepts such as one‑hot encoding, supervised, unsupervised, and reinforcement learning, and emphasizing the importance of mapping inputs to outputs for intelligent systems.

AI concepts · One-hot encoding · features
0 likes · 14 min read
Qunar Tech Salon
Apr 27, 2017 · Artificial Intelligence

LSTM‑Jump: Learning to Skim Text for Faster Sequence Modeling

The paper introduces LSTM‑Jump, a reinforcement‑learning‑trained LSTM variant that can dynamically skip irrelevant tokens, achieving up to six‑fold speed‑ups over standard sequential LSTMs while maintaining or improving accuracy on various NLP tasks such as sentiment analysis, document classification, and question answering.

LSTM · NLP · reinforcement learning
0 likes · 7 min read
21CTO
Apr 19, 2017 · Artificial Intelligence

How Alibaba Transformed E‑Commerce Search with Real‑Time AI and Reinforcement Learning

Alibaba’s e‑commerce search engine evolved over three years from offline batch models to a sophisticated AI-driven system that integrates real‑time feature ingestion, online learning, deep and reinforcement learning, enabling dynamic personalization and decision‑making that boosts conversion during high‑traffic events like Double 11.

AI · Real-Time Computing · Search
0 likes · 15 min read
Architect
Mar 10, 2016 · Artificial Intelligence

Monte Carlo Tree Search (MCTS): Principles, Algorithms, Advantages, and Applications

This article explains Monte Carlo Tree Search (MCTS), covering its role in AlphaGo, fundamental algorithm steps, node‑selection strategies such as UCB, strengths and weaknesses, enhancements, historical background, and recent research developments in artificial intelligence.

MCTS · Monte Carlo Tree Search · Search Algorithms
0 likes · 12 min read
dbaplus Community
Mar 9, 2016 · Artificial Intelligence

How AlphaGo’s Deep Neural Networks Achieve Human‑Level Go Mastery

This article breaks down AlphaGo’s breakthrough architecture—four specialized neural‑network modules, Monte‑Carlo Tree Search, and deep reinforcement learning—to explain how the system moved from imitation learning to self‑improvement and ultimately defeated top human Go players.

AlphaGo · Go AI · Monte Carlo Tree Search
0 likes · 15 min read
Architects Research Society
Oct 4, 2015 · Artificial Intelligence

Bayesian Thinking on Your Feet: Embedding Generative Models in Reinforcement Learning for Sequentially Revealed Data

This NSF‑funded project aims to develop algorithms that incrementally process partially observed data, integrating generative models with reinforcement‑learning policies to decide when to act, applied to simultaneous machine translation and quiz‑bowl style question answering.

Bayesian Inference · Machine Translation · generative models
0 likes · 4 min read
Baidu Tech Salon
Sep 22, 2014 · Artificial Intelligence

How Baidu’s Bingo AI Cracked the Go Challenge with Novel Algorithms

Go was deemed a 'century‑long' AI challenge for decades, yet Baidu’s Bingo system achieved amateur‑to‑professional level play by introducing optimized Monte‑Carlo tree search, a weakened Alpha‑Beta hybrid, and massive supervised learning, demonstrating how breakthroughs in game AI can ripple into broader Baidu products.

Baidu · Go AI · Monte Carlo Tree Search
0 likes · 8 min read