Tagged articles

reinforcement learning

743 articles · Page 8 of 8

Sep 13, 2019 · Artificial Intelligence

Understanding Didi's Ride‑Hailing Dispatch Algorithms: Challenges and Strategies

Didi’s ride‑hailing dispatch system has progressed from a simple greedy, first‑come‑first‑served matcher to sophisticated batch, chain, and predictive algorithms that use deep‑learning demand forecasts and reinforcement‑learning optimization to assign drivers under complex business rules, boosting response rates and serving over 30 million daily requests.

AI · Optimization · Ride Hailing

0 likes · 17 min read

Alibaba Cloud Developer

Aug 28, 2019 · Artificial Intelligence

Exact‑K Recommendation: Graph Attention Networks and RL from Demonstrations Explained

This article introduces the Exact‑K recommendation problem, highlights its differences from traditional Top‑K approaches, and presents a novel solution combining Graph Attention Networks (GAttN) with Reinforcement Learning from Demonstrations (RLfD), backed by extensive experiments showing superior performance on real-world datasets.

exact-k · graph attention networks · machine learning

0 likes · 14 min read

Tencent Cloud Developer

Aug 14, 2019 · Artificial Intelligence

From Atari to AI: The Evolution of Video Games and Artificial Intelligence

From Steve Jobs’s early work at Atari to modern DeepMind breakthroughs, the article traces how video games have grown into a multibillion‑dollar industry that serves as a testbed for AI research, while highlighting current AI techniques for smarter agents, procedural content generation, and the collaborative challenges shaping the future of game development.

Game Development · Monte Carlo Tree Search · artificial-intelligence

0 likes · 25 min read

DataFunTalk

Jul 31, 2019 · Artificial Intelligence

Key Characteristics and Practical Improvements of Recommendation Technologies

This article discusses the fundamental traits of recommendation technologies, compares UserCF and ItemCF models, explains matrix factorization and factorization machines (FM), explores negative sampling, CTR/CVR modeling, and ensemble methods, and covers practical considerations such as reinforcement learning and exploration strategies for improving recommendation performance in real-world systems.

matrix factorization · reinforcement learning

0 likes · 11 min read

AntTech

Jul 21, 2019 · Artificial Intelligence

Alipay’s SIGIR 2019 Papers: Reinforcement Learning for User Intent Prediction and Unsupervised QUEST for Complex Question Answering

At SIGIR 2019 in Paris, Alipay presented two AI research papers—one applying reinforcement learning to predict user intent in customer‑service bots and another introducing the unsupervised QUEST method that builds noisy quasi‑knowledge graphs for answering complex multi‑document questions.

AI · Information Retrieval · Knowledge Graph

0 likes · 5 min read

iQIYI Technical Product Team

Jul 12, 2019 · Artificial Intelligence

Real-Time Evaluation System for Adaptive Bitrate (ABR) Algorithms and Controlled Bitrate Distribution

RESA is a real‑time evaluation platform that continuously tests multiple Adaptive Bitrate (ABR) algorithms on live user traffic, introduces a multi‑user QoE metric derived from viewing behavior, reveals trade‑offs between clarity and bandwidth, and proposes the RL‑based ABSbc algorithm to steer bitrate distribution and balance user experience with network cost.

ABR · Bandwidth Control · Evaluation

0 likes · 23 min read

Alibaba Cloud Developer

Jun 27, 2019 · Artificial Intelligence

Generating Personalized E‑commerce Review Replies with Product Information

This paper presents a sequence‑to‑sequence model that fuses product‑detail tables with customer comments, using gated multimodal attention, copy mechanisms and reinforcement learning to automatically produce high‑quality, context‑aware replies for e‑commerce platforms, and validates the approach with extensive experiments on a large Taobao dataset.

Sequence-to-Sequence · copy mechanism · e‑commerce

0 likes · 21 min read

Ctrip Technology

Jun 19, 2019 · Artificial Intelligence

Applying Reinforcement Learning to Hotel Ranking at Ctrip: Challenges, Solutions, and Preliminary Results

This article examines the limitations of traditional learning‑to‑rank for Ctrip hotel sorting, introduces reinforcement learning as a remedy, outlines three progressive implementation plans (A, B, C) with algorithm choices and engineering trade‑offs, and presents early experimental findings that demonstrate RL's potential to improve conversion rates.

Ctrip · RL · Ranking

0 likes · 15 min read

AntTech

Jun 10, 2019 · Artificial Intelligence

Generative Adversarial User Model for Reinforcement Learning‑Based Recommendation Systems

This article presents a model‑based reinforcement learning framework for recommendation systems that uses a generative adversarial user model to simultaneously learn user behavior dynamics and reward functions, enabling efficient Cascading‑DQN policy learning and achieving superior long‑term user rewards and click‑through rates in experiments.

Cascading DQN · Generative Adversarial Networks · User Modeling

0 likes · 9 min read

Alibaba Cloud Developer

Apr 1, 2019 · Fundamentals

Must-Read Technical Books Recommended by Alibaba Experts

Alibaba’s senior engineers share their curated list of essential technical books—from software testing and design patterns to AI, machine learning, reinforcement learning, Rust programming, and database architecture—explaining why each title is valuable for developers seeking deeper knowledge and practical insights.

AI · database systems · design patterns

0 likes · 9 min read

DataFunTalk

Mar 8, 2019 · Artificial Intelligence

Alibaba's Intelligent Service Bot (Ali Xiaomi): Platform Overview, Intent Recognition, Machine Reading Comprehension, Multi‑turn Recommendation, and Transfer Learning

The article presents an in‑depth overview of Alibaba's intelligent service bot Ali Xiaomi, covering its platform evolution, core NLP techniques such as intent recognition and machine reading comprehension, multi‑turn recommendation strategies, transfer‑learning approaches across domains and languages, and future technical challenges.

AI · machine reading comprehension · natural language processing

0 likes · 11 min read

Tencent Cloud Developer

Jan 17, 2019 · Artificial Intelligence

Deep Learning for Big Data Recommendation Systems: Tencent's Industrial Practice

Tencent’s industrial practice shows how a large‑scale offline‑nearline‑online “Shield” recommendation architecture, powered by the DeepR framework built on RCaffe, uses deep semantic embeddings, massive neural networks and reinforcement‑learning decisions to handle billions of daily requests, demonstrating that data richness and engineering capability, not model depth alone, drive performance in big‑data recommendation systems.

Big Data · Neural Network · RCaffe

0 likes · 13 min read

Alibaba Cloud Developer

Jan 15, 2019 · Artificial Intelligence

How Alibaba Engineers Boost SEO with Reinforcement Learning and Attention Models

This article details how Alibaba.com engineers apply reinforcement learning, attention mechanisms, and weakly supervised techniques to extract product summaries, improve content quality, and significantly raise SEO rankings. It reports offline experiments and online A/B tests and outlines future research directions.

Alibaba · SEO · attention model

0 likes · 16 min read

DataFunTalk

Jan 9, 2019 · Artificial Intelligence

Reinforcement Learning in Natural Language Processing: Concepts, Challenges, and Applications

This article introduces reinforcement learning fundamentals, contrasts it with supervised learning, and explores its challenges and advantages in natural language processing, including applications such as text classification, relation extraction from noisy data, and weakly supervised topic segmentation, while summarizing key insights and experimental results.

Text Classification · Weak Supervision · natural language processing

0 likes · 11 min read

Alibaba Cloud Developer

Nov 20, 2018 · Artificial Intelligence

How Reinforcement Learning Powers Interactive Search in E‑Commerce

This article explains how reinforcement learning can be modeled and deployed to enable intelligent, interactive product search on e‑commerce platforms, detailing problem definition, system architecture, training methodology, online results, and future research directions.

deep learning · dialogue system · e-commerce

0 likes · 17 min read

iQIYI Technical Product Team

Nov 16, 2018 · Artificial Intelligence

How Reinforcement Learning Transforms Adaptive Bitrate Streaming

This article explains the principles of adaptive bitrate streaming, compares traditional ABR algorithms with a reinforcement‑learning‑based approach, describes its system architecture and training process, and presents QoS evaluation results that show RL‑driven streaming can improve video quality and smoothness.

ABR algorithms · AI · QoS evaluation

0 likes · 8 min read

Alibaba Cloud Developer

Nov 16, 2018 · Artificial Intelligence

How Alibaba’s Search Engine Evolved Over a Decade of Double‑11: From Offline Models to Real‑Time AI

This article traces the ten‑year evolution of Alibaba’s e‑commerce search system, detailing four major stages—from the early Pora streaming engine to dual‑link real‑time architectures, the integration of deep and reinforcement learning, and the shift to large‑scale online deep learning—while highlighting the technical drivers and future AI‑enabled search vision.

Search · e-commerce · machine learning

0 likes · 16 min read

Meituan Technology Team

Nov 15, 2018 · Artificial Intelligence

Reinforcement Learning for Meituan's "Guess You Like" Recommendation Ranking

Meituan enhanced its homepage “Guess You Like” recommendation slot by modeling user‑item interactions as a Markov Decision Process and applying an improved DDPG reinforcement‑learning agent that adjusts the ranking trade‑off parameter, uses advantage‑based Q decomposition, shares actor‑critic weights, and runs in a real‑time TensorFlow pipeline, delivering consistent lifts in click‑through, dwell time, and depth.

DDPG · MDP Modeling · TensorFlow

0 likes · 21 min read

Tencent Cloud Developer

Oct 18, 2018 · Artificial Intelligence

10 Machine Learning Algorithms You Should Know to Become a Data Scientist

This article outlines the essential role of a data scientist and introduces ten fundamental machine‑learning algorithms—including PCA/SVD, OLS and polynomial regression, regularized linear models, K‑Means, logistic regression, SVM, feed‑forward, convolutional and recurrent neural networks, CRFs, ensemble trees, and reinforcement‑learning methods—while linking to popular Python libraries and tutorials.

Decision Trees · PCA · algorithms

0 likes · 10 min read

Sohu Tech Products

Oct 10, 2018 · Artificial Intelligence

Optimizing News Recall with DDPG Reinforcement Learning and Transformer Architecture

This article explains how reinforcement learning, specifically the DDPG algorithm combined with Transformer-based networks, is applied to improve large‑scale news recall systems, detailing the business scenario, algorithm selection, model architecture, speed optimizations, training challenges, and observed online performance gains.

AI · DDPG · Online Advertising

0 likes · 13 min read

DataFunTalk

Sep 27, 2018 · Artificial Intelligence

Applying Machine Learning in Shumei's Business: Supervised, Unsupervised, and Reinforcement Learning Cases

The article presents a comprehensive overview of how Shumei Technology leverages machine learning—including supervised, unsupervised, and reinforcement learning methods—across its credit scoring, fraud detection, advertising, and audio content moderation services, highlighting practical challenges, model fusion techniques, and future research directions.

Model Fusion · reinforcement learning

0 likes · 12 min read

JD Tech

Sep 12, 2018 · Artificial Intelligence

JD Autonomous Delivery Robots: Technologies, Patents, and Future Challenges

The article details JD's third‑generation autonomous delivery robots, covering their multi‑sensor fusion localization, deep‑learning perception, reinforcement‑learning motion control, extensive patent portfolio, and upcoming technical hurdles such as high‑precision mapping and lidar cost, while also inviting public voting for patent awards.

AI navigation · JD Logistics · autonomous robots

0 likes · 8 min read

Sohu Tech Products

Sep 5, 2018 · Artificial Intelligence

Reinforcement Learning Theory Overview and Its Application to News Recommendation

This article reviews reinforcement learning fundamentals, contrasts it with supervised learning, surveys major RL algorithms such as DDPG and DQN, and details how these methods can be modeled for sequential news recommendation, including system architecture, state‑action definitions, and practical challenges.

AI · DDPG · DQN

0 likes · 15 min read

Alibaba Cloud Developer

Jul 13, 2018 · Artificial Intelligence

How We Won OpenAI’s Retro Contest: Joint PPO Mastery on Sonic Games

This article analyzes OpenAI’s Retro Contest on Sonic the Hedgehog, explains why reinforcement learning generalization is crucial for AGI, and details the winning team’s joint PPO pipeline, engineering optimizations, training strategies, and final performance compared to human baselines.

OpenAI Retro Contest · RL generalization · Sonic game

0 likes · 21 min read

WeChat Backend Team

May 11, 2018 · Artificial Intelligence

How PhoenixGo Turned AlphaGo Zero into a Champion AI Using Cloud Resources

PhoenixGo, an open‑source Go AI built on AlphaGo Zero's reinforcement‑learning algorithm, leveraged Tencent's idle cloud servers to achieve professional‑level play, won the 2018 World AI Go Championship, and was released with a strong model for researchers and hobbyists alike.

AI · Cloud Computing · Go

0 likes · 4 min read

Alibaba Cloud Developer

Apr 23, 2018 · Fundamentals

Top Technical Books Recommended by Alibaba Experts for World Book Day

On World Book Day, nine Alibaba technology veterans share a curated list of essential technical books—covering software testing, design patterns, AI, machine learning, reinforcement learning, Rust, and database architecture—offering concise reasons why each title is valuable for developers and engineers.

Database Architecture · Rust programming · design patterns

0 likes · 10 min read

Tencent Cloud Developer

Mar 15, 2018 · Artificial Intelligence

Learning Long-Horizon Surgical Robot Tasks via Transition State Clustering, SWIRL, and DDCO

The article surveys three recent approaches—Transition State Clustering, Sequential Windowed Inverse Reinforcement Learning, and Deep Discovery of Continuous Options—that automatically segment long‑horizon surgical‑robot demonstrations into sub‑tasks, learn hierarchical policies from limited data, and achieve markedly higher success rates on da Vinci cutting, tension, and needle‑picking tasks.

hierarchical learning · imitation learning · reinforcement learning

0 likes · 18 min read

Alibaba Cloud Developer

Feb 5, 2018 · Artificial Intelligence

How Alibaba’s AliMe Evolved in 2017: AI Architecture, Algorithms, and Real‑World Impact

In 2017 Alibaba's AliMe chatbot platform expanded from a single‑company solution to a multilingual, multi‑channel AI service, introducing platform‑level SaaS/PaaS capabilities, a seven‑layer front‑end architecture, modular back‑end design, advanced intent recognition, knowledge‑graph‑driven product management, reinforcement‑learning‑based recommendation, and machine‑reading comprehension for enterprise and consumer use cases.

AI platform · Alibaba · Chatbot

0 likes · 23 min read

High Availability Architecture

Dec 11, 2017 · Artificial Intelligence

A Brief History of Computer Chess and Its Role in Artificial Intelligence

This article traces the evolution of computer chess from the 18th‑century automaton "The Turk" through early programs by Turing, Shannon, and McCarthy, to landmark systems like Deep Blue, AlphaGo, and AlphaZero, highlighting key algorithms, milestones, and their impact on AI research.

AI history · AlphaZero · Deep Blue

0 likes · 19 min read

Hulu Beijing

Dec 6, 2017 · Artificial Intelligence

How Deep Reinforcement Learning Powers Video Game AI: From Q‑Learning to Atari Mastery

This article explains how deep reinforcement learning, built upon traditional Q‑learning and enhanced with techniques like experience replay, enables agents to play Atari video games directly from raw pixel inputs, illustrating the key differences, processing steps, and the significance of this breakthrough in AI.

Atari · Q-Learning · deep Q‑learning

0 likes · 5 min read

Hulu Beijing

Dec 5, 2017 · Artificial Intelligence

What Is Reinforcement Learning? Core Concepts Explained

This article introduces the fundamental concepts of reinforcement learning, describing its origins, key components such as agents, environments, states, actions, and rewards, explaining the Markov decision process framework, and highlighting common algorithms like Q‑learning, policy gradients, and actor‑critic methods.

AI · MDP · algorithms

0 likes · 4 min read

Ctrip Technology

Oct 19, 2017 · Artificial Intelligence

Intelligent Human‑Computer Interaction: Technical Practices of Alibaba’s “Ali Xiaomi” Chatbot

This article presents a comprehensive overview of Alibaba’s intelligent chatbot “Ali Xiaomi”, covering industry context, e‑commerce deployment, NLU architecture, intent‑matching layers, deep‑learning‑based intent classification, reinforcement‑learning‑driven recommendation, knowledge‑graph‑enhanced services, and hybrid retrieval‑generation dialogue models, with future outlooks for AI‑driven interaction.

Knowledge Graph · Natural Language Understanding · deep learning

0 likes · 18 min read

ITPUB

Sep 14, 2017 · Artificial Intelligence

How Salesforce’s Seq2SQL Turns Natural Language into SQL with Reinforcement Learning

Salesforce’s recent research introduces Seq2SQL, a sequence‑to‑sequence model trained with reinforcement learning that translates natural‑language questions into SQL queries so that users need not write SQL themselves, and releases WikiSQL, a large dataset of crowdsourced NL‑SQL pairs for training and evaluation.

AI · SQL Generation · Seq2SQL

0 likes · 6 min read

AntTech

Aug 4, 2017 · Artificial Intelligence

Key Insights from Ant Financial VP Dr. Qi Yuan’s Talk on the Development and Application of Financial Intelligence at CCAI 2017

The article summarizes Dr. Qi Yuan’s presentation at CCAI 2017, detailing Ant Financial’s AI‑driven solutions for financial services—including risk control, intelligent assistants, large‑scale machine learning, reinforcement‑learning marketing, a model‑service platform, and a computer‑vision damage‑assessment system—while highlighting technical challenges, platform architecture, and the company’s open‑tech philosophy.

FinTech · artificial-intelligence · reinforcement learning

0 likes · 16 min read

Alibaba Cloud Developer

Jul 13, 2017 · Artificial Intelligence

How STARK VRP Cuts Chinese Logistics Costs with AI‑Powered Routing

This article explains how Alibaba's Cainiao network built the STARK VRP engine, an AI‑driven, distributed vehicle‑routing solver that supports dozens of VRP variants and combines metaheuristics, parallel island models, and deep reinforcement learning to dramatically reduce fleet size and travel distance in Chinese logistics.

AI · Logistics Optimization · Metaheuristics

0 likes · 8 min read

21CTO

Jun 29, 2017 · Artificial Intelligence

Why Machine Learning Mirrors Human Learning: From Features to Reinforcement

The article explores how machine learning models emulate human learning by converting diverse real‑world descriptions into numerical features, illustrating concepts such as one‑hot encoding, supervised, unsupervised, and reinforcement learning, and emphasizing the importance of mapping inputs to outputs for intelligent systems.

AI concepts · One-hot encoding · features

0 likes · 14 min read

Qunar Tech Salon

Apr 27, 2017 · Artificial Intelligence

LSTM‑Jump: Learning to Skim Text for Faster Sequence Modeling

The paper introduces LSTM‑Jump, a reinforcement‑learning‑trained LSTM variant that can dynamically skip irrelevant tokens, achieving up to six‑fold speed‑ups over standard sequential LSTMs while maintaining or improving accuracy on various NLP tasks such as sentiment analysis, document classification, and question answering.

LSTM · NLP · reinforcement learning

0 likes · 7 min read

21CTO

Apr 19, 2017 · Artificial Intelligence

How Alibaba Transformed E‑Commerce Search with Real‑Time AI and Reinforcement Learning

Alibaba’s e‑commerce search engine evolved over three years from offline batch models to a sophisticated AI-driven system that integrates real‑time feature ingestion, online learning, deep and reinforcement learning, enabling dynamic personalization and decision‑making that boosts conversion during high‑traffic events like Double 11.

AI · Real-Time Computing · Search

0 likes · 15 min read

Alibaba Cloud Developer

Mar 22, 2017 · Artificial Intelligence

Unlocking StarCraft AI Research with Gym StarCraft: A Python-Friendly RL Platform

StarCraft, a classic real‑time strategy game, has become a premier testbed for deep reinforcement learning and AI research, and Alibaba’s open‑source Gym StarCraft platform now bridges Python, TensorFlow, Keras and OpenAI Gym to simplify multi‑agent, macro‑strategy development and fair benchmarking.

Alibaba · OpenAI Gym · Python

0 likes · 3 min read

Architect

Mar 10, 2016 · Artificial Intelligence

Monte Carlo Tree Search (MCTS): Principles, Algorithms, Advantages, and Applications

This article explains Monte Carlo Tree Search (MCTS), covering its role in AlphaGo, the fundamental algorithm steps, node‑selection strategies such as UCB, strengths and weaknesses, enhancements, historical background, and recent research developments in artificial intelligence.

MCTS · Monte Carlo Tree Search · Search Algorithms

0 likes · 12 min read

dbaplus Community

Mar 9, 2016 · Artificial Intelligence

How AlphaGo’s Deep Neural Networks Achieve Human‑Level Go Mastery

This article breaks down AlphaGo’s breakthrough architecture—four specialized neural‑network modules, Monte‑Carlo Tree Search, and deep reinforcement learning—to explain how the system moved from imitation learning to self‑improvement and ultimately defeated top human Go players.

AlphaGo · Go AI · Monte Carlo Tree Search

0 likes · 15 min read

Architects Research Society

Oct 4, 2015 · Artificial Intelligence

Bayesian Thinking on Your Feet: Embedding Generative Models in Reinforcement Learning for Sequentially Revealed Data

This NSF‑funded project aims to develop algorithms that incrementally process partially observed data, integrating generative models with reinforcement‑learning policies to decide when to act, applied to simultaneous machine translation and quiz‑bowl style question answering.

Bayesian Inference · Machine Translation · generative models

0 likes · 4 min read

Baidu Tech Salon

Sep 22, 2014 · Artificial Intelligence

How Baidu’s Bingo AI Cracked the Go Challenge with Novel Algorithms

Go was long deemed a 'century‑long' AI challenge. Baidu’s Bingo system reached amateur‑to‑professional level play by introducing optimized Monte‑Carlo tree search, a weakened Alpha‑Beta hybrid, and massive supervised learning, demonstrating how breakthroughs in game AI can ripple into broader Baidu products.

Baidu · Go AI · Monte Carlo Tree Search

0 likes · 8 min read
