May 17, 2026 · Artificial Intelligence

How CASCADE Enables LLM Agents to Learn from Experience During Live Deployment

The paper introduces CASCADE, a deployment‑time learning framework that lets LLM agents continuously select and reuse past cases via a contextual‑bandit approach, achieving higher long‑term success rates across diverse online tasks without updating the base model.

CASCADE · Case-Based Reasoning · Contextual Bandit

10 min read


58 Tech

Dec 28, 2021 · Artificial Intelligence

Reinforcement Learning for Cold‑Start Job Recommendation in 58.com

This talk explains how 58.com tackles the cold‑start and interest‑divergence problems on its large blue‑collar job recruitment platform by modeling recommendation as a reinforcement‑learning task. It covers the use of multi‑armed bandit, contextual bandit, and linear‑UCB algorithms, the offline evaluation pipeline, online deployment, and the performance gains observed.

Contextual Bandit · cold-start · job recommendation

25 min read
