Tagged articles
3 articles
PaperAgent
Feb 9, 2026 · Artificial Intelligence

Can Online Evaluation Unlock AI Assistants' Long-Term Memory? Inside AMemGym

AMemGym is an on‑policy, interactive benchmark for evaluating and training AI assistants' long‑term memory. It structures state evolution, diagnoses memory failures, and lets agents self‑evolve. Its results indicate that selective memory writing outperforms passive approaches across various LLM and agent architectures.

AI memory · Agent · LLM
8 min read
Baobao Algorithm Notes
Mar 23, 2022 · Industry Insights

What Makes a Good CTR Benchmark? Lessons from Huawei’s FuxiCTR

The article analyzes the shortcomings of current click‑through‑rate (CTR) benchmarks, explains why leaderboards are valuable, and proposes concrete criteria for a more realistic and robust CTR benchmarking platform: online evaluation, sequential test data, leakage prevention, and read‑only submissions.

Advertising · CTR · leaderboard
6 min read
58 Tech
Dec 28, 2021 · Artificial Intelligence

Reinforcement Learning for Cold‑Start Job Recommendation in 58.com

This talk explains how 58.com tackles the cold‑start and interest‑divergence problems of its massive blue‑collar job recruitment platform by modeling recommendation as a reinforcement‑learning task. It covers multi‑armed bandit, contextual bandit, and linear UCB (LinUCB) algorithms, the offline evaluation pipeline, online deployment, and the observed performance gains.

Contextual Bandit · Reinforcement Learning · cold start
25 min read