Tagged articles

online evaluation

3 articles

Feb 9, 2026 · Artificial Intelligence

Can Online Evaluation Unlock AI Assistants' Long-Term Memory? Inside AMemGym

AMemGym is an on‑policy, interactive benchmark for evaluating and training the long‑term memory of AI assistants. It structures state evolution, diagnoses memory failures, and lets agents self‑evolve. Its results indicate that selective memory writing outperforms passive approaches across a range of LLM and agent architectures.

AI memory · Agent · Benchmark

8 min read

Baobao Algorithm Notes

Mar 23, 2022 · Industry Insights

What Makes a Good CTR Benchmark? Lessons from Huawei’s FuxiCTR

The article analyzes the shortcomings of current click‑through‑rate (CTR) benchmarks and explains why leaderboards are valuable. It then proposes concrete criteria for a more realistic and robust CTR benchmarking platform: online evaluation, sequential test data, leakage prevention, and read‑only submissions.

Advertising · CTR · Leaderboard

6 min read

58 Tech

Dec 28, 2021 · Artificial Intelligence

Reinforcement Learning for Cold‑Start Job Recommendation in 58.com

This talk explains how 58.com tackles the cold‑start and interest‑divergence problems on its large blue‑collar job recruitment platform by framing recommendation as a reinforcement‑learning task. It covers multi‑armed bandit, contextual bandit, and linear‑UCB (LinUCB) algorithms, the offline evaluation pipeline, online deployment, and the observed performance gains.

Contextual Bandit · cold-start · job recommendation

25 min read