Tagged articles
1 article
Tencent Advertising Technology
Aug 15, 2024 · Artificial Intelligence

Enhancing Reinforcement Learning with Label-Sensitive Reward for Natural Language Understanding

This paper introduces RLLR, a reinforcement learning method with a label-sensitive reward that improves natural language understanding by aligning the training objective with label accuracy. It is evaluated on eight public NLU datasets and on real-world advertising feature evaluation, and it outperforms standard RLHF and SFT baselines.

Advertising · RLHF · Reinforcement Learning
14 min read