Tagged articles
2 articles
Machine Learning Algorithms & Natural Language Processing
Apr 4, 2026 · Artificial Intelligence

Why the Best SFT Checkpoint May Hurt RL Performance: Adaptive Early‑Stop Loss (AESL) for LLM Cold‑Start

The paper reveals that over‑optimizing supervised fine‑tuning (SFT) for large language models can diminish their reinforcement‑learning (RL) potential, proposes an Adaptive Early‑Stop Loss (AESL) that balances accuracy and output diversity during cold‑start, and demonstrates across multiple LLMs that AESL consistently yields superior RL results.

AI training · Adaptive Early‑Stop Loss · LLM
11 min read
DataFunTalk
Feb 21, 2021 · Artificial Intelligence

Intra‑Ensemble in Neural Networks

This paper proposes an intra‑ensemble strategy that trains multiple sub‑networks within a single neural network using random training operations, width and depth variations, and parameter sharing. The resulting sub‑networks are diverse, and their combined performance is comparable to that of traditional ensembles while adding only marginal parameter overhead.

Architecture Search · Model Diversity · Neural Networks
9 min read