Baobao Algorithm Notes
Oct 25, 2024 · Artificial Intelligence
How to Use Importance Sampling for Effective Continue Pretraining of LLMs
Continued pretraining (CP) sits between pretraining and SFT and injects domain knowledge into an LLM, but it is prone to catastrophic forgetting. This article explores using importance sampling to balance general and domain data, and discusses data selection, annealing strategies, and practical tips for mitigating forgetting while strengthening specialized capabilities.
Catastrophic Forgetting · Continue Pretraining · Domain Adaptation
8 min read
