Baobao Algorithm Notes
Oct 25, 2024 · Artificial Intelligence

How to Use Importance Sampling for Effective Continue Pretraining of LLMs

Continued pretraining (CP) sits between pretraining and SFT and is used to inject domain knowledge, but it is prone to catastrophic forgetting. This article explores how importance sampling can balance general and domain data, and discusses data selection, annealing strategies, and practical tips for mitigating forgetting while strengthening specialized capabilities.

Catastrophic Forgetting · Continue Pretraining · Domain Adaptation
8 min read