Baobao Algorithm Notes
Jan 6, 2024 · Artificial Intelligence
How to Pick the Best Fine‑Tuning Data for LLMs with the Nuggets Method
This article explains the Nuggets approach for selecting a high‑quality subset of annotated instructions to fine‑tune large language models, describing its three inputs, the gold‑score computation based on perplexity improvement, empirical results on Alpaca, and practical considerations such as task‑set design.
LLM · Nuggets · data selection
