Jan 6, 2024 · Artificial Intelligence

How to Pick the Best Fine‑Tuning Data for LLMs with the Nuggets Method

This article explains the Nuggets method for selecting a high‑quality subset of annotated instructions to fine‑tune large language models. It covers the method's three inputs, the gold score computed from perplexity improvement, empirical results on Alpaca, and practical considerations such as task‑set design.

LLM · Nuggets · data selection
