JD Retail Technology
Dec 4, 2025 · Artificial Intelligence
Twin Networks Reveal How to Optimize Data Mixtures for Large Language Models
This article presents TANDEM, a bi‑level data‑mixture optimization framework that uses twin networks to automatically adjust domain‑specific training data ratios, offering theoretical guarantees, broader applicability, and significant performance gains across pre‑training, fine‑tuning, and e‑commerce product‑understanding tasks.
NeurIPS · bi-level optimization · data mixture optimization
6 min read
