Tagged articles
1 article
NewBeeNLP
Feb 5, 2024 · Artificial Intelligence

How HiFT Slashes GPU Memory for LLM Fine‑Tuning with Hierarchical Optimization

HiFT introduces a layer‑wise hierarchical fine‑tuning strategy that freezes most parameters at each step, which reduces optimizer‑state memory, and adapts mixed‑precision training to this scheme. As a result, 7B and 13B models can be fine‑tuned on GPUs with 16 to 31 GB of memory while maintaining competitive performance.

GPU Memory · HiFT · LLM fine-tuning
12 min read