Snowball Engineer Team
Oct 17, 2019 · Artificial Intelligence
GPU-Accelerated Model Training Optimizations for Snowball Feed Recommendation System
This article describes the challenges of large-scale model training for Snowball's feed recommendation system and details a series of engineering optimizations, including GPU acceleration, multi-threaded data preparation, TFRecord conversion, compression, and batch-map reordering, that together increased training throughput from 6k to over 20k samples per second while reducing CPU and I/O bottlenecks.
GPU · TFRecord · TensorFlow
