Code DAO
Dec 11, 2021 · Artificial Intelligence

Nimble: A Lightweight Parallel GPU Scheduler Boosting Deep Learning Performance

The article analyzes how Nimble reduces GPU scheduling overhead and enables parallel execution through ahead‑of‑time scheduling and automatic multi‑stream assignment, achieving up to 22.3× inference speedup over PyTorch and significantly improving GPU utilization for deep learning workloads.

Deep Learning · GPU scheduling · ahead-of-time