Tagged articles

ahead-of-time

1 article · Page 1 of 1

Dec 11, 2021 · Artificial Intelligence

Nimble: A Lightweight Parallel GPU Scheduler Boosting Deep Learning Performance

The article analyzes how Nimble reduces GPU scheduling overhead and enables parallel execution through ahead‑of‑time scheduling and automatic multi‑stream assignment, achieving up to 22.3× inference speedup over PyTorch and significantly improving GPU utilization for deep learning workloads.

Deep Learning · GPU Scheduling · Parallel Execution

9 min read

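The summary names two ideas: assigning operators to multiple GPU streams automatically, and fixing the schedule ahead of time so it can be replayed without per-run scheduling work. The sketch below illustrates both on a toy operator graph in plain Python. It is not Nimble's actual algorithm or API: `assign_streams`, `build_schedule`, and the four-operator graph are hypothetical names and data chosen for illustration, and the stream assignment is a simple greedy heuristic that only guarantees each stream holds a chain of dependent operators.

```python
def assign_streams(ops, deps):
    """Greedily map each operator to a stream id.

    ops  : operator names in topological order (assumed input format)
    deps : dict mapping an operator to the list of operators it depends on
    An operator continues a predecessor's stream only if that predecessor
    is still the last operator on it; otherwise it opens a new stream.
    Independent operators therefore never share a stream.
    """
    stream_of = {}
    tail = {}          # stream id -> last operator placed on that stream
    next_stream = 0
    for op in ops:
        chosen = None
        for pred in deps.get(op, []):
            s = stream_of[pred]
            if tail[s] == pred:
                chosen = s
                break
        if chosen is None:
            chosen = next_stream
            next_stream += 1
        stream_of[op] = chosen
        tail[chosen] = op
    return stream_of


def build_schedule(ops, deps, stream_of):
    """Precompute, once, the launch order and cross-stream waits.

    Each entry is (operator, stream id, streams to wait on before launch).
    Replaying this list at run time needs no further scheduling decisions.
    """
    schedule = []
    for op in ops:
        waits = sorted(
            {stream_of[p] for p in deps.get(op, []) if stream_of[p] != stream_of[op]}
        )
        schedule.append((op, stream_of[op], waits))
    return schedule


# Hypothetical diamond-shaped graph: a feeds b and c, which both feed d.
ops = ["a", "b", "c", "d"]
deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}

stream_of = assign_streams(ops, deps)
schedule = build_schedule(ops, deps, stream_of)
for entry in schedule:
    print(entry)
```

In this toy graph, `b` and `c` do not depend on each other, so they land on different streams and could run concurrently; `d` stays on the first stream and waits on the second. A real system would map the stream ids to GPU streams and the waits to event synchronization.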