Design Advantages and Implementation Mechanism of nvidia-docker 2.0

This article explains the design shortcomings of the original nvidia-docker, introduces the architecture and components of nvidia-docker 2.0—including the NVIDIA container runtime, libnvidia-container, and their integration with Docker and containerd—and details the container creation flow for both standard and GPU-enabled workloads.

360 Tech Engineering

The article first reviews the limitations of the original nvidia-docker 1.0: it is tightly coupled to the Docker runtime and therefore inflexible, it cannot be used with other container runtimes or with Docker Compose, and it does not treat GPUs as schedulable resources.

It then introduces nvidia-docker 2.0, which is built from nvidia-container-runtime (the NVIDIA Container Runtime), libnvidia-container, and runc, and which is enabled by editing Docker's /etc/docker/daemon.json to switch the default runtime.
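The daemon.json change can be sketched as a small Python helper that registers an extra runtime and makes it the default. This is an illustration only: `default-runtime`, `runtimes`, `path`, and `runtimeArgs` are dockerd configuration keys, while the function name and the binary path `/usr/bin/nvidia-container-runtime` are assumptions that may not match a given installation.

```python
import json

# Assumption: typical install location of the runtime binary; adjust as needed.
NVIDIA_RUNTIME_PATH = "/usr/bin/nvidia-container-runtime"


def with_nvidia_default_runtime(config: dict) -> dict:
    """Return a copy of a daemon.json-style dict with the 'nvidia' runtime
    registered and selected as the default runtime."""
    updated = dict(config)
    # Copy the existing runtimes so the caller's dict is left untouched.
    runtimes = dict(updated.get("runtimes", {}))
    runtimes["nvidia"] = {"path": NVIDIA_RUNTIME_PATH, "runtimeArgs": []}
    updated["runtimes"] = runtimes
    updated["default-runtime"] = "nvidia"
    return updated


if __name__ == "__main__":
    # Print the resulting configuration instead of writing to /etc/docker.
    print(json.dumps(with_nvidia_default_runtime({}), indent=4))
```

Existing keys in the input are preserved, so the helper can be applied to an already customized configuration. dockerd has to be restarted before a change to daemon.json takes effect.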

Key components are described: nvidia-docker 2.0 (a thin package that only changes the Docker configuration), nvidia-container-runtime (adds a pre-start hook that invokes libnvidia-container), libnvidia-container (a library and CLI that expose NVIDIA GPUs to containers), and runc (the OCI runtime that containerd invokes).
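To make the pre-start hook idea concrete, the sketch below patches an OCI runtime spec (the contents of a container's config.json, represented as a Python dict) so that it carries one extra `hooks.prestart` entry. The `hooks.prestart` array with `path` and `args` fields comes from the OCI runtime specification; the function name, the hook path `/usr/bin/nvidia-container-runtime-hook`, and its `prestart` argument are assumptions for illustration, not a copy of the real runtime's code.

```python
import copy

# Assumption: location and name of the hook binary; it may differ by version.
HOOK_PATH = "/usr/bin/nvidia-container-runtime-hook"


def add_prestart_hook(spec: dict, hook_path: str = HOOK_PATH) -> dict:
    """Return a copy of an OCI spec dict with a prestart hook appended.

    The hook is added only once, so calling this twice is harmless.
    """
    patched = copy.deepcopy(spec)
    hooks = patched.setdefault("hooks", {})
    prestart = hooks.setdefault("prestart", [])
    already_present = any(h.get("path") == hook_path for h in prestart)
    if not already_present:
        # By OCI convention args[0] is the program name.
        prestart.append({"path": hook_path, "args": [hook_path, "prestart"]})
    return patched


if __name__ == "__main__":
    spec = {"ociVersion": "1.0.0", "process": {"args": ["sh"]}}
    print(add_prestart_hook(spec)["hooks"])
```

After patching, the spec is handed to runc unchanged in every other respect, which is why such a runtime can stay a thin layer on top of runc.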

The article explains the container creation flow for a normal container (docker → dockerd → docker-containerd-shim → runc → container-process) and for a GPU-enabled container (docker → dockerd → docker-containerd-shim → nvidia-container-runtime → container-process). In the GPU case the NVIDIA runtime takes the place of the default runtime, and its hook checks the NVIDIA_VISIBLE_DEVICES environment variable before invoking libnvidia-container.
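The environment check performed by the hook can be sketched as follows. The function reads the `process.env`-style list of `KEY=value` strings from an OCI spec and decides whether GPU setup is needed. The function name is hypothetical, and the handling of the special values (`void`, `none`, `all`) and of duplicate entries reflects assumptions that should be verified against NVIDIA's documentation.

```python
from typing import List, Optional

ENV_KEY = "NVIDIA_VISIBLE_DEVICES"


def visible_devices(env: List[str]) -> Optional[List[str]]:
    """Decide which GPUs a container asks for.

    Returns None when the variable is unset, empty, or "void" (assumed to
    mean: behave like a plain runc container), an empty list for "none",
    and otherwise the list of requested device identifiers, e.g. ["all"]
    or ["0", "1"].
    """
    value = None
    for entry in env:
        key, sep, val = entry.partition("=")
        if sep and key == ENV_KEY:
            value = val  # assumption: the last occurrence wins
    if value is None or value.strip() in ("", "void"):
        return None
    if value.strip() == "none":
        return []
    return [d.strip() for d in value.split(",") if d.strip()]


if __name__ == "__main__":
    for env in (["PATH=/usr/bin"], ["NVIDIA_VISIBLE_DEVICES=0,1"]):
        devices = visible_devices(env)
        action = "skip GPU setup" if devices is None else f"configure {devices}"
        print(env, "->", action)
```

Only when the result is not None would a hook go on to call libnvidia-container; otherwise the container starts exactly as it would under runc.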

Additional background on containerd is provided, outlining its responsibilities such as lifecycle management, image pull/push, storage management, network handling, and invoking runc.

The article concludes with references to relevant GitHub repositories and documentation for further exploration.
