Getting Started with GPU Remote Invocation Using rCUDA
This article introduces GPU remote invocation, explains rCUDA's architecture, walks through installing the server and client, demonstrates running CUDA samples on a GPU‑less node, and shows how to deploy rCUDA on Kubernetes with example DaemonSet and Job manifests.
rCUDA Overview
rCUDA (remote CUDA) implements the CUDA runtime API in a client‑server architecture, enabling a host without a GPU to execute CUDA kernels on a remote node that runs a CUDA 8.0 environment. It supports TCP/IP and InfiniBand communication. The most recent public release (v16.11.04.02) implements all CUDA 8.0 runtime interfaces; the project is no longer actively maintained.
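A useful mental model is that the client side is just a library swap: rCUDA ships its own libcudart.so, and any CUDA program that links the runtime dynamically picks it up once the rCUDA lib directory comes first on the loader's search path. A minimal sketch, using the client path from the demo below:

```shell
# rCUDA ships a drop-in CUDA runtime library; when its lib directory comes
# first on LD_LIBRARY_PATH, dynamically linked CUDA programs resolve
# libcudart.so to rCUDA's copy instead of NVIDIA's.
export LD_LIBRARY_PATH=/root/rCUDAv16.11.04.02-CUDA8.0/lib:$LD_LIBRARY_PATH

# The first search entry should now be the rCUDA lib directory:
echo "$LD_LIBRARY_PATH" | tr ':' '\n' | head -n 1
```

This is why the samples must be built with `--cudart=shared` later on: a statically linked runtime cannot be intercepted this way.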
Demo Setup
Repository with the pre‑built rCUDA binaries and CUDA 8.0 libraries:
https://github.com/lengrongfu/study-demo/tree/main/gpu/rcuda
rCUDA Server
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64
./rCUDAd -h
# shows usage and version (v16.11.04.02)
./rCUDAd -i # start daemon in interactive mode

rCUDA Client
On a node without a GPU:
export LD_LIBRARY_PATH=/root/rCUDAv16.11.04.02-CUDA8.0/lib
cd Samples/1_Utilities/deviceQuery
make EXTRA_NVCCFLAGS=--cudart=shared
export RCUDA_DEVICE_0=10.20.2.102:0 # remote host IP and GPU index
export RCUDA_DEVICE_COUNT=1
./deviceQuery

The program prints the properties of the remote GPU, confirming successful remote invocation.
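rCUDA exposes remote GPUs through numbered environment variables, so one client can aggregate GPUs from several servers. A sketch for two devices (the second server address is illustrative):

```shell
# Each RCUDA_DEVICE_<n> names one remote GPU as <server-ip>:<gpu-index>,
# and RCUDA_DEVICE_COUNT tells the client how many entries to read.
export RCUDA_DEVICE_COUNT=2
export RCUDA_DEVICE_0=10.20.2.102:0   # GPU 0 on the first server
export RCUDA_DEVICE_1=10.20.2.103:0   # GPU 0 on a second, illustrative server

echo "configured $RCUDA_DEVICE_COUNT remote devices"
```

With this in place, an unmodified CUDA application sees two devices, as if both GPUs were local.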
rCUDA on Kubernetes (Cloud‑Native Deployment)
A DaemonSet can run an rCUDA server on each GPU‑enabled node. The manifest requests one GPU (nvidia.com/gpu: '1'), enables host networking, and runs the server binary in interactive mode.
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: rcuda-server
  namespace: default
spec:
  selector:
    matchLabels:
      app: rcuda-server
  template:
    metadata:
      labels:
        app: rcuda-server
    spec:
      hostNetwork: true
      containers:
        - name: container-1
          image: docker.io/lengrongfu/rcuda-server:v0.0.1
          ports:
            - name: http
              containerPort: 8308
              protocol: TCP
          resources:
            limits:
              cpu: 250m
              memory: 512Mi
              nvidia.com/gpu: '1'
      restartPolicy: Always

The Dockerfile for the server image builds on an Ubuntu CUDA‑8.0 base, downloads the rCUDA tarball, extracts it, and sets PATH and LD_LIBRARY_PATH.
FROM nagayosi/ubuntu_gpu_cuda8:latest
RUN apt-get update && apt-get install -y wget
RUN wget -c http://juniorprincewang.github.io/img/rCUDA/rCUDAv16.11.04.02-CUDA8.0-linux64.tgz
RUN tar -zxf rCUDAv16.11.04.02-CUDA8.0-linux64.tgz
ENV PATH=/usr/local/cuda-8.0/bin:$PATH
ENV LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64:$LD_LIBRARY_PATH
WORKDIR /home/nagayosi/rCUDAv16.11.04.02-CUDA8.0/bin
CMD ["./rCUDAd","-i"]

A Kubernetes Job can act as an rCUDA client. Setting the environment variable RCUDA_DEVICE_0 to the server's address in the form server-ip@port:gpu-index (e.g., 192.168.0.1@8308:0) directs the CUDA sample to the remote GPU.
kind: Job
apiVersion: batch/v1
metadata:
  name: rcuda-client
  namespace: default
spec:
  template:
    metadata:
      labels:
        app: rcuda-client
    spec:
      containers:
        - name: container-1
          image: docker.io/lengrongfu/rcuda-demo:v0.0.2
          command:
            - cuda-sample/1_Utilities/deviceQuery/deviceQuery
          env:
            - name: RCUDA_DEVICE_0
              value: 192.168.0.1@8308:0
          resources:
            limits:
              cpu: 250m
              memory: 512Mi
            requests:
              cpu: 250m
              memory: 512Mi
      restartPolicy: Never

Running the DaemonSet and Job demonstrates remote GPU access within a Kubernetes cluster.
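To exercise the pair, apply both manifests, wait for the client Job to complete, and read its logs; the manifest file names below are illustrative, and the commands assume kubectl is already pointed at the cluster:

```shell
# Deploy the server on every GPU node, run the client Job, and read
# deviceQuery's output (manifest file names are illustrative).
if command -v kubectl >/dev/null; then
  kubectl apply -f rcuda-server-daemonset.yaml
  kubectl apply -f rcuda-client-job.yaml
  kubectl wait --for=condition=complete job/rcuda-client --timeout=120s
  kubectl logs job/rcuda-client
else
  echo "kubectl not found; run these commands against your cluster"
fi
```

If everything is wired up, the Job's log shows the remote GPU's properties, just as deviceQuery did on the bare client node earlier.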
Reference
[1] cuda‑samples: https://github.com/zchee/cuda-sample.git
Infra Learning Club
