vivo Internet Technology
Dec 13, 2023 · Artificial Intelligence
Practice of Multi-NIC Container Network Acceleration for Offline Training
The talk explains how vivo uses a Kubernetes-based solution combining Calico and RoCEv2 to migrate offline training workloads from single-NIC to multi-NIC networking. It covers integrating lossless RDMA, planning the network topology and IP allocation, and using Volcano, SpiderPool, Macvlan, and Multus CNI for efficient container networking.
Cloud Native · Container Networking · Kubernetes