Tagged articles
31 articles
360 Zhihui Cloud Developer
Dec 30, 2025 · Cloud Native

How HBox Boosts GPU Utilization with Multi‑Pool and NUMA‑Aware Scheduling

The HBox scheduling platform tackles large‑scale AI cluster challenges by introducing a three‑pool resource model, priority‑based preemptive scheduling, network‑topology and NUMA‑aware dispatch, and GPU virtualization techniques like MIG and vGPU, dramatically improving GPU utilization, SLA guarantees, and overall cluster efficiency.

AI clusters · GPU scheduling · GPU virtualization
0 likes · 24 min read
Infra Learning Club
Feb 26, 2025 · Cloud Native

How to Contribute to the HAMI Open‑Source Project: A Beginner’s Guide

This guide walks new contributors through the HAMI Kubernetes device‑management middleware, covering its core capabilities, repository structure, development environment setup, build steps, and testing procedures using kwok and fake‑gpu to simulate large‑scale GPU scheduling scenarios.

Device Plugin · GPU virtualization · HAMI
0 likes · 10 min read
Infra Learning Club
Jan 23, 2025 · Cloud Native

Getting Started with GPU Remote Invocation Using rCUDA

This article introduces GPU remote invocation, explains rCUDA's architecture, walks through installing the server and client, demonstrates running CUDA samples on a GPU‑less node, and shows how to deploy rCUDA on Kubernetes with example DaemonSet and Job manifests.

CUDA · Docker · GPU remote invocation
0 likes · 7 min read
Infra Learning Club
Jan 20, 2025 · Fundamentals

How GPU Kernel Virtualization Works: A Deep Dive into the cgpu Project

This article explains the principles of GPU kernel virtualization by analyzing the cgpu project's source code, detailing kernel interception of device operations, the driver’s file_operations, module initialization and cleanup, procfs interfaces, scheduling logic, and compilation steps on Ubuntu 22.04.

GPU virtualization · Kernel Module · Linux kernel
0 likes · 5 min read
Open Source Linux
Dec 18, 2024 · Fundamentals

How GPU Virtualization Powers Multi‑Tenant Computing and Cloud Graphics

GPU virtualization enables multiple tenants to share and isolate GPU resources across graphics rendering, high‑performance computing, and AI workloads. The article details the software stack layers, user‑space API interception, kernel‑level device emulation, hardware support such as SR‑IOV and MIG, and full GPU passthrough approaches.

GPU virtualization · cloud GPU · hardware-virtualization
0 likes · 9 min read
Architects' Tech Alliance
Oct 27, 2024 · Industry Insights

How GPU Virtualization Works: From User‑Space APIs to Hardware Isolation

This article explains why GPU virtualization is needed, compares resource‑sharing and isolation approaches, and details user‑level API interception, remote API forwarding, paravirtualization with virtio, kernel‑level driver interception, and hardware‑level solutions such as vGPU, MIG, and AMD MxGPU.

Container · GPU virtualization · MIG
0 likes · 14 min read
Architects' Tech Alliance
Oct 23, 2024 · Cloud Computing

NVIDIA vGPU vs AMD MxGPU: Architecture, Scheduling, and Virtualization Trade‑offs

This article explains GPU virtualization, comparing NVIDIA's software‑based vGPU and AMD's hardware‑based MxGPU, detailing their architecture, required hardware, licensing, performance indicators, resource scheduling strategies, slicing limits, and the advantages and drawbacks of each approach for virtualized workloads.

AMD MxGPU · GPU virtualization · NVIDIA vGPU
0 likes · 12 min read
AsiaInfo Technology: New Tech Exploration
Aug 30, 2024 · Industry Insights

How GPU Virtualization Powers AI and Cloud Computing: Techniques, Challenges, and Future Directions

This article examines the rapid rise of GPU virtualization as a solution for efficient GPU resource utilization in AI, big data, and high‑performance computing, detailing its concepts, implementation methods across user, kernel, and hardware layers, Kubernetes integration, real‑world use cases, challenges, and emerging research trends.

Device Plugin · GPU virtualization · Kubernetes
0 likes · 25 min read
Cloud Native Technology Community
Mar 11, 2024 · Cloud Native

Harnessing Nvidia GPUs in Kubernetes: Virtualization, Scheduling & Best Practices

This article explains how to combine Nvidia GPUs with Kubernetes, covering CUDA toolkits, device plugins, GPU virtualization techniques such as Time‑Slicing, MPS and MIG, and advanced scheduling options like Volcano, while also outlining practical deployment steps and performance considerations.

Cloud Native · Device Plugin · GPU virtualization
0 likes · 22 min read
Baidu Geek Talk
Aug 2, 2023 · Cloud Native

Baidu Intelligent Cloud GPU Container Virtualization 2.0: Advancements and Full-Scenario Practices

Baidu Intelligent Cloud’s GPU Container Virtualization 2.0 combines user‑mode and kernel‑mode isolation in a dual‑engine design that unifies scheduling of AI compute, rendering and encoding, supports mixed deployment and multi‑scheduler integration, and boosts GPU utilization across inference, offline tasks, autonomous‑driving simulation, and cloud‑gaming workloads.

AI workloads · GPU virtualization · Multi Scheduler
0 likes · 14 min read
DataFunSummit
Jul 1, 2023 · Artificial Intelligence

Alibaba Cloud Native Deep Learning Platform PAI‑DLC: Architecture, Features, and Future Outlook

This article introduces Alibaba Cloud's PAI‑DLC, a cloud‑native deep learning platform that integrates machine‑learning capabilities, containerized services, AI‑aware scheduling, GPU virtualization, elastic training with EasyScale, data access, and observability, and discusses its architecture, key features, and future directions.

AI Platform · Cloud Native · Deep Learning
0 likes · 16 min read
Baidu Intelligent Cloud Tech Hub
Jun 29, 2023 · Artificial Intelligence

How Baidu’s Dual‑Engine GPU Container Virtualization Boosts AI, Rendering, and Cloud Gaming

This article explains Baidu Intelligent Cloud’s GPU container virtualization 2.0, detailing its dual‑engine architecture, resource pooling, and scheduling innovations that isolate AI, rendering, and codec workloads, and showcases real‑world scenarios such as online inference, autonomous‑driving simulation, and cloud gaming to improve GPU utilization.

AI workloads · GPU virtualization · Kubernetes scheduling
0 likes · 14 min read
Baidu Geek Talk
Jan 18, 2023 · Industry Insights

Baidu’s AI IaaS for Autonomous Driving: Architecture, Performance & Cost Savings

Baidu’s Baige AI heterogeneous computing platform delivers an end‑to‑end, low‑cost AI IaaS for autonomous driving, covering data cloud, tiered storage, RapidFS caching, AIAK‑Inference and AIAK‑Training acceleration, GPU container virtualization, and remote GPU pooling. Reported results include up to 5× faster data access, a 391% training speedup, a 90% reduction in inference latency, and a 60% cut in simulation cost.

AI IaaS · GPU virtualization · Performance Optimization
0 likes · 17 min read
Baidu Intelligent Cloud Tech Hub
Dec 14, 2022 · Artificial Intelligence

How Cloud‑Native AI Boosts Resource Efficiency with PaddleFlow

This article explains how cloud‑native AI leverages container‑based architectures and advanced scheduling algorithms, such as resource queues, gang scheduling, bin‑packing, and GPU topology‑aware and ToR‑aware (top‑of‑rack) dispatch, to improve resource and engineering efficiency. It also introduces Baidu’s AI workflow engine PaddleFlow, covering its design, features, and deployment options.

AI workflow · Cloud Native AI · GPU virtualization
0 likes · 25 min read
Baidu Geek Talk
Aug 31, 2022 · Artificial Intelligence

Baidu Intelligent Cloud Launches Cloud-native AI 2.0 to Accelerate AI Engineering

Baidu Intelligent Cloud’s new Cloud‑native AI 2.0 platform tackles AI engineering bottlenecks by offering hybrid‑parallel large‑model training, flexible GPU virtualization, and an AI Accelerate Kit that boosts training efficiency by more than 50% and cuts inference latency by up to 63%, raising GPU utilization from about 13% to about 50%.

AI · AI acceleration · GPU virtualization
0 likes · 15 min read
Baidu Geek Talk
Jul 18, 2022 · Artificial Intelligence

GPU Container Virtualization for AI Heterogeneous Computing: Architecture and Best Practices

The article surveys GPU container virtualization for AI heterogeneous computing, detailing utilization challenges, historical architectures, various virtualization methods, Baidu's dual-engine user- and kernel-space design with isolation and scheduling features, performance benefits, best‑practice scenarios, and deployment guidance, concluding with a technical Q&A.

AI computing · GPU virtualization · MPS
0 likes · 30 min read
Baidu Intelligent Cloud Tech Hub
Jul 13, 2022 · Artificial Intelligence

Unlocking GPU Efficiency: Baidu’s Dual‑Engine Container Virtualization for AI

This article explores Baidu’s GPU container virtualization architecture, detailing the challenges of low GPU utilization in AI workloads, the dual‑engine (user‑space and kernel‑space) isolation mechanisms, mixed‑deployment strategies, performance evaluations, and best‑practice recommendations for maximizing resource efficiency in large‑scale AI deployments.

AI Infrastructure · GPU virtualization · Mixed Scheduling
0 likes · 31 min read
DataFunSummit
Jun 30, 2022 · Artificial Intelligence

MLOps Practices on the Beike Inference Platform: Architecture, Evolution, and Future Plans

This article presents a comprehensive overview of Beike's machine learning platform and its inference service, detailing the platform's architecture, GPU virtualization, cloud‑native migration, MLOps implementation, and future roadmap to achieve cost‑effective, automated AI model deployment at scale.

AI · Cloud Native · GPU virtualization
0 likes · 13 min read
Qingyun Technology Community
Aug 12, 2021 · Artificial Intelligence

How Kubernetes Powers Scalable AI: Building an End‑to‑End Machine Learning Platform

This article explores how Kubernetes, enhanced by KubeSphere and serverless technologies, enables efficient AI workloads through GPU virtualization, multi‑cluster management, secure data sandboxes, automated testing, and scalable storage, illustrating a complete lifecycle from data ingestion to model inference.

AI · GPU virtualization · KubeSphere
0 likes · 20 min read
DataFunTalk
Jun 13, 2021 · Artificial Intelligence

GPU Virtual Sharing for AI Inference Services on Kubernetes

The article presents a GPU virtual‑sharing solution for AI inference workloads that isolates memory and compute resources via CUDA API interception, integrates with Kubernetes using the open‑source aliyun‑gpushare scheduler, and demonstrates doubled GPU utilization and minimal performance loss across multiple tests.

CUDA · GPU virtualization · Kubernetes
0 likes · 16 min read
iQIYI Technical Product Team
May 28, 2021 · Artificial Intelligence

iQIYI GPU Virtual Sharing for AI Inference: Architecture, Isolation, and Scheduling

iQIYI created a custom GPU‑virtual‑sharing system that intercepts CUDA calls to enforce per‑container memory limits, rewrites kernel launches for compute isolation, and integrates with a Kubernetes scheduler extender, allowing multiple AI inference containers to share a single V100 with minimal overhead and more than doubling overall GPU utilization.

AI inference · CUDA · GPU virtualization
0 likes · 16 min read
Architects' Tech Alliance
May 9, 2021 · Industry Insights

What Are the Key Standards and Challenges Shaping China’s Desktop Cloud Landscape?

This white‑paper‑style analysis examines the rapid growth of desktop cloud in China, outlines its definitions, deployment models, core technologies, protocol choices, GPU virtualization options, security architecture, and proposes standardization needs and policy recommendations to guide the industry forward.

Desktop Cloud · GPU virtualization · Security Architecture
0 likes · 14 min read
58 Tech
Oct 28, 2020 · Artificial Intelligence

Optimizing Resource Utilization of 58.com Deep Learning Platform: Practices and Techniques

This article details how 58.com’s end‑to‑end deep‑learning platform was optimized for higher CPU and GPU inference performance using Intel MKL, OpenVINO, mixed TensorFlow deployment, GPU virtualization, and a Prometheus‑Grafana monitoring system, achieving a 37% reduction in GPU usage and a 146% increase in average GPU utilization.

GPU virtualization · Intel MKL · Kubernetes
0 likes · 12 min read
Alibaba Cloud Developer
Apr 28, 2020 · Artificial Intelligence

How Alibaba Cloud Powers AI with Cutting‑Edge Heterogeneous Compute

This article explains how Alibaba Cloud builds a high‑performance AI infrastructure by combining advanced hardware such as Shenlong servers, GPUs, FPGAs, NPUs, and custom interconnects like RDMA, together with virtualization, FPGA‑as‑a‑Service, AIACC, and resource‑pooling technologies to deliver scalable, cost‑effective AI services.

AI hardware · Alibaba Cloud · FPGA as a Service
0 likes · 20 min read
Architects' Tech Alliance
Mar 5, 2019 · Cloud Computing

Comprehensive Overview of Server Virtualization Technologies

This article provides an in‑depth technical overview of server virtualization, covering its historical evolution, CPU, memory, I/O and GPU virtualization techniques, hardware‑assisted extensions such as VT‑x/VT‑d/VT‑c, and the classification of virtualization architectures for modern cloud environments.

CPU virtualization · GPU virtualization · I/O virtualization
0 likes · 11 min read
Architects' Tech Alliance
Aug 9, 2017 · Fundamentals

Understanding NVIDIA GRID vGPU Virtualization and Its Allocation Modes

This article explains NVIDIA GRID vGPU virtualization, detailing how GPUs are partitioned by memory size, the supported hypervisors, the operation of virtual GPU resources, differences between full‑allocation vGPU and GPU pass‑through, licensing requirements, and performance considerations for cloud and data‑center environments.

GPU virtualization · Nvidia · cloud computing
0 likes · 10 min read
Architects' Tech Alliance
Sep 26, 2016 · Cloud Computing

Comprehensive Overview of Server Virtualization Technologies

This article provides a detailed technical overview of server virtualization, covering its historical roots, CPU, memory, I/O and GPU virtualization techniques, hardware-assisted extensions, and various hypervisor architectures, highlighting why virtualization remains essential in modern cloud computing environments.

CPU virtualization · GPU virtualization · I/O virtualization
0 likes · 12 min read