ByteDance Cloud Native
Author

Sharing ByteDance's cloud-native technologies, technical practices, and developer events.

39 Articles
Recent Articles

ByteDance Cloud Native
Oct 11, 2023 · Cloud Native

How Katalyst Memory Advisor Optimizes Kubernetes Memory Management in Mixed Workloads

This article explains the challenges of memory management in mixed Kubernetes workloads and introduces ByteDance's open‑source Katalyst Memory Advisor. It details Kubernetes‑native allocation and reclamation mechanisms, outlines the advisor's architecture and plugins, and describes its interference detection and multi‑level mitigation strategies for improving memory utilization and service quality.

Katalyst · Kubernetes · Resource Optimization
19 min read
ByteDance Cloud Native
Aug 15, 2023 · Cloud Native

What’s New in Katalyst v0.3.0? Core Enhancements Explained

Katalyst v0.3.0 introduces major upgrades, including KCNR API enhancements for bandwidth isolation, a more extensible task and async‑execution framework, advanced mixed‑deployment controls, load‑aware resource prediction, and concurrent unit testing, all aimed at improving cloud‑native resource management efficiency.

Katalyst · Kubernetes · Resource Management
4 min read
ByteDance Cloud Native
Aug 9, 2023 · Cloud Native

How Volcano Engine’s New GPU Sharing Scheduler Boosts AI Workloads by 500%

This article explains Volcano Engine's next‑generation GPU sharing scheduling technology, detailing the two‑layer scheduler, card‑level bin‑pack/spread strategies, system architecture, API definitions, and optimization algorithms that together increase GPU deployment density by over 500% and improve utilization by more than 50% for AI workloads.

GPU Scheduling · Kubernetes · mGPU
13 min read
ByteDance Cloud Native
Jun 13, 2023 · Artificial Intelligence

How Ray and Cloud‑Native Tech Supercharge Large‑Model Offline Inference

This article explains the challenges of large‑model offline (batch) inference, such as GPU memory limits and distributed scheduling, and shows how Ray's cloud‑native architecture, model partitioning, and Ray Datasets can be combined to build efficient, elastic inference frameworks deployed with KubeRay.

Distributed Computing · GPU Memory · Ray
18 min read
ByteDance Cloud Native
Apr 20, 2023 · Cloud Native

How Dragonfly Accelerates Image Distribution and Scales Kubernetes Batch Processing

At KubeCon + CloudNativeCon Europe 2023 in Amsterdam, Volcano Engine and ByteDance presented two technical sessions covering Dragonfly's P2P image distribution best practices and large‑scale Kubernetes batch processing strategies, offering deep insights and real‑world implementations for cloud‑native developers.

Batch Processing · Dragonfly · Image Distribution
4 min read