ByteDance Cloud Native
Author

Sharing ByteDance's cloud-native technologies, technical practices, and developer events.

39 articles · 0 likes · 101 views · 0 comments
Recent Articles

Latest from ByteDance Cloud Native

39 recent articles
ByteDance Cloud Native
Apr 9, 2025 · Artificial Intelligence

How to Deploy ComfyUI Cluster Edition on Volcengine for Multi‑User AI Workflows

This guide explains how to launch the ComfyUI Cluster Edition on Volcengine, covering its enterprise features such as multi‑user collaboration, resource isolation, built‑in plugins, flexible mounting, and step‑by‑step deployment using VKE, CP, and API Gateway to enable efficient, scalable AI image generation.

AI Deployment · ComfyUI · Multi-user collaboration
0 likes · 10 min read
ByteDance Cloud Native
Apr 3, 2025 · Operations

How to Seamlessly Integrate CloudWeGo with APMPlus for Full‑Stack Observability

This article explains the challenges of observability in distributed microservice and LLM architectures, introduces CloudWeGo and APMPlus, and provides step‑by‑step integration guides for the Kitex, Hertz, and Eino frameworks, with code samples and data reporting methods. It also covers advanced monitoring features such as RED metrics, LLM‑specific indicators, and service topology, and closes with the future roadmap.

APM · APMPlus · CloudWeGo
0 likes · 13 min read
ByteDance Cloud Native
Mar 27, 2025 · Operations

Taming High Cardinality in AI & Autonomous Driving with Prometheus

This article shares practical experience from Volcengine's managed Prometheus service and its deep integration with large‑model and autonomous‑driving platforms, explaining what high cardinality is, its impact on monitoring systems, root causes, and a range of design, collection, and analysis techniques to mitigate it.

AI · Prometheus · autonomous driving
0 likes · 12 min read
ByteDance Cloud Native
Mar 20, 2025 · Artificial Intelligence

How to Deploy DeepSeek‑R1 671B on AIBrix: Multi‑Node GPU Inference in Hours

This guide explains how to use the AIBrix distributed inference platform to deploy the massive DeepSeek‑R1 671B model across multiple GPU nodes, covering cluster setup, custom vLLM images, storage options, RDMA networking, autoscaling, request handling, and observability, turning a weeks‑long deployment into an hour‑scale process.

AIBrix · DeepSeek-R1 · Distributed inference
0 likes · 14 min read
ByteDance Cloud Native
Mar 13, 2025 · Backend Development

Inside DeepSeek 3FS: Architecture of a High‑Performance Parallel File System

This article dissects DeepSeek's 3FS parallel file system, detailing its four‑component architecture, high‑throughput RDMA networking, metadata handling with FoundationDB, client access methods, chain replication (CRAQ), custom FFRecord format, and recovery mechanisms, offering a deep technical perspective for storage engineers.

High-performance storage · RDMA · chain replication
0 likes · 22 min read
ByteDance Cloud Native
Feb 13, 2025 · Cloud Computing

Deploy the Full‑Size DeepSeek‑R1 Model on Volcengine Cloud with Terraform and Kubernetes

This guide walks you through two practical solutions for deploying the massive DeepSeek‑R1 model on Volcengine Cloud: one uses Terraform for a quick two‑node GPU setup, and the other uses cloud‑native multi‑node distributed inference with Kubernetes. It covers resource sizing, environment preparation, model download, monitoring, autoscaling, and storage acceleration.

AI · Kubernetes · Terraform
0 likes · 22 min read