Tencent Technical Engineering
Jul 11, 2025 · Artificial Intelligence
How DeepSeek Achieved 15,800+ Tokens/s: Full‑Stack Inference Optimizations
This article details the Angel‑HCF team's end‑to‑end DeepSeek inference optimizations—including PD separation, multi‑layer MTP, EP and DP parallelism, hardware‑aware kernels, and load‑balancing strategies—that boost throughput to over 15,800 tokens per second while keeping per‑token latency under 50 ms.
AI performance · DeepSeek · GPU utilization
13 min read
