Tagged articles

Mooncake

3 articles · Page 1 of 1

Apr 29, 2026 · Artificial Intelligence

Deploy DeepSeek‑V4 on Ascend NPU with Kthena in 3 Minutes (Prefill‑Decode Separation)

This guide walks through deploying the DeepSeek‑V4‑Flash model on Ascend NPU using Kthena’s ModelRoute. It covers the Prefill‑Decode (P/D) separation architecture, KV cache transfer via Mooncake, configuration of the ModelServing and ModelRoute resources, and flexible scaling of Prefill and Decode replicas.

Ascend NPU · DeepSeek-V4 · KV cache

0 likes · 22 min read

Alibaba Cloud Developer

Dec 24, 2025 · Artificial Intelligence

Boosting LLM Inference: RoleBasedGroup & Mooncake for Stable, High‑Performance Service

Large language model inference faces memory pressure, but by externalizing KVCache with Mooncake and orchestrating roles via the Kubernetes‑native RoleBasedGroup (RBG), developers can achieve stable, high‑throughput, cost‑effective serving with seamless in‑place upgrades and topology‑aware performance.

AI Infrastructure · KVCache · Kubernetes

0 likes · 21 min read

Python Programming Learning Circle

Sep 8, 2022 · Big Data

Analyzing Mid‑Autumn Festival Mooncake Sales on Taobao with Python

This article demonstrates how to collect, clean, and visualize Taobao mooncake sales data using Python libraries such as Pandas, Pyecharts, jieba and collections, revealing top‑selling flavors, regional distribution, price ranges and shop rankings through step‑by‑step data‑processing and charting techniques.

Mooncake · Pandas · Pyecharts

0 likes · 4 min read
