Amiya: Dynamic Overcommit Component for Bilibili Offline Big Data Cluster
Amiya is a self‑developed dynamic overcommit component for Bilibili’s offline big‑data cluster. It inflates the resources reported by under‑utilized nodes and scales them back when load rises, adding roughly 683 TB of memory and 137 K vCores to the cluster while raising per‑node memory by about 15 % and CPU usage by over 20 %, with eviction rates kept below 3 %.
This article introduces Amiya, a self‑developed component that addresses resource shortages on Bilibili’s offline big‑data platform. Over the past year, the offline cluster faced two main challenges: rapid growth in node count driving high Pending rates, and the need to improve resource utilization without adding physical machines.
Amiya implements dynamic overcommit on individual physical machines and cooperates with the cloud platform to enable mixed deployment. After rollout, Amiya added approximately 683 TB of allocatable memory and 137 K vCores to the Yarn offline cluster, as shown in Figures 1 and 2.
Architecture (Figure 3) – The system consists of AmiyaContext, StateStoreManager, CheckPointManager, NodeResourceManager, OperatorManager, InspectManager, and AuditManager. Each module’s responsibilities are described, with NodeResourceManager handling the core overcommit logic and OperatorManager providing interaction with Yarn and K8s.
Overcommit Logic – Based on the principle that users request more resources than they actually use, Amiya reports inflated resource amounts to the scheduler when a node’s CPU/Memory usage is low (OverCommit) and reduces reported resources when usage is high (DropOff). The decision process uses thresholds such as OverCommitThreshold, DropOffThreshold, and capacity limits (CPU/MemoryRatio). Three‑level validation (range, magnitude, time interval) prevents excessive oscillation.
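The OverCommit/DropOff decision described above can be sketched as follows. The threshold values, the 1.5× ratio cap, and the 10 % adjustment step are illustrative assumptions (the article names the knobs but not their values), and the three‑level validation is reduced here to the capacity and floor bounds for brevity.

```python
# Hypothetical sketch of Amiya-style overcommit decision logic.
# OverCommitThreshold / DropOffThreshold / MemoryRatio follow the article's
# terminology; the concrete numbers and step sizes are assumptions.
from dataclasses import dataclass

@dataclass
class Node:
    physical_mem_gb: float   # real physical memory
    used_mem_gb: float       # currently used memory
    reported_mem_gb: float   # memory currently reported to the scheduler

def decide(node: Node,
           overcommit_threshold: float = 0.6,
           dropoff_threshold: float = 0.85,
           max_ratio: float = 1.5) -> float:
    """Return the memory amount Amiya should report to the scheduler."""
    usage = node.used_mem_gb / node.physical_mem_gb
    cap = node.physical_mem_gb * max_ratio  # capacity limit (MemoryRatio)
    if usage < overcommit_threshold:
        # OverCommit: inflate reported resources, bounded by the ratio cap
        return min(node.reported_mem_gb * 1.1, cap)
    if usage > dropoff_threshold:
        # DropOff: shrink reported resources back toward physical size
        return max(node.reported_mem_gb * 0.9, node.physical_mem_gb)
    return node.reported_mem_gb  # within band: no change
```

In the real component each proposed change would additionally pass the three‑level validation (value range, change magnitude, and minimum time interval between adjustments) before being reported, which damps oscillation between OverCommit and DropOff.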
Resource‑Limit Optimization – Different machine models (48‑core vs 96‑core) require distinct CPU/Memory ratios. Experiments showed that raising memory overcommit to 1.5× physical memory improves CPU utilization on 48‑core nodes, while 96‑core nodes still hit memory bottlenecks. Adding an extra 128 GB of memory to 96‑core nodes raised the effective memory‑to‑CPU ratio and lifted CPU usage from ~45 % to ~70 % (Figures 8‑10).
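A back‑of‑the‑envelope check of the ratio fix: the 128 GB add‑on comes from the article, but the 384 GB baseline for a 96‑core node is an assumed illustrative figure, not a number from the source.

```python
# Illustrative memory-to-CPU ratio arithmetic for the 96-core case.
# 384 GB baseline is an assumption; the +128 GB upgrade is from the article.
def mem_per_core(mem_gb: float, cores: int) -> float:
    return mem_gb / cores

before = mem_per_core(384, 96)        # 4.0 GB per core: memory-bound
after = mem_per_core(384 + 128, 96)   # ~5.33 GB per core after the upgrade
```

With more memory per core, the scheduler can pack enough containers onto the node for CPU, rather than memory, to become the binding constraint.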
Eviction Strategies – Amiya implements three eviction layers: Container eviction (triggered after DropOff), Application eviction (targeting large‑disk jobs when SSD usage exceeds a threshold), and Node eviction using K8s‑style Taints (OOMTaint, HighLoadTaint, HighDiskTaint, LowResourceTaint, NeedToStopTaint). ExtremeKill is introduced to force eviction of the largest memory‑consuming container when no other containers can be removed.
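The ExtremeKill fallback can be sketched as below. The container fields and the selection flow are assumptions for illustration, not Amiya's actual API; the key idea from the article is that when no container is otherwise removable, the single largest memory consumer is force‑evicted.

```python
# Minimal sketch of ExtremeKill-style victim selection (fields assumed).
from dataclasses import dataclass
from typing import Optional

@dataclass
class Container:
    id: str
    mem_mb: int
    evictable: bool  # normal eviction policy allows removing this container

def pick_victim(containers: list[Container]) -> Optional[Container]:
    """Prefer a normally evictable container; otherwise fall back to
    ExtremeKill and take the largest memory consumer regardless."""
    if not containers:
        return None
    candidates = [c for c in containers if c.evictable]
    pool = candidates or containers  # ExtremeKill fallback path
    return max(pool, key=lambda c: c.mem_mb)
```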
Mixed‑Deployment Mode – Amiya is deployed as a sidecar inside the NodeManager pod in Yarn‑on‑K8s clusters. It receives the pod’s real resource limits via a Unix domain socket, reads cgroup usage, computes the overcommit target, and updates the NodeManager’s resource allocation (Figures 13‑14).
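The sidecar's measure‑and‑compute step can be sketched as follows. The cgroup path, the 1.5× cap, and the function names are illustrative assumptions; only the overall flow (real limit via Unix domain socket → cgroup usage → overcommit target) comes from the article.

```python
# Sketch of the mixed-deployment sidecar's per-cycle computation.
# Path and thresholds are assumptions; cgroup v2 exposes current usage
# in a flat "memory.current" file.
def read_cgroup_mem_bytes(path: str = "/sys/fs/cgroup/memory.current") -> int:
    with open(path) as f:
        return int(f.read().strip())

def overcommit_target(real_limit_bytes: int,
                      used_bytes: int,
                      max_ratio: float = 1.5,
                      dropoff_threshold: float = 0.85) -> int:
    """Compute the resource amount to push to the NodeManager, given the
    pod's real limit (received over the Unix domain socket)."""
    if used_bytes / real_limit_bytes > dropoff_threshold:
        return real_limit_bytes            # DropOff: report only the real limit
    return int(real_limit_bytes * max_ratio)  # OverCommit within the ratio cap
```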
Results – In the offline main cluster (≈5 000 nodes), Amiya contributed 683 TB of memory and 137 K vCores of additional allocatable resources. Daily per‑node gains were 33.26 GB of memory (+15.62 %) and 18.56 % CPU usage (+22.04 % for the dominant configuration). Eviction rates stayed low (0.56 %–2.73 %). In mixed‑deployment clusters, CPU utilization rose by ~10 % after full rollout (Figure 17).
Future Work – Plans include kernel‑level OOM handling, finer‑grained application‑level eviction, and a Master‑Worker architecture for global resource profiling and more flexible max‑ratio overcommit.
Bilibili Tech
Provides introductions and tutorials on Bilibili-related technologies.