Unlocking Resource Efficiency: Alibaba’s Mixed‑Deployment (Co‑location) Strategy
This article explains how Alibaba’s mixed‑deployment (co‑location) technology combines online transaction services and offline compute workloads on shared physical servers, detailing its architecture, scheduling mechanisms, resource‑concession strategies, achieved performance gains, and future directions for large‑scale e‑commerce infrastructure.
Alibaba's mixed‑deployment (co‑location) technology aims to share physical resources between online transaction workloads and offline compute workloads, reducing resource costs while maintaining service quality.
1. Alibaba Mixed‑Deployment Exploration Overview
Mixed‑deployment is motivated by the need to balance ever‑growing business demand against rising resource costs by reusing existing idle resources to support new services. The engineering investment pays off once the resource scale, and therefore the cost, passes a certain threshold.
1.1 Why Adopt Mixed‑Deployment?
Online e‑commerce traffic spikes during events like Double 11, creating massive resource pressure, while offline batch jobs also consume large amounts of compute. By mixing these workloads, Alibaba can achieve higher overall utilization.
1.2 What Is Mixed‑Deployment (Co‑location)?
Mixed‑deployment means placing different types of services on the same physical resources, providing each with a full‑size resource quota while allowing them to share the underlying hardware.
It involves three steps: integrating resources into a shared pool, sharing that pool across workloads, and controlling competition so that high‑priority services win whenever resources are contended.
1.3 Online‑Offline Mixed‑Deployment
Online services (transactions, payments, browsing) require low latency and cannot tolerate delays, whereas offline services (batch computation, reporting) are latency‑tolerant and can be scheduled flexibly. Their differing characteristics enable time‑based and priority‑based resource sharing.
1.4 Alibaba’s Mixed‑Deployment Timeline
2014: Concept of mixed‑deployment proposed.
2015: Offline testing and prototype simulation.
2016: Initial production rollout on ~200 machines for internal users.
2017: Small‑scale production mixed‑deployment supporting Double 11.
2018: Goal to scale to ten‑thousand‑node clusters.
1.5 Achievements of Large‑Scale Mixed‑Deployment
Mixed‑deployment clusters of several thousand nodes validated during Double 11; CPU utilization on online clusters rose from 10% to 40%.
Offline clusters running online services achieved transaction rates of tens of thousands per second during promotions.
Service interference under mixed‑deployment remained below 5%.
2. Mixed‑Deployment Scheme and Architecture
The architecture consists of four layers: infrastructure, resource pool, scheduling, and business‑level control. The existing online scheduler (Sigma) and offline scheduler (Fuxi) are coordinated by a unified "0‑layer" scheduler that arbitrates resource allocation between them.
2.1 Overall Architecture
Resources are first merged into a common pool, then allocated by the scheduling layer, and finally isolated at runtime using kernel mechanisms.
2.2 Online Business Deployment Strategy
Online services are packaged as transaction units, each isolated in its own set of containers. To avoid risk, mixed‑deployment is first applied to a limited set of units before scaling globally.
2.3 Cluster Resource Allocation
CPU is time‑sliced between online and offline tasks, with online tasks given higher priority. Memory is dynamically oversold: offline jobs can use the idle portion of the memory reserved for online containers, minus a safety buffer that protects against sudden online spikes.
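The memory overselling rule can be illustrated with a small calculation. This is a minimal sketch, not Alibaba's implementation: the OnlineContainer type, the offline_memory_budget function, and the 20% safety_buffer_pct default are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class OnlineContainer:
    reserved_mb: int  # memory quota reserved for the online container
    used_mb: int      # memory the container is actually using right now


def offline_memory_budget(containers, safety_buffer_pct=20):
    """Return the memory (MB) offline jobs may borrow from idle online reservations.

    safety_buffer_pct is a hypothetical knob: the share of each reservation
    that is held back so sudden online spikes are absorbed first.
    """
    budget = 0
    for c in containers:
        idle = max(c.reserved_mb - c.used_mb, 0)
        buffer_mb = c.reserved_mb * safety_buffer_pct // 100
        # Only the idle memory beyond the safety buffer is lent out.
        budget += max(idle - buffer_mb, 0)
    return budget


containers = [
    OnlineContainer(reserved_mb=8192, used_mb=2048),  # mostly idle
    OnlineContainer(reserved_mb=4096, used_mb=3900),  # nearly full
]
print(offline_memory_budget(containers))  # 4506
```

Under these assumptions the nearly full container contributes nothing, because its idle memory is smaller than its safety buffer.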
2.4 Promotion‑Time Resource Concession: Fast Up/Down
During large promotions, online sites are quickly scaled up to high capacity and later scaled down, freeing resources for offline workloads in normal periods.
2.5 Daily Resource Concession: Time‑Sharing
Online traffic follows a strong diurnal pattern. By shrinking online capacity during low‑traffic periods, idle resources are offered to offline jobs, achieving higher overall utilization.
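The time‑sharing idea can be sketched as an hourly capacity plan. The function name plan_time_sharing, the headroom_pct parameter, and the sample load figures below are assumptions made for illustration; the article does not describe how Alibaba sizes online capacity hour by hour.

```python
def plan_time_sharing(hourly_online_load, total_cores, headroom_pct=30):
    """For each hour, return (online_cores, offline_cores).

    hourly_online_load: expected online CPU demand in cores, one value per hour.
    headroom_pct: hypothetical extra capacity kept above the expected demand.
    """
    plan = []
    for demand in hourly_online_load:
        # Round up so online services never receive less than demand plus headroom.
        needed = (demand * (100 + headroom_pct) + 99) // 100
        online = min(total_cores, needed)
        plan.append((online, total_cores - online))
    return plan


# Night trough, morning ramp, evening peak on a 100-core pool.
print(plan_time_sharing([20, 25, 90], total_cores=100))
# [(26, 74), (33, 67), (100, 0)]
```

In the trough most of the pool is offered to offline jobs, while at the peak online services reclaim everything.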
3. Core Mixed‑Deployment Technologies
3.1 Kernel Isolation
Isolation is implemented via cgroups for CPU, memory, I/O, and network, providing separate priority levels for online and offline workloads.
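As a rough illustration of priority levels expressed through cgroups, the sketch below builds the values one might write to standard Linux cgroup v2 interface files (cpu.weight, io.weight, memory.max). It is an assumption‑laden example: the article does not say which cgroup version or kernel patches Alibaba uses, the weights 1000 and 10 are arbitrary, the cgroup_settings helper is hypothetical, and network isolation is not covered here.

```python
def cgroup_settings(role, memory_limit_bytes):
    """Return cgroup v2 interface-file values for a workload class.

    role: "online" or "offline" (hypothetical class names).
    The weights are illustrative; cpu.weight and io.weight accept 1..10000.
    """
    if role not in ("online", "offline"):
        raise ValueError(f"unknown role: {role}")
    weight = 1000 if role == "online" else 10
    return {
        "cpu.weight": str(weight),             # relative CPU share under contention
        "io.weight": str(weight),              # relative block I/O share
        "memory.max": str(memory_limit_bytes), # hard memory limit in bytes
    }


# Print the settings instead of writing to /sys/fs/cgroup, so the sketch
# runs without root privileges.
for role in ("online", "offline"):
    for filename, value in cgroup_settings(role, 4 * 1024**3).items():
        print(f"{role}/{filename} = {value}")
```

Because weights only matter under contention, offline tasks can still use the whole CPU when online tasks are idle.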
3.2 Resource Scheduling
Online scheduling (Sigma) uses application resource profiles, container packing, and affinity rules, while offline scheduling (Fuxi) handles batch jobs with multi‑level pipelines. The 0‑layer scheduler coordinates both to ensure fair and priority‑aware allocation.
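The arbitration role of the 0‑layer scheduler can be pictured with a toy priority rule: grant the online request first, then give the offline side whatever remains. This is only a sketch under that assumption. The arbitrate function and its arguments are hypothetical, and the real interfaces between Sigma, Fuxi, and the 0‑layer are not described in this article.

```python
def arbitrate(total_cores, online_request, offline_request):
    """Split a node's cores between an online and an offline scheduler.

    Online requests are served first; offline requests receive what is left.
    All quantities are whole cores.
    """
    online_grant = min(online_request, total_cores)
    offline_grant = min(offline_request, total_cores - online_grant)
    return {"online": online_grant, "offline": offline_grant}


print(arbitrate(64, online_request=40, offline_request=40))
# {'online': 40, 'offline': 24}
print(arbitrate(64, online_request=80, offline_request=10))
# {'online': 64, 'offline': 0}
```

A production arbiter would also need fairness between offline tenants and a way to reclaim cores already granted, both of which this sketch omits.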
4. Future Outlook
Mixed‑deployment will evolve in three directions: scaling to ten‑thousand‑node clusters, supporting more diverse workloads and hardware (including cloud resources), and achieving finer‑grained resource profiling, real‑time scheduling, and more precise kernel isolation.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
