Flexible Compute Scheduling Practices in the Restaurant Industry: A Yum China Case Study
This article examines the challenges of uneven compute resource distribution across China and presents Yum China's practical approaches—including multi‑unit deployment, dual‑data‑center scheduling, and supporting platforms—to achieve flexible, cost‑effective compute scheduling for the restaurant sector.
With the rapid growth of AI, big data, and cloud computing, demand for compute power has surged while resources remain unevenly distributed: the eastern region faces high electricity costs and scarce land, whereas the western region offers abundant clean energy but lacks data and application scenarios. This imbalance prompted the national "East Data, West Compute" strategy.
As a representative of the restaurant industry, Yum China explores flexible compute scheduling to cut costs and meet varied demand, starting from an analysis of its CPU usage patterns: peak utilization stays below 40%, and idle periods cover two-thirds of the day.
01 – Current Situation
The CPU monitoring curve reveals that most of the day the CPU load is low, indicating ample capacity for offline computing and task scheduling.
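The idle-window analysis behind this observation can be sketched in a few lines. The 20% threshold and the sample curve below are illustrative assumptions, not Yum China's actual monitoring data:

```python
# Sketch: locating idle windows in a day of hourly CPU utilization samples.
# The threshold and the sample curve are illustrative assumptions.

IDLE_THRESHOLD = 0.20  # below this, a unit is a candidate for offline jobs

def idle_windows(hourly_cpu):
    """Return (start_hour, end_hour) spans where load stays under the threshold."""
    windows, start = [], None
    for hour, load in enumerate(hourly_cpu):
        if load < IDLE_THRESHOLD and start is None:
            start = hour
        elif load >= IDLE_THRESHOLD and start is not None:
            windows.append((start, hour))
            start = None
    if start is not None:
        windows.append((start, len(hourly_cpu)))
    return windows

# A day shaped like the article describes: quiet overnight, peaks under 40%.
samples = [0.05] * 8 + [0.35, 0.38] + [0.15] * 2 + [0.39] * 2 + [0.10] * 8 + [0.05] * 2
print(idle_windows(samples))  # → [(0, 8), (10, 12), (14, 24)]
```

The resulting windows are where offline computing and batch tasks can be packed without touching online capacity.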
02 – Prerequisites
All business systems, offline computing, and task scheduling services are containerized and support stateless deployment.
Services can be launched within 5 minutes in any IDC unit.
Comprehensive monitoring ensures automatic traffic switchover during failures to keep online services stable.
Failure monitoring and backup plans are required for scheduling failures.
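The automatic traffic switchover required above could be driven by a streak of failed health probes. A minimal sketch, in which the unit names, the three-probe limit, and the even-split routing are all hypothetical:

```python
# Sketch: monitoring-driven traffic switchover between IDC units.
# Unit names, the probe limit, and the routing scheme are hypothetical.

FAILURE_LIMIT = 3  # consecutive failed probes before traffic is drained

class UnitHealth:
    def __init__(self, name: str):
        self.name = name
        self.failures = 0

    def record_probe(self, ok: bool) -> bool:
        """Update the failure streak; return True if the unit should be drained."""
        self.failures = 0 if ok else self.failures + 1
        return self.failures >= FAILURE_LIMIT

def route_weights(units, drained):
    """Spread traffic evenly across units that are not drained."""
    healthy = [u for u in units if u not in drained]
    return {u: (1.0 / len(healthy) if u in healthy else 0.0) for u in units}

unit_b = UnitHealth("idc-b")
drain = False
for ok in (False, False, False):  # three failed probes in a row
    drain = unit_b.record_probe(ok)

print(route_weights(["idc-a", "idc-b"], {"idc-b"} if drain else set()))
# → {'idc-a': 1.0, 'idc-b': 0.0}
```

Requiring a streak rather than a single failed probe avoids flapping traffic on transient errors.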
03 – Solution
1. Multi‑unit deployment model for compute scheduling: systems are isolated into multiple units, so traffic can be shifted unit by unit and compute resources scaled up or down accordingly.
2. Dual‑data‑center scheduling model: Services support lossless scaling; for example, reduce pod count from 4n to n at 22:00 and scale back to 4n before the 10:00 peak.
3. Fault or surge handling: during low‑traffic periods, a failed unit can be bypassed via A/B traffic switching; during peaks or unexpected spikes, a new unit can be provisioned on public cloud within 5 minutes to absorb the extra load.
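The time-based scaling in point 2 reduces to a simple hour-of-day rule. In the sketch below, the base count and the plain function stand in for what would really be a call to the Kubernetes API (e.g. patching a Deployment's `spec.replicas`); the exact scale-up hour is an assumption consistent with "before the 10:00 peak":

```python
# Sketch of the time-based scaling rule: hold n pods overnight after the
# 22:00 scale-down and run 4n through the business day. BASE and the
# scale-up hour are illustrative.

BASE = 4          # n: minimum pod count kept overnight
PEAK_FACTOR = 4   # run 4n pods during business hours

def target_replicas(hour: int) -> int:
    """Desired pod count for a given hour of day (0-23)."""
    if 9 <= hour < 22:   # scale up an hour before the 10:00 peak
        return BASE * PEAK_FACTOR
    return BASE          # overnight: lossless scale-down to n

print(target_replicas(8), target_replicas(12), target_replicas(23))  # → 4 16 4
```

Because the services are stateless (per the prerequisites), dropping from 4n to n pods is lossless: in-flight requests drain and no session state is stranded.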
04 – Supporting Platforms and Tools
Traffic control platform for automatic or manual traffic scheduling and throttling.
Compute scheduling platform to automate scaling of services across units or data centers.
Monitoring and alerting system to provide real‑time health data for automated traffic decisions.
DTS (Data Transfer Service) for cross‑region database and cache synchronization.
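The throttling role of the traffic control platform can be illustrated with a classic token bucket, which caps sustained request rate while still admitting short bursts. This is a generic sketch, not Yum China's implementation; the rate and capacity numbers are illustrative:

```python
# Sketch: request throttling as a traffic control platform might apply it
# while a spike is being absorbed. Rate and capacity are illustrative.

import time

class TokenBucket:
    """Allow `rate` requests/second with short bursts up to `capacity`."""
    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens for the time elapsed since the last request.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = TokenBucket(rate=100.0, capacity=5.0)
results = [bucket.allow() for _ in range(10)]
print(results.count(True))  # roughly the burst capacity passes; the rest are throttled
```

In practice such a limiter buys the 5 minutes needed to bring a new public-cloud unit online before rejecting traffic outright.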
Looking ahead, compute scheduling platforms face both greater opportunities and challenges; advances in intelligence, automation, security, and standardization will be needed to achieve collaborative development and shared benefits.
Yum! Tech Team
How we support the digital platform of China's largest restaurant group—technology behind hundreds of millions of consumers and over 12,000 stores.