How Qunar Automates Hotel Capacity Planning with Predictive Scaling
This article details Qunar's end‑to‑end solution for forecasting traffic spikes, estimating required CPU resources, and automatically scaling hotel services using a combined flow‑calendar, algorithmic prediction, and Ops‑driven auto‑scaling pipeline, improving stability and operational efficiency.
Background
Peak traffic events such as exam‑ticket printing periods, national exams, and major holidays cause sudden surges in hotel booking volume, often exceeding system capacity and leading to performance degradation, throttling, or crashes. Manual capacity calculations for these events are inaccurate and inefficient, prompting the need for an automated, pre‑emptive scaling approach.
Overall Solution
1. System Architecture
The solution integrates three core components:
Flow‑Calendar Platform – aggregates business monitoring data and provides event calendars.
Algorithm Platform – trains predictive models on historical CPU and order data to forecast future CPU demand.
Ops Interface – receives predicted CPU totals, converts them to instance counts, and schedules automatic scaling tasks.
2. Business Process
The event lifecycle consists of nine stages: Pre‑assessment → Pending Evaluation → Evaluating → Evaluation Complete → Task Creation → Scaling → Scale Complete → Review → Closed.
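The nine stages can be modeled as a simple ordered state machine. The sketch below is illustrative only: the `Stage` enum and `next_stage` helper are hypothetical names, and it assumes the lifecycle is strictly linear with no backward transitions, which the article does not confirm.

```python
from enum import Enum


class Stage(Enum):
    """The nine lifecycle stages, listed in the order given in the article."""
    PRE_ASSESSMENT = "Pre-assessment"
    PENDING_EVALUATION = "Pending Evaluation"
    EVALUATING = "Evaluating"
    EVALUATION_COMPLETE = "Evaluation Complete"
    TASK_CREATION = "Task Creation"
    SCALING = "Scaling"
    SCALE_COMPLETE = "Scale Complete"
    REVIEW = "Review"
    CLOSED = "Closed"


# Enum members iterate in definition order, which matches the lifecycle order.
_ORDER = list(Stage)


def next_stage(current: Stage) -> Stage:
    """Return the stage after `current` (assumption: strictly linear flow)."""
    if current is Stage.CLOSED:
        raise ValueError("CLOSED is the final stage")
    return _ORDER[_ORDER.index(current) + 1]
```

For example, `next_stage(Stage.TASK_CREATION)` returns `Stage.SCALING`.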
Key steps include:
Collect CPU core data per application and environment from Ops.
Estimate peak order/QPS values for the event and request a prediction from the algorithm platform.
The algorithm platform returns the total CPU cores needed; Ops translates this into the required number of instances and creates timed scaling tasks.
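The conversion from predicted CPU cores to instance counts is not spelled out in the article. A minimal sketch, assuming every instance of an application has the same fixed core allocation and that partial instances are rounded up (function names and parameters are hypothetical):

```python
import math


def instances_needed(total_cores: float, cores_per_instance: int) -> int:
    """Smallest instance count whose combined cores cover `total_cores`.

    Assumption: all instances of the application have the same core count.
    """
    if cores_per_instance <= 0:
        raise ValueError("cores_per_instance must be positive")
    if total_cores < 0:
        raise ValueError("total_cores must be non-negative")
    return math.ceil(total_cores / cores_per_instance)


def instances_to_add(total_cores: float, cores_per_instance: int,
                     current_instances: int) -> int:
    """Extra instances a scale-out task would create (never negative)."""
    target = instances_needed(total_cores, cores_per_instance)
    return max(0, target - current_instances)
```

With illustrative numbers, a prediction of 130 cores on 8-core instances gives 17 instances; if 12 are already running, the scaling task would add 5.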
3. Event Types
Three event categories are supported:
EXAM : Exam‑ticket printing spikes (e.g., graduate or civil service exams).
HOLIDAY : Regular holiday travel peaks without abnormal spikes.
ACTIVITY : Promotional activities such as flash sales that cause sudden traffic spikes.
4. Prediction Methodology
The peak business volume is estimated using the formula:
Estimated Peak = Baseline × (1 + Business Growth Rate)
The business growth rate is calculated as:
Business Growth Rate = Historical Growth × (1 + Natural Growth + External Impact)
Key metrics include historical growth, natural growth, external impact, and baseline values, all derived from manual inputs or automated calculations.
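A short worked example of the two formulas follows. The function names are hypothetical and the input values are made up for illustration; they are not figures from Qunar. All rates are expressed as fractions (0.5 means 50%).

```python
def business_growth_rate(historical_growth: float,
                         natural_growth: float,
                         external_impact: float) -> float:
    """Growth Rate = Historical Growth x (1 + Natural Growth + External Impact)."""
    return historical_growth * (1 + natural_growth + external_impact)


def estimated_peak(baseline: float, growth_rate: float) -> float:
    """Estimated Peak = Baseline x (1 + Business Growth Rate)."""
    return baseline * (1 + growth_rate)


# Illustrative inputs only: 50% historical growth for this event type,
# 25% natural growth, 25% external impact, baseline of 10,000 orders.
rate = business_growth_rate(0.5, 0.25, 0.25)   # 0.5 * 1.5 = 0.75
peak = estimated_peak(10_000, rate)            # 10,000 * 1.75 = 17,500
```

Note that natural growth and external impact scale the historical growth rather than adding to it directly, so a historical growth of zero yields an estimated peak equal to the baseline.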
5. Evaluation and Metrics
After scaling, the system evaluates prediction accuracy using:
Coverage Rate = |Predicted ∩ Actual| / |Actual|
Accuracy Rate = |Predicted ∩ Actual| / |Predicted|
Additional indicators are also tracked, such as MAPE (mean absolute percentage error, 0.08), order‑CPU correlation (0.91), and average CPU usage (32.5).
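These metrics can be sketched as follows. The article does not say what the "Predicted" and "Actual" sets contain; this example assumes they are sets of application names (those the model flagged for scaling versus those that actually needed it). Under that reading, coverage corresponds to recall and accuracy to precision. MAPE uses its standard definition. All names and sample data are hypothetical.

```python
def coverage_rate(predicted: set, actual: set) -> float:
    """|Predicted ∩ Actual| / |Actual| (recall under the set assumption)."""
    if not actual:
        raise ValueError("actual set must not be empty")
    return len(predicted & actual) / len(actual)


def accuracy_rate(predicted: set, actual: set) -> float:
    """|Predicted ∩ Actual| / |Predicted| (precision under the set assumption)."""
    if not predicted:
        raise ValueError("predicted set must not be empty")
    return len(predicted & actual) / len(predicted)


def mape(actual_values: list, predicted_values: list) -> float:
    """Mean absolute percentage error; actual values must be non-zero."""
    if not actual_values or len(actual_values) != len(predicted_values):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(a - p) / abs(a) for a, p in zip(actual_values, predicted_values)]
    return sum(errors) / len(errors)


# Made-up sample data for illustration.
predicted_apps = {"app_a", "app_b", "app_c", "app_d"}
actual_apps = {"app_b", "app_c", "app_d", "app_e", "app_f"}
coverage = coverage_rate(predicted_apps, actual_apps)   # 3 / 5
accuracy = accuracy_rate(predicted_apps, actual_apps)   # 3 / 4
```

Tracking both rates matters because they fail in opposite directions: low coverage means applications that needed capacity were missed, while low accuracy means capacity was provisioned for applications that did not need it.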
6. Scaling Policies
Safety limits are enforced:
Maximum replica limit – prevents creating more instances than downstream resources allow.
Minimum replica limit – ensures a lower bound based on a configurable percentage.
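The two limits amount to clamping the requested replica count. The sketch below is an assumption-heavy illustration: the article does not state what the minimum percentage is applied to, so this example applies it to the current replica count, and it lets the maximum limit win if the two bounds conflict. `clamp_replicas` and its parameters are hypothetical names.

```python
import math


def clamp_replicas(requested: int, current: int,
                   max_replicas: int, min_fraction: float) -> int:
    """Bound `requested` between a percentage-based floor and a hard ceiling.

    Assumptions (not from the article): the floor is `min_fraction` of the
    current replica count, and the ceiling takes priority over the floor.
    """
    if not 0 <= min_fraction <= 1:
        raise ValueError("min_fraction must be between 0 and 1")
    floor = math.ceil(current * min_fraction)
    # Apply the floor first, then the ceiling, so max_replicas is never exceeded.
    return min(max(requested, floor), max_replicas)
```

With illustrative values (20 current replicas, a ceiling of 40, a floor of 50%), a request for 50 replicas is capped at 40 and a request for 3 is raised to 10.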
7. Review and Continuous Improvement
Post‑event reviews compare predicted and actual growth to refine models. The model is retrained regularly with recent peak events, and offline validation ensures new models meet or exceed online performance.
8. Project Impact and Value
Key outcomes:
150+ applications onboarded, covering >90% of hotel service CPU capacity.
Average coverage of predicted applications: 96%.
Average prediction accuracy: 89%.
Each peak event saves ~3 person‑days of manual ops, totaling ~270 person‑days annually.
Resource prediction reduces manual provisioning by ~20%.
9. Future Plans
Planned enhancements include expanding intelligent scaling to all applications and infrastructure layers (including physical servers, KVM, DB/Redis), improving algorithm accuracy with AI, and extending the solution across all business lines for enterprise‑wide resource orchestration.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
