Mid‑ and Long‑Term Monthly Hotel Room‑Night Forecasting under Pandemic Conditions
This article presents a pandemic‑aware method for predicting national hotel monthly room‑nights over the next six months, detailing data augmentation, feature engineering, LSTM and SARIMA‑LASSO modeling, scenario‑based risk assessment, and evaluation results that demonstrate accurate forecasts despite COVID‑19 disruptions.
Abstract This paper shares a method for mid‑ to long‑term monthly hotel room‑night forecasting under pandemic conditions, noting that traditional time‑series models struggle because COVID‑19 disrupts historical trend and seasonality patterns.
Background Predicting 1‑6 month ahead hotel room‑nights is crucial for budgeting, planning, and decision‑making, yet the pandemic introduces heightened uncertainty and makes accurate forecasting more difficult.
Problem Definition The target is the national hotel monthly room‑night count (excluding cancelled orders). The problem is split into two sub‑tasks: (1) mid‑term forecasting of the next month’s average room‑nights, and (2) long‑term forecasting of the average for months 2‑6 ahead.
Mid‑Term Forecasting Scheme To mitigate data scarcity, a 30‑day sliding window is moved over the daily data to generate daily samples, which enlarges the training set. Pandemic impact is quantified with a search‑engine pandemic index. Features are grouped into (i) fixed time‑related information (weekday, month, holidays), (ii) external signals (UV, pandemic index), and (iii) historical room‑night variables (last year’s same month, the two‑year lag, and current pre‑orders). The model is an LSTM network. Evaluation uses Mean Absolute Percentage Error (MAPE), computed as avg(|actual − prediction| / actual). Results are shown below:
Month | MAPE
22‑07 | 0.20%
22‑08 | 0.21%
22‑10 | 0.03%
22‑12 | 0.27%
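As a minimal sketch of the MAPE formula stated above, avg(|actual − prediction| / actual), the function below computes the metric for a list of monthly values. The function name and the sample numbers are illustrative assumptions and are not taken from the article.

```python
def mape(actual, predicted):
    """Mean absolute percentage error: avg(|actual - prediction| / actual)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Assumes every actual value is positive, as room-night totals are.
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Illustrative numbers only, not taken from the article.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
score = mape(actual, predicted)
print(f"MAPE = {score:.2%}")
```

The per-month errors here are 10%, 5%, and 0%, so the average is 5%.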
Model performance on 2021‑09 to 2021‑12 shows MAPE values ranging from 0.08% to 12%, indicating good accuracy for most months.
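The 30‑day sliding window used for data augmentation in the mid‑term scheme can be sketched as follows. The article does not spell out how each sample is built, so this sketch assumes that every anchor day yields one sample whose feature is the mean of the preceding 30 days and whose target is the mean of the following 30 days. The function name and the synthetic series are hypothetical.

```python
def sliding_window_samples(daily, window=30):
    """Build one (history_mean, target_mean) sample per eligible day.

    Assumption: for anchor index t, the feature is the mean of the `window`
    days before t and the target is the mean of the `window` days from t on.
    """
    samples = []
    for t in range(window, len(daily) - window + 1):
        history = daily[t - window:t]
        future = daily[t:t + window]
        samples.append((sum(history) / window, sum(future) / window))
    return samples


# 90 synthetic days with values 1, 2, ..., 90 (not real room-night data).
daily = [float(d) for d in range(1, 91)]
samples = sliding_window_samples(daily)
print(len(samples), samples[0], samples[-1])
```

With these 90 days the window produces 31 samples, whereas three non‑overlapping 30‑day blocks would give only two feature/target pairs. That difference is the point of the augmentation.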
Long‑Term Forecasting Scheme A scenario‑based approach is adopted. First, a SARIMA model trained on pre‑pandemic data provides a baseline “normal” forecast. Then a LASSO model learns the ratio between the normal forecast and the actual values from pandemic‑related features, so that the baseline can be adjusted for different pandemic scenarios (low, medium, high risk).
Key features include the logarithms of last year’s room‑nights and of current pre‑orders, log‑scaled year‑over‑year ratios, the search‑engine pandemic index, average daily confirmed cases, and a policy indicator for local New Year restrictions.
The workflow consists of five steps:
(1) build a SARIMA model on 2018‑2019 data;
(2) generate normal forecasts from 2020 onward;
(3) compute modify_rate = normal_forecast / actual;
(4) fit modify_rate with the features listed above;
(5) predict modify_rate under each scenario and obtain the final forecast by dividing normal_forecast by the predicted modify_rate.
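The arithmetic of steps (3) and (5) can be illustrated with a short sketch. It leaves out the SARIMA and LASSO fitting entirely: the normal forecasts, the actual values, and the per‑scenario predicted modify_rate values below are made‑up placeholders, and the helper names are hypothetical.

```python
def modify_rates(normal_forecast, actual):
    """Step 3: modify_rate = normal_forecast / actual, per month."""
    return [n / a for n, a in zip(normal_forecast, actual)]


def adjust_forecast(normal_forecast, predicted_rates):
    """Step 5: final forecast = normal_forecast / predicted modify_rate."""
    return [n / r for n, r in zip(normal_forecast, predicted_rates)]


# Illustrative numbers only (not from the article), in arbitrary units.
normal_hist = [120.0, 150.0]  # SARIMA "normal" forecasts for past months
actual_hist = [60.0, 100.0]   # observed room-nights for the same months
rates = modify_rates(normal_hist, actual_hist)  # training targets for step 4

# Hypothetical predicted modify_rate per scenario. In the article these
# come from the LASSO model evaluated on scenario-specific feature values.
scenario_rates = {"low": 1.25, "medium": 2.0, "high": 4.0}
normal_next = 200.0           # SARIMA "normal" forecast for a future month
final = {name: adjust_forecast([normal_next], [rate])[0]
         for name, rate in scenario_rates.items()}
print(rates, final)
```

A modify_rate above 1 means the pandemic pushed actual demand below the normal baseline, so a higher predicted rate in a riskier scenario gives a lower final forecast.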
Risk thresholds are derived from historical percentiles of the pandemic index and confirmed‑case averages, defining low, medium, and high risk zones. Validation on July‑December 2021 shows that actual values fall within the predicted risk intervals and that the risk classification aligns with observed pandemic severity.
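A percentile‑based risk classification of this kind might look like the sketch below. The article does not give the actual cut points, so the 50th and 80th percentiles are assumptions chosen only for illustration. The sketch also handles a single indicator, while the article combines the pandemic index with confirmed‑case averages. All names and data are hypothetical.

```python
import statistics


def risk_thresholds(history, low_pct=50, high_pct=80):
    """Return (low_cut, high_cut) taken from percentiles of past values.

    The default 50th/80th percentiles are an assumption for illustration.
    """
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = statistics.quantiles(history, n=100, method="inclusive")
    return cuts[low_pct - 1], cuts[high_pct - 1]


def classify(value, low_cut, high_cut):
    """Map an indicator value to a low, medium, or high risk zone."""
    if value < low_cut:
        return "low"
    if value < high_cut:
        return "medium"
    return "high"


history = list(range(101))  # synthetic pandemic-index history: 0..100
low_cut, high_cut = risk_thresholds(history)
labels = [classify(v, low_cut, high_cut) for v in (10, 65, 95)]
print(low_cut, high_cut, labels)
```

Each risk zone can then be mapped to representative feature values and fed to the fitted model to produce one forecast per scenario.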
Conclusion and Outlook The study demonstrates that combining data augmentation, pandemic‑aware features, and hybrid SARIMA‑LASSO/LSTM models yields reliable mid‑ and long‑term hotel demand forecasts under COVID‑19. Future work will refine pandemic quantification (e.g., accounting for regional population density and tourism demand) and integrate epidemic trajectory forecasts to further improve scenario‑based predictions.
Ctrip Technology
Official Ctrip Technology account, sharing and discussing growth.