Meta Reinforcement Learning Framework for Predictive Autoscaling in Cloud Environments
This article presents a cloud-native, end‑to‑end autoscaling solution that integrates traffic forecasting, CPU utilization meta‑prediction, and a reinforcement‑learning‑based scaling decision module into a fully differentiable system, achieving higher resource utilization and cost efficiency, as demonstrated in an ACM SIGKDD 2022 paper.
Rationalizing data center resources is a challenging problem: many of Ant Group's applications run at low average CPU utilization (below 10%). To improve efficiency, the team built an intelligent, fully managed capacity system that performs both scheduled and traffic‑aware predictive autoscaling.
The solution, described in the ACM SIGKDD 2022 paper "A Meta Reinforcement Learning Framework for Predictive Autoscaling in the Cloud," combines traffic prediction and scaling decisions into a single, fully differentiable reinforcement‑learning pipeline, outperforming existing SOTA methods.
Workload Forecaster: A lightweight attentional encoder‑decoder model predicts future traffic from historical data, explicitly decomposing periodicity and leveraging attention for multi‑step forecasts.
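The forecaster's two key ideas, explicit periodicity decomposition and attention over history, can be sketched in miniature. This is an illustrative toy, not the paper's architecture: the random projections standing in for learned query/key embeddings, the Laplace-free phase encoding, and all parameter names are assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def forecast(history, period, horizon, d=8, seed=0):
    """Toy attentional forecaster: decompose out the periodic mean,
    then predict each future step as an attention-weighted mix of
    historical residuals added back onto the periodic component."""
    rng = np.random.default_rng(seed)
    t = np.arange(len(history))
    # explicit periodicity decomposition: per-phase mean of the series
    phase_mean = np.array([history[t % period == p].mean() for p in range(period)])
    residual = history - phase_mean[t % period]
    # random projections stand in for learned query/key embeddings
    Wq = rng.normal(size=(1, d))
    Wk = rng.normal(size=(1, d))
    keys = (t[:, None] % period / period) @ Wk   # encode each step's phase
    preds = []
    for h in range(1, horizon + 1):
        future_t = len(history) + h - 1
        query = np.array([[future_t % period / period]]) @ Wq
        attn = softmax(keys @ query.ravel())     # attend to similar phases
        preds.append(phase_mean[future_t % period] + attn @ residual)
    return np.array(preds)
```

On a purely periodic series the residual vanishes and the forecast reproduces the periodic component exactly; the attention path matters only for the non-periodic remainder, mirroring the decomposition idea.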
CPU Utilization Meta‑Predictor: Using a meta‑learning approach (Attentive Neural Process), a single model maps traffic features to CPU usage across thousands of services, producing task embeddings that inform downstream decisions.
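A minimal sketch of the deterministic cross-attention path of an Attentive Neural Process conveys the mechanism: one shared function serves every service, and only the per-service context set differs. The Laplace kernel here stands in for learned attention, and the mean-pooled "task embedding" is a crude proxy for the latent path; both are assumptions of this sketch.

```python
import numpy as np

def anp_predict(ctx_x, ctx_y, target_x, length_scale=0.5):
    """ANP-style deterministic path: cross-attention from target traffic
    features to a service's context (traffic, CPU) pairs yields per-service
    CPU predictions; a pooled context summary acts as the task embedding."""
    # Laplace-kernel attention scores between targets and contexts
    dist = np.abs(target_x[:, None] - ctx_x[None, :])
    scores = np.exp(-dist / length_scale)
    attn = scores / scores.sum(axis=1, keepdims=True)
    preds = attn @ ctx_y                      # attention-weighted CPU estimate
    task_embedding = np.stack([ctx_x, ctx_y]).mean(axis=1)  # per-service summary
    return preds, task_embedding
```

Because the model conditions on each service's own context pairs, two services with different traffic-to-CPU curves get different predictions from the same weights, which is what lets a single meta-model cover thousands of services.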
Scaling Decider: A meta model‑based reinforcement‑learning agent treats autoscaling as a Markov Decision Process, where the state includes traffic forecasts, CPU embeddings, and utilization; the reward balances target CPU usage and scaling frequency; actions are scaling ratios. Model‑based RL accelerates convergence by incorporating learned dynamics.
The MDP formulation defines state, reward, action, and transition functions, with the transition model learned from historical data to simulate environment dynamics.
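The reward and transition structure described above can be sketched as follows. The target utilization, churn penalty weight, and per-request CPU cost are assumed constants for illustration; in the paper the transition model is learned from historical data rather than hand-written.

```python
TARGET_CPU = 0.25    # utilization target (assumed; tuned in practice)
FREQ_PENALTY = 0.1   # weight on scaling churn (assumed)

def reward(cpu_util, scaled):
    """Balance closeness to the CPU target against scaling frequency."""
    return -abs(cpu_util - TARGET_CPU) - FREQ_PENALTY * float(scaled)

def transition(replicas, traffic, ratio, cpu_per_req=0.002):
    """Toy stand-in for the learned dynamics model: apply the scaling
    ratio, then compute per-replica CPU utilization from traffic."""
    new_replicas = max(1, round(replicas * ratio))
    cpu_util = min(1.0, traffic * cpu_per_req / new_replicas)
    return new_replicas, cpu_util

# one simulated step: scale out by 1.5x under 2000 req/s
replicas, cpu = transition(10, 2000.0, ratio=1.5)
r = reward(cpu, scaled=True)
```

With a learned `transition` in place of this toy, the agent can roll out candidate scaling ratios offline and pick actions without probing production, which is what makes the model-based approach converge faster than model-free RL.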
Offline experiments and online deployments show the system maintains CPU utilization around 25% during peak hours and significantly improves resource efficiency, moving towards a serverless‑like experience for online services.
References include works on attentive neural processes, deep reinforcement learning, soft actor‑critic, and model‑based RL.