Architects' Tech Alliance
Feb 22, 2019 · Operations
Performance Monitoring and Analysis in Large‑Scale Data Centers: Challenges and Practices
This article presents Alibaba's experience with performance monitoring in large‑scale data centers: the challenges posed by software and hardware upgrades, the SPEED platform's estimation‑evaluation‑decision workflow, the RUE metric, and practical insights on hyper‑threading effects, hardware heterogeneity, and Simpson's paradox.
Hardware Optimization · Java · Performance Monitoring
16 min read