Tagged articles
2 articles
Architects' Tech Alliance
Feb 22, 2019 · Operations

Performance Monitoring and Analysis in Large‑Scale Data Centers: Challenges and Practices

The article presents Alibaba's experience in large‑scale data‑center performance monitoring, describing the challenges of software and hardware upgrades, the SPEED platform’s estimation‑evaluation‑decision workflow, the RUE metric, and practical insights such as hyper‑threading effects, hardware heterogeneity, and Simpson’s paradox.

Benchmarking · Hardware Optimization · Java
16 min read
Alibaba Cloud Developer
Feb 20, 2019 · Operations

Optimizing Large‑Scale Data Center Performance: Alibaba’s SPEED Platform Insights

This article explores how Alibaba tackles the challenges of performance monitoring and analysis in massive data centers, introducing the SPEED platform’s Estimation‑Evaluation‑Decision‑Validation workflow, the RUE metric, hardware heterogeneity issues, and practical lessons such as hyper‑threading pitfalls and Simpson’s paradox.

Data Center Performance · Hardware Heterogeneity · Performance Monitoring
18 min read