Alibaba Cloud Infrastructure
Mar 24, 2021 · Cloud Computing
LIBRA and CARE: Memory Bandwidth Management and Fault‑Tolerance Innovations Presented at HPCA 2021
The article reviews two HPCA 2021 papers from Alibaba Cloud: LIBRA, a dynamic memory-bandwidth management framework that boosts data-center utilization, and CARE, a cache-based fault-tolerance architecture that delivers near-Chipkill reliability with minimal overhead. It also highlights future research directions in ML systems, quantum computing, and cache computing.
HPCA2021 · cloud computing · data center reliability