
What’s Next for High‑Bandwidth Memory? HBM4‑8 Roadmap 2026‑2038

The article outlines the upcoming high‑bandwidth memory (HBM) generations—from HBM4 to HBM8—detailing their planned release years, data rates, bandwidth, capacity, architectural innovations, and the shift toward memory‑centered computing for AI and high‑performance workloads.


A next‑generation HBM technology roadmap (2026‑2038) was presented by the KAIST Memory Systems Laboratory and the TERA Interconnect & Packaging team, covering the evolution from HBM4 to HBM8.

HBM4: Planned for 2026, it marks the start of a modular architecture with 2048 I/O pins, an 8 Gbps data rate, >2 TB/s bandwidth, 12‑16 stack layers, and 24 Gb per die (36‑48 GB per stack). A custom base die integrates an NMC processor and an LPDDR controller, boosting system‑level memory capacity by ~40%.
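The per-stack capacity figure follows directly from die density times layer count. A minimal sketch of that arithmetic (function and constant names are illustrative; the figures come from the roadmap summary above):

```python
# Per-stack HBM capacity = DRAM die density x number of stacked dies.
# Die density and layer counts are taken from the HBM4 roadmap entry above.
HBM4_DIE_DENSITY_GBIT = 24  # 24 Gb per DRAM die

def stack_capacity_gb(layers: int, die_density_gbit: int) -> float:
    """Return the capacity of one HBM stack in gigabytes."""
    return layers * die_density_gbit / 8  # 8 bits per byte

# 12-high and 16-high HBM4 stacks:
print(stack_capacity_gb(12, HBM4_DIE_DENSITY_GBIT))  # 36.0 GB
print(stack_capacity_gb(16, HBM4_DIE_DENSITY_GBIT))  # 48.0 GB
```

This also confirms the density must be 24 Gb per die, not 24 GB: a 24 GB die across 12‑16 layers would yield 288‑384 GB, far above the stated stack capacity.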

HBM5: Expected in 2029, it focuses on a "compute‑near‑memory" 3D heterogeneous architecture. It keeps the 8 Gbps rate but expands the TSV channels to 4096, raising bandwidth to 4 TB/s and capacity to 80 GB. It introduces 3D near‑memory computing (NMC), with processor cores and L2 cache stacked on the DRAM, delivering up to 3× performance for memory‑bound GEMM workloads.
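Why near-memory compute helps memory-bound GEMM can be illustrated with the standard roofline model: attainable throughput is the minimum of the compute peak and bandwidth times arithmetic intensity, so at low intensity throughput scales with bandwidth. The peak and intensity numbers below are illustrative assumptions, not roadmap figures:

```python
# Roofline model sketch: for memory-bound kernels, throughput is capped by
# memory bandwidth x arithmetic intensity rather than by compute peak.
# All numeric values here are illustrative assumptions, not roadmap data.

def attainable_tflops(peak_tflops: float, bandwidth_tbps: float,
                      intensity_flop_per_byte: float) -> float:
    """Roofline: min(compute peak, bandwidth x arithmetic intensity)."""
    return min(peak_tflops, bandwidth_tbps * 1000 * intensity_flop_per_byte / 1000)

# A low-intensity (memory-bound) kernel: doubling bandwidth doubles throughput.
low  = attainable_tflops(100.0, 4.0, 10.0)  # 40 TFLOPS, bandwidth-limited
high = attainable_tflops(100.0, 8.0, 10.0)  # 80 TFLOPS
# A high-intensity kernel saturates the compute roof instead:
capped = attainable_tflops(100.0, 8.0, 50.0)  # 100 TFLOPS, compute-limited
```

Stacking compute next to the DRAM raises the effective bandwidth seen by such kernels, which is the mechanism behind the claimed GEMM speedup.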

HBM6: Scheduled for 2032, the data rate doubles to 16 Gbps, bandwidth jumps to 8 TB/s, and capacity reaches 96‑120 GB. It employs a "four‑tower" structure merging four stack units, with a silicon‑glass hybrid interposer for wider GPU connections, doubling LLM inference throughput versus HBM4. An embedded L3 cache cuts HBM accesses by 73% and reduces overall power by 40%.

HBM7: Anticipated in 2035, it aims for memory‑storage integration by combining HBM with high‑bandwidth flash (HBF) into a heterogeneous storage network. The data rate climbs to 24 Gbps, total bandwidth to 24 TB/s, module capacity to as much as 192 GB, and I/O channels to 8192. It stacks 128 layers of NAND flash to form the HBF, linked to HBM via high‑bandwidth H2F links, creating a 17.6 TB hierarchical storage architecture.

HBM8: Projected for 2038, it is based on "full 3D integration" and "memory‑centered computing". The data rate reaches 32 Gbps, bandwidth 64 TB/s, module capacity 240 GB, and I/O channels 16384. GPUs are stacked directly on HBM, halving compute latency and moving matrix‑operation paths into the memory module, boosting token‑generation speed by nearly 7×.
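Across the generations, the headline bandwidth figures are simply the I/O width times the per-pin data rate. A quick check in Python, covering the generations whose I/O counts the roadmap states (HBM6's width is not given above; small gaps versus the rounded headline numbers, such as HBM8's 64 TB/s versus 65.5 TB/s decimal, come from rounding):

```python
# Peak per-stack bandwidth = I/O width x per-pin data rate.
# (io_width, gbps_per_pin) pairs are taken from the roadmap entries above;
# HBM6 is omitted because its I/O width is not stated there.

def peak_bandwidth_tbps(io_width: int, gbps_per_pin: float) -> float:
    """Peak stack bandwidth in decimal TB/s."""
    return io_width * gbps_per_pin / 8 / 1000  # bits -> bytes, GB -> TB

roadmap = {
    "HBM4": (2048, 8),    # headline: >2 TB/s
    "HBM5": (4096, 8),    # headline: 4 TB/s
    "HBM7": (8192, 24),   # headline: 24 TB/s
    "HBM8": (16384, 32),  # headline: 64 TB/s (65.5 TB/s decimal)
}

for gen, (io, rate) in roadmap.items():
    print(f"{gen}: {peak_bandwidth_tbps(io, rate):.2f} TB/s")
```

The doubling pattern is visible in the formula: each generation grows either the I/O width, the per-pin rate, or both.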

Tags: future technology, AI hardware, HBM, High Bandwidth Memory
Written by

Architects' Tech Alliance

Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
