What’s Next for HBM? Roadmap from HBM4 to HBM8 (2026‑2038)

This article outlines the next‑generation high‑bandwidth memory (HBM) roadmap from HBM4 to HBM8, detailing launch timelines, architectural shifts, I/O counts, data rates, bandwidth, capacities, and novel integration concepts such as modular stacks, near‑memory compute, four‑tower structures, memory‑storage convergence, and full 3D integration.

Architects' Tech Alliance

Jul 2, 2025

The KAIST Memory Systems Laboratory and TERA Interconnect & Packaging team present an overview of the next‑generation HBM technology roadmap covering HBM4 through HBM8 (2026‑2038).

HBM4: Planned for release in 2026 as the starting point of a modular architecture. It features 2,048 I/O pins, an 8 Gbps per-pin data rate, and total bandwidth exceeding 2 TB/s. Stack height grows to 12 or 16 layers; with a per-die capacity of 24 Gb, that gives a per-stack capacity of 36 to 48 GB. A custom base die integrates a near-memory compute (NMC) processor and an LPDDR controller, boosting system-level memory capacity by 40%.
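The HBM4 capacity range follows directly from the per-die capacity and the layer count. A minimal Python sketch of that arithmetic, assuming every layer is a 24 Gb DRAM die and 8 Gb per GB; the function name is illustrative and not taken from the roadmap:

```python
def stack_capacity_gb(die_capacity_gbit: float, layers: int) -> float:
    """Capacity of one stack in gigabytes.

    Assumes every layer is a DRAM die of the same capacity
    and converts gigabits to gigabytes (8 bits per byte).
    """
    return die_capacity_gbit * layers / 8


# HBM4 figures quoted in the article: 24 Gb dies, 12 or 16 layers.
for layers in (12, 16):
    print(f"{layers} layers -> {stack_capacity_gb(24, layers):.0f} GB")
```

Under these assumptions, 12 layers yield 36 GB and 16 layers yield 48 GB, matching the range quoted for HBM4.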

HBM5: Expected in 2029, shifting focus to a 3‑D heterogeneous "compute‑close‑to‑memory" architecture. It retains the 8 Gbps per-pin data rate while doubling the I/O count to 4,096, raising bandwidth to 4 TB/s and capacity to 80 GB. The design enables 3‑D near‑memory compute (NMC) by stacking processor cores and L2 cache dies on top of the DRAM, delivering up to a 3× performance gain for memory‑bound GEMM workloads.

HBM6: Scheduled for 2032, with a data rate of 16 Gbps, bandwidth of 8 TB/s, and capacities of 96 to 120 GB. It adopts a "four‑tower" structure that integrates four stack units and connects to GPUs via a wide‑band silicon interposer; this is projected to more than double LLM inference throughput relative to HBM4. A silicon‑glass hybrid interposer addresses interposer size and cost limits, and an embedded L3 cache cuts HBM accesses by 73% while reducing overall energy consumption by 40%.

HBM7: Anticipated in 2035, targeting memory‑storage integration. It combines HBM with high‑bandwidth flash (HBF) in a heterogeneous storage network. The data rate rises to 24 Gbps and the I/O count to 8,192, giving a total bandwidth of 24 TB/s, while per-stack capacity climbs to 192 GB. The architecture stacks 128 layers of NAND flash as HBF and links it to HBM via a high‑bandwidth H2F link, forming a 17.6 TB hierarchical storage system.

HBM8: Projected for 2038, built around "full 3‑D integration" and "memory‑centric computing". It pushes the data rate to 32 Gbps, total bandwidth to 64 TB/s, and per-stack capacity to 240 GB, while the I/O count doubles to 16,384. GPUs are stacked directly on HBM, which cuts compute latency by 50%. Matrix‑operation paths move into the HBM module, boosting token‑generation rates by nearly 7×.
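The bandwidth figures above are consistent with the usual relation: peak bandwidth equals I/O count times per-pin data rate, divided by 8 bits per byte. The Python sketch below checks this for the generations whose I/O count is stated in the article. It assumes 1 TB/s = 1024 GB/s, the convention under which the quoted numbers come out as round values. The HBM6 I/O count is not given in the article, so it is only inferred here from the stated 8 TB/s at 16 Gbps. All names are illustrative.

```python
# (I/O count, per-pin data rate in Gbps) as quoted in the article.
ROADMAP = {
    "HBM4": (2048, 8),
    "HBM5": (4096, 8),
    "HBM7": (8192, 24),
    "HBM8": (16384, 32),
}


def peak_bandwidth_tbps(io_count: int, gbps_per_pin: float) -> float:
    """Peak bandwidth in TB/s, assuming 1 TB/s = 1024 GB/s."""
    gb_per_s = io_count * gbps_per_pin / 8  # bits -> bytes
    return gb_per_s / 1024


def implied_io_count(bandwidth_tbps: float, gbps_per_pin: float) -> int:
    """I/O count implied by a bandwidth and data rate (same assumption)."""
    return round(bandwidth_tbps * 1024 * 8 / gbps_per_pin)


for name, (io_count, rate) in ROADMAP.items():
    print(f"{name}: {peak_bandwidth_tbps(io_count, rate):.0f} TB/s")

# HBM6: inferred, not stated in the article.
print("HBM6 implied I/O count:", implied_io_count(8, 16))
```

Under these assumptions the sketch reproduces 2, 4, 24, and 64 TB/s for HBM4, HBM5, HBM7, and HBM8, and suggests that HBM6 keeps HBM5's 4,096 I/Os while doubling the per-pin rate.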

For the full original article, see Overview of Next Generation HBM Architectures.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
