How Alibaba’s DAMO Academy Is Redefining AI with the First 3D‑Stacked Compute‑Memory Chip
On December 3, Alibaba’s DAMO Academy announced its first AI chip that integrates memory and compute using hybrid‑bond 3D stacking, promising more than ten‑fold performance gains and up to 300× better energy efficiency for AI workloads such as recommendation systems, and marking a shift away from traditional von Neumann designs.
On December 3, Alibaba’s DAMO Academy announced the successful development of a compute‑in‑memory AI chip, the world’s first to employ hybrid‑bond 3D stacking for true integration of memory and processing units.
With Moore’s law slowing, compute‑in‑memory is seen as a key way around the performance bottleneck of modern processors: merging storage and logic reduces data movement, which in turn raises parallelism and energy efficiency.
The chip’s memory subsystem uses heterogeneous‑integrated embedded DRAM (SeDRAM), which offers ultra‑high bandwidth and capacity. Its compute side is a custom streaming accelerator that handles recommendation‑system pipelines end to end, covering matching, coarse ranking, neural‑network inference, and fine ranking.
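The four pipeline stages named above can be pictured with a toy software version. This is a minimal sketch in plain Python, not the chip's design: every function name, candidate count, and scoring rule below is a hypothetical choice made for illustration, and the article does not describe how the accelerator implements any stage.

```python
# Illustrative only: a toy four-stage recommendation pipeline in plain Python.
# All names, sizes, and scoring rules are hypothetical, not taken from the chip.
import math
import random


def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def matching(user_vec, item_vecs, k):
    """Stage 1: retrieve the k items whose embeddings best match the user."""
    ranked = sorted(item_vecs, key=lambda i: dot(user_vec, item_vecs[i]), reverse=True)
    return ranked[:k]


def coarse_rank(user_vec, item_vecs, candidates, k):
    """Stage 2: prune candidates with a cheap score (cosine similarity here)."""
    def cosine(i):
        v = item_vecs[i]
        norm = math.sqrt(dot(v, v)) * math.sqrt(dot(user_vec, user_vec))
        return dot(user_vec, v) / norm if norm else 0.0
    return sorted(candidates, key=cosine, reverse=True)[:k]


def nn_inference(user_vec, item_vecs, candidates, weights):
    """Stage 3: score each survivor with a one-layer network (sigmoid output)."""
    scores = {}
    for i in candidates:
        features = user_vec + item_vecs[i]  # concatenate user and item features
        z = dot(weights, features)
        scores[i] = 1.0 / (1.0 + math.exp(-z))
    return scores


def fine_rank(scores, k):
    """Stage 4: order by model score and keep the final top k."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def recommend(user_vec, item_vecs, weights, k_match=50, k_coarse=10, k_final=3):
    """Run the four stages in order, narrowing the candidate set each time."""
    cands = matching(user_vec, item_vecs, k_match)
    cands = coarse_rank(user_vec, item_vecs, cands, k_coarse)
    scores = nn_inference(user_vec, item_vecs, cands, weights)
    return fine_rank(scores, k_final)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the toy data is reproducible
    dim = 8
    items = {i: [rng.uniform(-1, 1) for _ in range(dim)] for i in range(200)}
    user = [rng.uniform(-1, 1) for _ in range(dim)]
    w = [rng.uniform(-1, 1) for _ in range(2 * dim)]
    print(recommend(user, items, w))
```

Stages 1 and 2 scan many embedding vectors while doing little arithmetic on each, which is the kind of memory‑heavy access pattern that placing compute next to DRAM is meant to help.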
According to scientist Zheng Hongzhong, the architecture combines high compute performance, high bandwidth, and high energy efficiency, delivering more than a ten‑fold speedup and up to 300× better energy efficiency for AI workloads such as recommendation systems.
For the past 70 years computers have followed the von Neumann model, which requires data to shuttle between the CPU and memory. That pattern becomes power‑hungry in AI‑heavy scenarios, where memory bandwidth has lagged behind processor advances.
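A rough way to see why this data shuttling hurts is a roofline‑style estimate: attainable throughput is the smaller of peak compute and arithmetic intensity times memory bandwidth. The sketch below applies that standard model to an embedding sum‑pooling step. The peak compute and bandwidth figures are hypothetical placeholders, not specifications of any Alibaba chip or any other real processor.

```python
# Roofline-style estimate; every hardware number below is hypothetical.


def arithmetic_intensity(flops, bytes_moved):
    """Operations performed per byte moved between memory and processor."""
    return flops / bytes_moved


def attainable_flops(intensity, peak_flops, mem_bandwidth):
    """Roofline model: throughput is capped by compute or by memory traffic."""
    return min(peak_flops, intensity * mem_bandwidth)


def embedding_sum_pooling(num_vectors, dim, bytes_per_value=4):
    """Sum-pool embeddings: about one add per value read from memory."""
    flops = num_vectors * dim
    bytes_moved = num_vectors * dim * bytes_per_value
    return flops, bytes_moved


if __name__ == "__main__":
    PEAK = 10e12       # hypothetical peak compute: 10 TFLOP/s
    BANDWIDTH = 100e9  # hypothetical memory bandwidth: 100 GB/s
    flops, moved = embedding_sum_pooling(num_vectors=1000, dim=64)
    ai = arithmetic_intensity(flops, moved)
    rate = attainable_flops(ai, PEAK, BANDWIDTH)
    print(f"intensity = {ai} FLOP/byte")
    print(f"attainable = {rate:.3g} FLOP/s ({rate / PEAK:.2%} of peak)")
```

With 4‑byte values the intensity is 0.25 FLOP per byte, so under these assumed numbers the lookup reaches only about 0.25% of peak compute. The processor mostly waits on memory, which is the gap that moving compute closer to memory targets.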
The research was accepted at ISSCC 2022, a top chip‑design conference, and the new architecture is expected to find use in VR/AR, autonomous driving, astronomical data processing, remote‑sensing image analysis, and other high‑throughput domains.
Alibaba also showcased its YoC (Yun on Chip) cloud‑chip source code and a full‑stack IoT platform. YoC, based on a RISC‑V core with SIMD acceleration, was co‑developed with Hangzhou Zhongtian Micro‑Systems in 2015 and supports Bluetooth, Wi‑Fi, audio, multimedia, AI acceleration, mesh networking, and motor control.
In October 2021 the open‑source XuanTie core code was released on T‑Head Semiconductor’s GitHub repository.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
