Tagged articles
7 articles
Machine Heart
May 10, 2026 · Artificial Intelligence

Why SRAM Is Key to Overcoming GPU Limits in Inference as Demand Soars

As demand for large-model inference outpaces demand for training, the decode stage hits a memory wall that GPUs cannot cross efficiently. SRAM's on-chip bandwidth and low-energy access offer a path forward, though capacity and process limits remain challenges.

AI hardware · Compute Architecture · GPU
7 min read
Architect's Must-Have
Apr 19, 2026 · Artificial Intelligence

TurboQuant: Google’s 6× KV Compression & 8× Speedup Break the AI Memory Wall

With LLM context windows growing to millions of tokens, the KV-cache memory wall threatens scalable inference. Google's TurboQuant addresses this by compressing KV data up to sixfold without precision loss and accelerating attention up to eightfold, using PolarQuant and 1-bit QJL techniques, with implications for hardware costs and edge AI.

AI inference · KV compression · Large Language Models
25 min read
Architects' Tech Alliance
Jun 25, 2024 · Industry Insights

Why Storage Is the New Engine Driving AI Server Growth in 2024

The article analyzes how AI servers are shifting from pure compute power to storage‑centric designs, detailing the memory‑wall challenge, the rise of HBM and CXL technologies, vendor market shares, upcoming product roadmaps, and the broader supply‑chain opportunities shaping the AI hardware ecosystem.

AI servers · CXL · HBM
9 min read
Architects' Tech Alliance
May 1, 2024 · Industry Insights

How CXL Can Break the AI Memory Wall and Boost Data‑Center Performance

The rapid growth of AI models is widening the gap between compute power and memory bandwidth, but the emerging Compute Express Link (CXL) interconnect offers lower latency, memory sharing, and flexible device topologies that can alleviate the memory‑wall bottleneck and reshape future data‑center architectures.

AI compute · CXL · High-speed interconnect
10 min read
Architects' Tech Alliance
Jan 4, 2020 · Artificial Intelligence

In‑Memory Computing: Overcoming the Memory Wall for AI Chips

The article explains how the memory‑wall limitation of traditional von Neumann architectures hampers AI chip performance, describes two in‑memory computing approaches—circuit‑level modifications and new memory devices—highlights recent conference trends, and showcases a Chinese startup’s 8‑bit low‑power in‑memory AI chip that could enable ubiquitous AI on edge devices.

AI chips · Memory Wall · in-memory computing
12 min read
dbaplus Community
Dec 6, 2017 · Databases

How Multi‑Core CPUs and NVM Are Redefining Database Design

This article summarizes a 2017 Gdevops talk that examines the evolution of many‑core processors and non‑volatile memory, explains how memory‑wall effects impact modern DBMS performance, and presents architectural and algorithmic adaptations—including cache‑friendly structures, lock management, and write‑behind logging—to exploit new hardware.

Memory Wall · Non-volatile Memory · databases
18 min read