Tagged articles

heterogeneous architecture

3 articles

Oct 29, 2025 · Artificial Intelligence

Why China’s AI Chip Industry Is Poised for a Breakthrough – Trends, Challenges, and Future Outlook

This comprehensive analysis examines the strategic importance, technical challenges, innovation pathways, and market landscape of domestic AI chips in China, highlighting key players, regional clusters, core applications such as intelligent computing, autonomous driving, and robotics, and projecting future industry bottlenecks and opportunities.

AI chips · China semiconductor · FP8

18 min read

JD Cloud Developers

Mar 14, 2024 · Artificial Intelligence

How JD Retail Boosted Online Recommendation Inference with Distributed Heterogeneous Computing

This article details how JD Retail's ad‑tech team optimized compute for high‑concurrency, low‑latency online recommendation: a distributed graph‑based heterogeneous framework, GPU‑focused inference engine enhancements, TensorBatch request aggregation, bucketed pre‑compilation in the deep‑learning compiler, asynchronous compilation, and multi‑stream GPU processing.

Deep Learning Compiler · Distributed Computing · GPU inference

14 min read

JD Retail Technology

Jan 25, 2024 · Artificial Intelligence

Optimizing High‑Concurrency Online Inference for Recommendation Models with Distributed Heterogeneous Computing and GPU Acceleration

This article describes how JD Retail's advertising technology team met the high compute demands of modern recommendation models by designing a distributed graph‑partitioned heterogeneous computing framework, introducing TensorBatch request aggregation, applying deep‑learning compiler bucketing and asynchronous compilation, and implementing a multi‑stream GPU architecture to raise online inference throughput and reduce latency.

Deep Learning Compiler · Distributed Computing · GPU Acceleration

13 min read
