Tag

NPU


JD Tech
Mar 19, 2025 · Artificial Intelligence

JD Retail's End‑to‑End AI Engine Compatible with GPU and Domestic NPU: Architecture, Optimization, and Real‑World Applications

This article details JD Retail's AI engine that seamlessly supports both GPU and domestic NPU hardware, describing its heterogeneous cluster architecture, unified training and inference APIs, performance optimizations, extensive model coverage, and multiple production use cases across e‑commerce, logistics, and intelligent assistance.

AI Engine · GPU · JD Retail
20 min read
JD Retail Technology
Mar 4, 2025 · Artificial Intelligence

JD Retail End-to-End AI Engine Compatible with GPU and Domestic NPU: Architecture, Optimization, and Applications

JD Retail’s Nine‑Number Algorithm Platform delivers an end‑to‑end AI engine that unifies GPU and domestic NPU resources across a thousand‑card cluster, offering zero‑cost model migration, optimized training and inference pipelines, support for over 40 LLM and multimodal models, and proven business‑level performance that reduces dependence on overseas chips.

AI · GPU · Inference
19 min read
Bilibili Tech
Mar 4, 2025 · Artificial Intelligence

Engineering Practices and Optimizations for Text‑to‑Video Generation Models (OpenSora, CogVideoX) on Bilibili TTV Team

The Bilibili TTV team optimized training of the OpenSora and CogVideoX text‑to‑video models. They redesigned data storage around Alluxio, parallelized VAE encoding, applied dynamic sequence parallelism and DeepSpeed‑Ulysses attention, and adapted GPU code to run on NPUs. Profiling‑driven kernel fusion, FlashAttention, and expandable memory further raised training efficiency and frame throughput. The article also outlines plans to scale with pipeline parallelism and ZeRO‑3.

Diffusion Transformer · FlashAttention · NPU
26 min read
JD Tech Talk
Mar 3, 2025 · Artificial Intelligence

AI Engine Technology Based on Domestic Chips for JD Retail

This article describes JD Retail's AI engine built on domestic NPU chips, covering challenges, heterogeneous GPU‑NPU scheduling, high‑performance training and inference engines, extensive model support, real‑world deployment cases, and future plans for large‑scale chip clusters and ecosystem development.

AI · GPU · Inference
20 min read
Architects' Tech Alliance
Nov 26, 2024 · Artificial Intelligence

Get Ready for a Shakeout in Edge NPUs

The article examines the rapid growth and increasing complexity of edge AI NPUs, discussing challenges in software and hardware acceleration, supply‑chain constraints, and the need for integrated engine solutions to sustain performance and power efficiency.

NPU · edge AI · hardware acceleration
9 min read
IT Services Circle
Feb 1, 2024 · Fundamentals

The Rise of NPU and Integrated Memory in AI PCs and Intel's Lunar Lake Architecture

The article examines how CPUs, GPUs, and memory have long formed the core of PC hardware, discusses the emerging role of NPUs for AI processing, and describes Intel's Lunar Lake strategy of integrating memory with the processor to deliver faster, lower‑latency performance in upcoming AI‑focused PCs.

AI PC · CPU · GPU
5 min read
Architects' Tech Alliance
Sep 4, 2023 · Artificial Intelligence

Overview of AI Chip Types, Architectures, and Market Trends

The article explains the various AI‑capable chips such as CPUs, GPUs, FPGAs, NPUs, and TPUs, compares their performance and efficiency, describes heterogeneous CPU+xPU solutions, and provides market share data while highlighting the growing adoption of specialized AI accelerators.

AI acceleration · AI chips · CPU
7 min read
Alimama Tech
Dec 22, 2021 · Artificial Intelligence

Performance Optimization of Advertising Deep Learning Systems: Algorithm, System, and Hardware Co‑Design

The paper presents a holistic algorithm‑system‑hardware co‑design for advertising deep‑learning inference, combining model pruning, approximate computing, kernel fusion, scheduling and PCIe transfer optimizations with GPU and NPU upgrades, achieving up to five‑fold speed‑up and significantly higher latency‑bounded QPS for large‑scale ad services.

Algorithmic Optimization · GPU · NPU
24 min read
Tencent Music Tech Team
Apr 30, 2020 · Mobile Development

Edge Deep Learning Inference on Mobile Devices: Challenges, Hardware Diversity, and Optimization Strategies

Edge deep learning inference on mobile devices faces hardware and software fragmentation: CPUs, GPUs, DSPs, and NPUs vary widely across devices, and their programmability is limited. Optimization techniques such as model selection, quantization, and architecture‑specific tuning enable real‑time performance. Most inference still runs on CPUs, GPUs offer 5–10× speedups, and co‑processor support varies across Android and iOS.

DSP · GPU programming · NPU
17 min read
Architects' Tech Alliance
Mar 28, 2020 · Artificial Intelligence

Heterogeneous Computing: Overview of CPU, GPU, FPGA, ASIC, and NPU

This article explains heterogeneous computing and compares major processing units—CPU, GPU, FPGA, ASIC, and NPU—highlighting their architectures, strengths, and typical use cases, especially in deep‑learning and AI workloads.

ASIC · CPU · FPGA
10 min read