Tagged articles
22 articles
Lao Guo's Learning Space
May 5, 2026 · Artificial Intelligence

AMD Ryzen AI MAX+ PRO 495 Review: The Most Powerful Mobile APU Yet

The AMD Ryzen AI MAX+ PRO 495 (code‑named Gorgon Halo) raises memory bandwidth, supports up to 256 GB of unified memory, and delivers 55‑60 TOPS of NPU performance. It gains roughly 4% in multi‑core and 3% in single‑core performance over its predecessor and targets demanding AI workloads on thin‑and‑light laptops.

AMD · Mobile APU · NPU
0 likes · 9 min read
Architects' Tech Alliance
Jan 16, 2026 · Artificial Intelligence

Why Do GPUs and NPUs Produce Different FP16 Results? Uncovering AI Chip Precision Secrets

Engineers training large AI models often see noticeable FP16/BF16 result differences between GPUs and NPUs, and even between generations of the same chip, due to floating‑point representation limits, hardware design choices, software library implementations, compiler optimizations, and parallel execution nondeterminism.

AI · GPU · NPU
0 likes · 10 min read
Architects' Tech Alliance
Aug 23, 2025 · Artificial Intelligence

How Huawei’s Ascend Architecture Redefines AI Acceleration

This article examines Huawei's Ascend AI accelerator architecture, detailing its heterogeneous compute units, memory hierarchy, task scheduling, programming model, and chip variants, while also discussing future challenges and the ecosystem needed for widespread AI deployment.

AI accelerator · AI hardware · DaVinci architecture
0 likes · 14 min read
Architects' Tech Alliance
Jul 3, 2025 · Artificial Intelligence

What Makes ASIC Chips the Powerhouse Behind AI? A Deep Dive

This article explains what ASIC chips are, how they differ from CPUs, GPUs and FPGAs, classifies them by customization level and function, outlines their performance and cost advantages, discusses their drawbacks, and reviews current products and market trends driving AI hardware adoption.

AI hardware · ASIC · Chip Design
0 likes · 11 min read
JD Tech
Mar 19, 2025 · Artificial Intelligence

JD Retail's End‑to‑End AI Engine Compatible with GPU and Domestic NPU: Architecture, Optimization, and Real‑World Applications

This article details JD Retail's AI engine that seamlessly supports both GPU and domestic NPU hardware, describing its heterogeneous cluster architecture, unified training and inference APIs, performance optimizations, extensive model coverage, and multiple production use cases across e‑commerce, logistics, and intelligent assistance.

AI Engine · GPU · JD Retail
0 likes · 20 min read
JD Retail Technology
Mar 4, 2025 · Artificial Intelligence

JD Retail End-to-End AI Engine Compatible with GPU and Domestic NPU: Architecture, Optimization, and Applications

JD Retail’s Nine‑Number Algorithm Platform delivers an end‑to‑end AI engine that unifies GPU and domestic NPU resources across a thousand‑card cluster, offering zero‑cost model migration, optimized training and inference pipelines, support for over 40 LLM and multimodal models, and proven business‑level performance that reduces dependence on overseas chips.

AI · Distributed Training · GPU
0 likes · 19 min read
Bilibili Tech
Mar 4, 2025 · Artificial Intelligence

Engineering Practices and Optimizations for Text‑to‑Video Generation Models (OpenSora, CogVideoX) on Bilibili TTV Team

The Bilibili TTV team optimized the OpenSora and CogVideoX text‑to‑video models by redesigning data storage with Alluxio, parallelizing VAE encoding, applying dynamic sequence parallelism and DeepSpeed‑Ulysses attention, and adapting GPU code for NPU execution. Profiling‑driven kernel fusion, FlashAttention, and expandable memory further increased training efficiency and frame throughput. The article also outlines plans for pipeline parallelism and ZeRO‑3 scaling.

Diffusion Transformer · FlashAttention · Model Parallelism
0 likes · 26 min read
JD Tech Talk
Mar 3, 2025 · Artificial Intelligence

AI Engine Technology Based on Domestic Chips for JD Retail

This article describes JD Retail's AI engine built on domestic NPU chips, covering challenges, heterogeneous GPU‑NPU scheduling, high‑performance training and inference engines, extensive model support, real‑world deployment cases, and future plans for large‑scale chip clusters and ecosystem development.

AI · Distributed Training · GPU
0 likes · 20 min read
JD Cloud Developers
Mar 3, 2025 · Artificial Intelligence

How JD.com Leverages Domestic NPU Chips to Power Large‑Scale AI Models

This article details JD.com's challenges and solutions for deploying domestic NPU chips across heterogeneous GPU‑NPU clusters, covering architecture, scheduling, high‑performance training and inference engines, real‑world case studies, and future plans to scale AI workloads securely and efficiently.

AI · Domestic Chips · Inference
0 likes · 19 min read
Infra Learning Club
Feb 6, 2025 · Artificial Intelligence

Getting Started with Huawei Ascend AI Accelerators

This guide walks through the fundamentals of Huawei Ascend NPU hardware, the CANN software stack, driver and firmware installation, Kubernetes integration via Docker runtime and device plugin, and a complete ResNet‑50 inference demo on Ascend 310P.

AI inference · CANN · Docker Runtime
0 likes · 12 min read
Architects' Tech Alliance
Nov 26, 2024 · Artificial Intelligence

Get Ready for a Shakeout in Edge NPUs

The article examines the rapid growth and increasing complexity of edge AI NPUs, discussing the challenges of software and hardware acceleration, supply‑chain constraints, and the need for integrated engine solutions to sustain performance and power efficiency.

NPU · Supply Chain · edge AI
0 likes · 9 min read
Architects' Tech Alliance
Oct 19, 2024 · Industry Insights

What Is an NPU and Why It’s Shaping the Future of AI PCs

The article explains what Neural Processing Units (NPUs) are, how they differ from CPUs and GPUs, their parallel architecture, the workloads they accelerate, their role in edge AI and AI‑enabled PCs, and why industry analysts expect NPU‑enabled devices to dominate the market by 2026.

AI PC · AI accelerator · Edge Computing
0 likes · 8 min read
Architects' Tech Alliance
Jul 3, 2024 · Industry Insights

Why ARM Is Poised to Overtake x86 in the AI PC Era

The report analyzes the accelerating shift from x86 to ARM in AI‑enabled devices, covering architectural differences, market share dynamics, Apple’s successful ARM transition, Microsoft’s ARM ecosystem, Intel’s heterogeneous AI processors, rising memory demands, and future industry forecasts for 2024‑2027.

AI PC · ARM · NPU
0 likes · 17 min read
21CTO
May 29, 2024 · Artificial Intelligence

How AI PCs Are Redefining the Desktop: Inside Microsoft’s Copilot+ Vision

The article covers Microsoft’s vision of AI PCs, centered on the Copilot+ concept, and details how integrated NPU hardware, local large language models, and the Windows Copilot Runtime enable on‑device AI inference, reducing data‑center load and giving developers a unified platform for building next‑generation AI applications.

AI PC · Copilot+ · Edge Computing
0 likes · 11 min read
Architects' Tech Alliance
Sep 4, 2023 · Artificial Intelligence

Overview of AI Chip Types, Architectures, and Market Trends

The article explains the main types of AI‑capable chips (CPUs, GPUs, FPGAs, NPUs, and TPUs), compares their performance and efficiency, describes heterogeneous CPU+xPU solutions, and provides market share data while highlighting the growing adoption of specialized AI accelerators.

AI acceleration · AI chips · CPU
0 likes · 7 min read
Alimama Tech
Dec 22, 2021 · Artificial Intelligence

Performance Optimization of Advertising Deep Learning Systems: Algorithm, System, and Hardware Co‑Design

The paper presents a holistic algorithm‑system‑hardware co‑design for advertising deep‑learning inference, combining model pruning, approximate computing, kernel fusion, scheduling and PCIe transfer optimizations with GPU and NPU upgrades, achieving up to five‑fold speed‑up and significantly higher latency‑bounded QPS for large‑scale ad services.

Algorithmic Optimization · GPU · NPU
0 likes · 24 min read
Tencent Music Tech Team
Apr 30, 2020 · Mobile Development

Edge Deep Learning Inference on Mobile Devices: Challenges, Hardware Diversity, and Optimization Strategies

Edge deep learning inference on mobile devices faces hardware and software fragmentation across diverse CPUs, GPUs, DSPs, and NPUs, as well as limited programmability. Optimization techniques such as model selection, quantization, and architecture‑specific tuning enable real‑time performance. Most inference runs on CPUs, GPUs offer 5–10× speedups, and co‑processor support varies across Android and iOS.

DSP · GPU programming · NPU
0 likes · 17 min read