Tagged articles
49 articles
Machine Heart
May 18, 2026 · Artificial Intelligence

Consumer‑grade Embodied AI Robot Achieves 1000× Compute, Beats Nvidia Jetson Thor for 1/10 Cost

VeilBlue's new consumer‑grade robot delivers a thousand‑fold compute boost over previous models, matching Nvidia's Jetson AGX Thor at one‑tenth the cost, thanks to a six‑chip heterogeneous edge cluster, superhuman perception, and a safety‑first design validated in real homes.

AI hardware, Embodied AI, Robotics
0 likes · 14 min read
Network Intelligence Research Center (NIRC)
Dec 31, 2025 · Artificial Intelligence

Why AI Inference Is Slow and How Cutting‑Edge Tech Boosts It in Industrial Settings

The article analyzes the severe inference bottlenecks of large language models, CNNs, and recommendation systems and presents a suite of research‑driven accelerations—including token‑level pipeline parallelism (HPipe), KV‑cache clustering (ClusterAttn), quantization (QoKV), heterogeneous edge frameworks (DeepZoning, PICO), delay‑aware edge‑cloud scheduling (DECC), and operator choreography (RACE)—validated on real‑world industrial workloads.

AI inference, Recommendation Systems, edge AI
0 likes · 16 min read
AI2ML AI to Machine Learning
Dec 16, 2025 · Industry Insights

Why Computer Science Majors Must Embrace a Massive Paradigm Shift

The article argues that traditional storage‑centric computer science curricula are becoming obsolete as AI‑driven, compute‑centric paradigms dominate hardware, data‑center operations, and software ecosystems, and urges universities and students to quickly adopt a new teaching focus and new skills.

AI hardware, CUDA, associative memory
0 likes · 10 min read
JD Tech Talk
Sep 11, 2025 · Artificial Intelligence

How to Seamlessly Migrate AI Workloads from Nvidia GPUs to Domestic Accelerators

This article explains why migrating AI applications from Nvidia GPUs to domestic graphics cards is urgent, outlines the technical challenges, and introduces JoyScale’s zero‑perception migration stack that enables end‑to‑end hardware, software, and model adaptation for reliable, high‑performance AI deployment.

AI migration, JoyScale, Model Optimization
0 likes · 11 min read
JD Cloud Developers
Sep 11, 2025 · Artificial Intelligence

How to Seamlessly Migrate AI Workloads from Nvidia GPUs to Domestic Accelerators

This article explains why migrating AI applications from Nvidia GPUs to domestic Chinese accelerators is urgent, outlines the technical challenges, and presents JD Cloud's JoyScale zero‑perception migration stack with hardware, software, model, and inference optimizations for real‑world scenarios.

AI migration, JoyScale, Model Quantization
0 likes · 10 min read
Architects' Tech Alliance
May 26, 2025 · Artificial Intelligence

NVLink Fusion: NVIDIA’s High‑Bandwidth Interconnect for Heterogeneous AI Computing

NVLink Fusion, unveiled at Computex 2025, extends NVIDIA’s NVLink technology to enable high‑bandwidth, low‑latency connections between CPUs and GPUs or third‑party accelerators, offering up to 900 GB/s bandwidth, flexible heterogeneous configurations, ecosystem expansion, performance gains for AI training and inference, and potential cost reductions.

AI, CPU, Data center
0 likes · 12 min read
Architects' Tech Alliance
Jan 22, 2025 · Artificial Intelligence

Inside Huawei Ascend: How Its Heterogeneous Architecture Powers Modern AI Workloads

This article provides an in‑depth technical analysis of Huawei’s Ascend AI accelerator architecture, detailing its heterogeneous compute units, memory hierarchy, task scheduling, programming model, compiler optimizations, and the capabilities of the Ascend 310 and 910 chips, while also discussing future challenges and market competition.

AI accelerator, AI hardware, HBM
0 likes · 14 min read
Baidu Intelligent Cloud Tech Hub
Sep 29, 2024 · Artificial Intelligence

How Baidu’s Baige 4.0 Redefines AI Infrastructure for Large‑Model Training

The article details Baidu Baige 4.0’s four‑layer AI infrastructure—hardware, cluster components, training‑inference acceleration, and platform tools—highlighting its heterogeneous computing, high‑performance networking, fault‑tolerant communication library, and optimizations that boost large‑model training and inference efficiency.

AI Infrastructure, High‑Performance Networking, heterogeneous computing
0 likes · 17 min read
Architects' Tech Alliance
Jul 25, 2024 · Artificial Intelligence

NVIDIA H20 AI Chip Launch and the Rapid Growth of China's AI Chip Market

The article reviews NVIDIA's newly released H20 AI accelerator for China, compares its performance and pricing with domestic chips, outlines the expanding Chinese AI chip ecosystem—including Huawei, Cambricon, Hygon, Alibaba, ByteDance, and Baidu—while highlighting market size growth, multi‑chip heterogeneity strategies, and the strong demand forecast through 2024.

AI chips, AI compute, China
0 likes · 8 min read
Architects' Tech Alliance
May 9, 2024 · Artificial Intelligence

AI Servers: Market Opportunities, Architecture, and Future Demand Driven by Generative AI

The article examines how the surge of generative AI (AIGC) is fueling rapid growth in AI server demand, detailing the emerging AIGC ecosystem, server hardware composition, model scaling, heterogeneous computing, training vs. inference workloads, market size forecasts, and the competitive landscape of AI server manufacturers.

AI Infrastructure, AI servers, GPU
0 likes · 15 min read
Baidu Intelligent Cloud Tech Hub
Apr 24, 2024 · Artificial Intelligence

How to Build and Accelerate Multi‑Chip AI Clusters for Large‑Model Training

With AI training demands outgrowing single‑chip GPU clusters, this article explains how to construct and speed up heterogeneous AI clusters—combining GPUs, Kunlun, and Ascend chips—by addressing interconnect, distributed parallel strategies, and specialized acceleration suites to achieve high MFU and efficient large‑model training.

AI clustering, Distributed Training, GPU Acceleration
0 likes · 15 min read
DataFunSummit
Apr 5, 2024 · Big Data

HuoLala Big Data Infrastructure: Challenges, Practices, and Future Outlook

Senior big data engineer Zhu Yaogai from HuoLala shares the team’s three‑year journey, detailing background challenges, the construction of a multi‑layer big‑data infrastructure, solutions for cost efficiency, operational automation, heterogeneous computing, and future plans, illustrating how high cost‑effectiveness, operational efficiency, and analytical performance drive their evolution.

Automation, Cloud Native, cost efficiency
0 likes · 11 min read
JD Retail Technology
Feb 1, 2024 · Artificial Intelligence

Evolution and Optimization of JD Retail Advertising Online Model System: From Deep Learning to Distributed Graph Computing and Power Collaboration

The article details JD Retail Advertising's three‑stage evolution of its online model system—the deep‑learning era, the large‑model era, and the compute‑power collaboration era—highlighting heterogeneous computing optimizations, platform and system capabilities, distributed graph computing, online learning, and dynamic compute‑power allocation that dramatically improve algorithm iteration speed and model performance.

AI, Advertising, distributed graph
0 likes · 13 min read
Architects' Tech Alliance
Sep 4, 2023 · Artificial Intelligence

Overview of AI Chip Types, Architectures, and Market Trends

The article explains the various AI‑capable chips such as CPUs, GPUs, FPGAs, NPUs, and TPUs, compares their performance and efficiency, describes heterogeneous CPU+xPU solutions, and provides market share data while highlighting the growing adoption of specialized AI accelerators.

AI acceleration, AI chips, CPU
0 likes · 7 min read
Architects' Tech Alliance
Jul 29, 2023 · Artificial Intelligence

AI Server Market Overview and Technical Architecture

The article provides a comprehensive analysis of the AI server market, detailing server hardware components, cost distribution, logical architecture, firmware, rapid market growth, competitive landscape, AI-driven heterogeneous computing, and future industry trends, while highlighting key vendors and deployment configurations.

AI servers, Cloud providers, GPU
0 likes · 10 min read
Tencent Cloud Developer
Jul 6, 2023 · Cloud Computing

Hybrid vCPU: Tencent Cloud's Exploration of Virtualizing Heterogeneous CPU Architecture

Tencent Cloud’s Hybrid vCPU research, presented at KVM Forum 2023, outlines a three‑stage roadmap from homogeneous cores to mixed x86, ARM, and RISC‑V CPUs, detailing how virtualizing heterogeneous topologies, frequencies, caches, and PMU features can boost VM performance, security, live‑migration flexibility, and data‑center utilization.

Hybrid CPU, KVM, Live Migration
0 likes · 25 min read
DataFunTalk
May 2, 2023 · Artificial Intelligence

Automatic Parallelism in PaddlePaddle: Architecture, Implementation, and Application Practice

This article presents a comprehensive overview of PaddlePaddle's automatic parallel design for heterogeneous scenarios, covering background motivation, architectural principles, key implementation details, practical usage interfaces, and future outlook, while illustrating concepts with detailed diagrams and examples.

AI frameworks, Distributed Training, PaddlePaddle
0 likes · 19 min read
Alibaba Cloud Infrastructure
Mar 22, 2023 · Artificial Intelligence

CUTLASS Extreme Performance Optimization and Its Application in Alibaba's Recommendation System

At the GTC conference, the talk presents Alibaba Cloud’s heterogeneous computing platform and introduces the Open Deep Learning API (ODLA), then details how CUTLASS‑based operator fusion dramatically accelerates attention and MLP layers in large‑scale recommendation models, achieving multi‑fold performance gains in production.

CUTLASS, Deep Learning, GPU computing
0 likes · 5 min read
21CTO
Oct 1, 2022 · Cloud Computing

Intel’s Open oneAPI and Cloud: Powering Heterogeneous Computing

Intel’s recent announcements, which featured Linus Torvalds, the open‑source oneAPI ecosystem, SYCL‑based cross‑architecture tools, and the new Intel Developer Cloud, illustrate how the company is driving a unified, hardware‑agnostic platform for AI, accelerated computing, and secure cloud workloads.

AI, Developer Cloud, Intel
0 likes · 10 min read
Tencent Cloud Developer
Sep 30, 2022 · Cloud Computing

Understanding GPU Computing and Cloud-Based GPU Solutions

The article explains why massively parallel pixel calculations demand GPUs, and how Tencent Cloud’s elastic, virtualized GPU services—including vGPU, qGPU, TACO abstraction, and spot instances—address the high cost and inflexibility of dedicated GPUs, delivering up to 16 EFLOPS for AI, scientific, graphics, and video workloads.

GPU computing, Tencent Cloud, cloud GPU
0 likes · 5 min read
Baidu Geek Talk
Jul 18, 2022 · Artificial Intelligence

GPU Container Virtualization for AI Heterogeneous Computing: Architecture and Best Practices

The article surveys GPU container virtualization for AI heterogeneous computing, detailing utilization challenges, historical architectures, various virtualization methods, Baidu's dual-engine user- and kernel-space design with isolation and scheduling features, performance benefits, best‑practice scenarios, and deployment guidance, concluding with a technical Q&A.

AI computing, GPU virtualization, MPS
0 likes · 30 min read
Architects' Tech Alliance
Jul 5, 2022 · Fundamentals

Understanding High‑Performance Computing (HPC): Principles, Architecture, and Performance Metrics

This article explains the fundamentals of high‑performance computing, covering serial and parallel processing, heterogeneous CPU‑GPU architectures, FLOPS measurement levels, key terminology, and why HPC is essential for scientific and engineering simulations, while also noting market reports and resource links.

FLOPS, HPC, High‑performance computing
0 likes · 6 min read
Tencent Cloud Developer
Jun 29, 2022 · Fundamentals

C++ Asynchronous Programming: Understanding libunifex and Sender/Receiver Model

This article thoroughly explains libunifex’s sender/receiver model for C++ asynchronous programming, covering its design goals, module structure, pipeline composition, key functions like schedule, then, sync_wait, and the connect/start mechanisms, while demonstrating practical examples and integration with C++20 coroutines and cancellation support.

C++, Coroutines, Pipeline
0 likes · 16 min read
Architects' Tech Alliance
Jun 24, 2022 · Fundamentals

Post‑Moore Era CPU Trends: From General‑Purpose to Specialized, Heterogeneous Integration, and Edge Computing

The article analyzes how the slowdown of Moore's Law drives a shift from general‑purpose CPUs to specialized XPU, FPGA, DSA and ASIC designs, highlights heterogeneous chiplet integration, edge‑server growth, and the emerging importance of software, algorithms and architecture in boosting performance and efficiency.

AIoT, CPU, Chiplet
0 likes · 15 min read
Baidu App Technology
Jan 24, 2022 · Mobile Development

Introduction to OpenCL Programming for Mobile GPU Computing

As mobile CPUs plateau, developers increasingly use OpenCL to harness Android GPUs such as Qualcomm Adreno and Arm Mali for heterogeneous computing, leveraging its platform, execution, and memory models to write portable kernels—illustrated by a simple array‑addition example that demonstrates device initialization, kernel creation, buffer management, and parallel execution.

Android, C programming, GPU computing
0 likes · 8 min read
Architects' Tech Alliance
Jan 16, 2022 · Industry Insights

How 5G‑Driven Edge Computing Is Redefining Server Requirements

The report analyzes how the rapid growth of cloud computing previously drove server demand, and how the emergence of 5G‑enabled edge computing is now reshaping server architectures, hardware needs, deployment models, and operational challenges, forecasting a significant increase in server volume for edge scenarios.

5G, Edge Computing, cloud computing
0 likes · 16 min read
DataFunTalk
Dec 1, 2021 · Artificial Intelligence

AI DSA: Architecture Features, Industry Trends, and Software Stack Challenges

The article summarizes Dr. Tang Shan's presentation on AI domain‑specific architectures, covering their background, the explosion of diverse AI hardware designs, and the significant software‑stack challenges that arise from fragmented tools and the need for full‑stack solutions.

AI, DSA, Hardware
0 likes · 14 min read
Architects' Tech Alliance
Mar 27, 2021 · Cloud Computing

Future Computing Trends: IDC’s Ten Characteristics of Next‑Generation Compute Infrastructure

The article analyzes how compute has evolved from performance‑driven Moore’s‑law scaling to a comprehensive innovation era driven by cloud democratization, heterogeneous accelerators, edge and memory‑centric architectures, outlining IDC’s ten future‑compute characteristics such as flexible deployment, AI‑enabled operations, multi‑cloud, security and next‑gen interconnects.

Edge Computing, heterogeneous computing
0 likes · 17 min read
Baidu Geek Talk
Jan 27, 2021 · Cloud Computing

Elastic Nearline Computing Architecture for Leveraging Idle Resources in Baidu's PaaS Platform

Baidu’s elastic nearline computing architecture inserts an asynchronous, resource‑adaptive layer between online and offline processing, dynamically harvesting idle CPU, GPU and Kunlun XPU capacity to pre‑compute complex recommendation and search policies, enabling peak‑shifting, valley‑filling, higher timeliness and significant business growth at low cost.

PaaS resource scheduling, Peak Shaving, cloud computing
0 likes · 18 min read
DataFunTalk
May 8, 2020 · Artificial Intelligence

Distributed Machine Learning Framework GDBT for High‑Dimensional Real‑Time Recommendation Systems

The article explains how 4Paradigm's distributed machine learning framework GDBT tackles the massive data, high‑dimensional features, and real‑time requirements of modern recommendation systems by leveraging heterogeneous computing, parameter servers, RDMA networking, and optimized workloads.

GDBT, Parameter Server, RDMA
0 likes · 18 min read
Alibaba Cloud Developer
Apr 28, 2020 · Artificial Intelligence

How Alibaba Cloud Powers AI with Cutting‑Edge Heterogeneous Compute

This article explains how Alibaba Cloud builds a high‑performance AI infrastructure by combining advanced hardware such as Shenlong servers, GPUs, FPGAs, NPUs, and custom interconnects like RDMA, together with virtualization, FPGA‑as‑a‑Service, AIACC, and resource‑pooling technologies to deliver scalable, cost‑effective AI services.

AI hardware, Alibaba Cloud, FPGA as a Service
0 likes · 20 min read
Architects' Tech Alliance
Nov 23, 2019 · Operations

Green Computing in Data Centers: Definitions, Research Scope, and Energy‑Saving Technologies

The article examines the rapid growth of information system energy consumption, defines green computing, outlines its research focus—especially on data centers, cloud computing and servers—and analyzes a range of energy‑saving technologies such as DVFS, heterogeneous computing, liquid cooling, rack‑level servers, advanced power supply, backplane cooling and fluorocarbon pump AC systems.

DVFS, energy efficiency, green computing
0 likes · 13 min read
Tencent Cloud Developer
Oct 16, 2019 · Industry Insights

How 5G Is Driving New Multimedia Standards and the Rise of VVC & 8K

The article explains why 5G forces multimedia standardization, outlines the challenges of emerging formats like 8K, details the VVC (H.266) compression breakthrough, and describes Tencent's role in shaping future heterogeneous‑computing‑based multimedia architectures for cloud gaming and beyond.

5G, 8K, Multimedia Standards
0 likes · 13 min read
Architects' Tech Alliance
Sep 20, 2019 · Industry Insights

Why Heterogeneous Parallel Computing Is the Future of High‑Performance Computing

The article explains how heterogeneous parallel computing—distributing tasks across CPUs, GPUs, FPGAs and other accelerators—has become essential as Moore’s law plateaus, detailing its principles, hardware and software perspectives, classification of architectures, processing stages, user‑guided versus compiler‑guided methods, and its relevance to AI, cloud and industry workloads.

CPU, FPGA, GPU
0 likes · 15 min read
Architects' Tech Alliance
Apr 23, 2018 · Fundamentals

Why Heterogeneous Parallel Computing Is the Future of High‑Performance Computing

The article explains how heterogeneous parallel computing—leveraging CPUs, GPUs, FPGAs and other specialized units—addresses the performance limits of traditional serial programming by distributing tasks across diverse hardware, detailing its concepts, architectures, development models, and relevance to AI and cloud workloads.

CPU, Deep Learning, FPGA
0 likes · 9 min read
Alibaba Cloud Infrastructure
Dec 31, 2017 · Artificial Intelligence

Maximizing Machine Learning Performance with Heterogeneous Computing Resources

At the 2017 International Workshop on Mathematical Issues in Information Sciences, Alibaba researcher Zhang Weifeng presented a talk on leveraging heterogeneous computing—re‑architected processors, memory‑wall mitigation, and integrated software‑hardware optimization—to dramatically improve machine‑learning performance, highlighting the growing importance of compute resources alongside algorithmic advances.

AI, Accelerators, Performance Optimization
0 likes · 5 min read
Tencent Cloud Developer
Nov 17, 2017 · Artificial Intelligence

Heterogeneous Acceleration for Deep Learning: From CPU Limitations to AI Processors

The article explains why general‑purpose CPUs can no longer meet deep‑learning demands due to intrinsic scaling limits and memory‑bandwidth bottlenecks, and surveys how heterogeneous accelerators—GPUs, FPGAs, ASICs and emerging AI processors with high‑bandwidth memory—provide specialized, high‑parallelism, power‑efficient solutions for both cloud and edge workloads.

AI Processors, ASIC, CPU
0 likes · 11 min read
Tencent Architect
Nov 9, 2017 · Artificial Intelligence

Why General‑Purpose CPUs Are Inefficient for Deep Learning: Heterogeneous Computing and AI Processor Design

The article analyzes the limitations of general‑purpose CPUs for deep‑learning workloads, explains how semiconductor scaling and memory‑bandwidth constraints drive the shift toward specialized heterogeneous processors such as GPUs, FPGAs, and ASICs, and discusses the design trade‑offs of embedded versus cloud AI accelerators.

AI, ASIC, CPU
0 likes · 13 min read