Tagged articles

AI computing

12 articles

Oct 6, 2025 · Artificial Intelligence

How China’s New GPU Startup Moore Threads Is Accelerating the AI Race

Amid US export restrictions, China’s five‑year‑old GPU pioneer Moore Threads is racing to fill the high‑end GPU gap. The article details the role of GPUs in AI, the company’s ecosystem strategy, and the significance of its fast‑track IPO for the domestic semiconductor and AI compute landscape.

AI computing · China · GPU

10 min read

Architects' Tech Alliance

May 12, 2025 · Artificial Intelligence

Comparison of Fat-Tree, Dragonfly, and Torus Network Topologies for AI and High‑Performance Computing

The article reviews Fat‑Tree, Dragonfly, and Torus network topologies, analyzing their bandwidth, scalability, latency, routing algorithms, and cost trade‑offs for AI‑driven high‑performance computing clusters, and highlights each design's strengths and limitations in large‑scale deployments.

AI computing · Dragonfly · Fat-Tree

12 min read

Alibaba Cloud Infrastructure

Apr 18, 2025 · Artificial Intelligence

Alibaba Cloud Showcases Optical Interconnect Innovations at OFC 2025 50th Anniversary

At the OFC 2025 50th anniversary in San Francisco, Alibaba Cloud presented cutting‑edge optical interconnect research and solutions for AI computing and modern data‑center networks, highlighted by invited talks, breakthrough demos, and two data‑driven QoT estimation papers co‑authored with Hong Kong Polytechnic University.

AI computing · Cloud Networking · Data Center

6 min read

Big Data Technology Architecture

Mar 8, 2025 · Artificial Intelligence

Understanding General, Intelligent, and Super Computing: Concepts, Processor Types, and Application Scenarios

This article explains the three main types of computing power: general computing (通算), intelligent computing (智算), and supercomputing (超算). It details their definitions, typical processor architectures, and real‑world application scenarios across everyday office tasks, AI workloads, and large‑scale scientific research.

AI computing · Intelligent Computing · application scenarios

8 min read

Architects' Tech Alliance

Sep 17, 2024 · Industry Insights

Why Intelligent Computing Centers Are the Backbone of China’s AI Boom

The article explains what an Intelligent Computing Center (智算中心) is, analyzes its extensive upstream and downstream industry chain, describes the cutting‑edge AI computing architecture that powers it, forecasts massive growth in AI compute capacity by 2028, and outlines regional deployment strategies and service models such as leasing, data, operation, and talent cultivation.

AI Infrastructure · AI computing · Intelligent Computing Center

11 min read

Architects' Tech Alliance

Sep 12, 2024 · Industry Insights

Managing and Optimizing Large‑Scale AI Compute Clusters: Practical Insights

This article examines the key pain points of massive AI compute clusters—including heterogeneous hardware compatibility, efficient scheduling, training and inference acceleration, and fault‑tolerant operations—while presenting practical management and performance‑tuning strategies, a cloud‑native AI platform implementation, and future directions for the ecosystem.

AI computing · Operations · Performance Tuning

7 min read

Architects' Tech Alliance

Jul 6, 2024 · Industry Insights

Why Ethernet Struggles with AI Workloads and How Adaptive Routing Solves It

The article analyzes how AI‑driven elephant flows overload traditional Ethernet networks, causing long‑tail latency and victim‑flow congestion, and explains how adaptive routing, RDMA/RoCE features, advanced congestion‑control algorithms, and high‑capacity switch chips can mitigate these challenges.

AI computing · Adaptive routing · Elephant flow

7 min read

Architects' Tech Alliance

May 23, 2024 · Cloud Computing

Design and Comparison of High‑Performance Cloud Data Center Networks for AI Computing

This article analyzes traditional cloud data center network limitations for AI workloads and compares various high‑bandwidth, low‑latency architectures—including two‑layer and three‑layer fat‑tree designs, InfiniBand, and RoCE—providing best‑practice recommendations for building scalable, non‑blocking AI‑Pool networks.

AI computing · Fat-Tree · GPU clusters

12 min read

Architects' Tech Alliance

Jan 14, 2024 · Industry Insights

Can Chinese GPUs Close the Gap with NVIDIA? 2023 GPGPU Landscape Analysis

A 2023 GPGPU research framework analysis finds that while Chinese GPUs such as BR100 and TianGai100 can match or exceed the NVIDIA A100 in FP32, they still lag in FP64 and INT8 performance, and the domestic software ecosystem based on OpenCL trails far behind NVIDIA's CUDA, shaping short‑and‑term market dynamics.

AI computing · CUDA · China

6 min read

Tencent Cloud Developer

Apr 14, 2023 · Artificial Intelligence

Tencent Cloud's Next-Generation HCC High-Performance Computing Cluster for Large Model Training

Tencent Cloud's new HCC high‑performance computing cluster triples the previous generation's performance, combining 3.2 TB/s server bandwidth with Xingsha servers and NVIDIA H800 GPUs that deliver up to 1979 TFLOPS. Its Xingmai 3.2 T ETH RDMA network, TB‑level storage via COS + GooseFS, and multi‑form access (bare metal, cloud servers, containers, functions) enable efficient large‑model training.

AI computing · GPU Cluster · High-performance computing

9 min read

Baidu Geek Talk

Jul 18, 2022 · Artificial Intelligence

GPU Container Virtualization for AI Heterogeneous Computing: Architecture and Best Practices

The article surveys GPU container virtualization for AI heterogeneous computing, detailing utilization challenges, historical architectures, various virtualization methods, Baidu's dual-engine user- and kernel-space design with isolation and scheduling features, performance benefits, best‑practice scenarios, and deployment guidance, concluding with a technical Q&A.

AI computing · GPU virtualization · MPS

30 min read

IT Architects Alliance

May 23, 2022 · Industry Insights

Why RDMA Is Replacing TCP/IP for AI and High‑Performance Storage

The article analyzes how the AI boom and high‑performance SSD storage demand sub‑microsecond latency, exposing TCP/IP’s inherent context‑switch and CPU overhead, and explains why RDMA’s kernel‑bypass, zero‑copy design and 1 µs latency make it the preferred network stack for modern data‑center workloads despite challenges in Ethernet deployment.

AI computing · Data Center Network · Distributed storage

11 min read