Tagged articles
12 articles
Java Tech Enthusiast
Oct 6, 2025 · Artificial Intelligence

How China’s New GPU Startup Moore Threads Is Accelerating the AI Race

Amid US export restrictions, China’s five‑year‑old GPU pioneer Moore Threads is racing to fill the high‑end GPU gap. The article details the technology’s role in AI, the company’s ecosystem strategy, and the significance of its fast‑track IPO for the domestic semiconductor and AI compute landscape.

AI computing · China · GPU
10 min read
Architects' Tech Alliance
May 12, 2025 · Artificial Intelligence

Comparison of Fat-Tree, Dragonfly, and Torus Network Topologies for AI and High‑Performance Computing

The article reviews Fat‑Tree, Dragonfly, and Torus network topologies, analyzing their bandwidth, scalability, latency, routing algorithms, and cost trade‑offs for AI‑driven high‑performance computing clusters, and highlights each design's strengths and limitations in large‑scale deployments.

AI computing · Dragonfly · Fat-Tree
12 min read
Alibaba Cloud Infrastructure
Apr 18, 2025 · Artificial Intelligence

Alibaba Cloud Showcases Optical Interconnect Innovations at OFC 2025 50th Anniversary

At the OFC 2025 50th anniversary in San Francisco, Alibaba Cloud presented cutting‑edge optical interconnect research and solutions for AI computing and modern data‑center networks, highlighted by invited talks, breakthrough demos, and two data‑driven QoT estimation papers co‑authored with Hong Kong Polytechnic University.

AI computing · Data center · Photonic Integration
6 min read
Big Data Technology Architecture
Mar 8, 2025 · Artificial Intelligence

Understanding General, Intelligent, and Super Computing: Concepts, Processor Types, and Application Scenarios

This article explains the three main types of computing power—general (通算), intelligent (智算), and supercomputing (超算)—detailing their definitions, typical processor architectures, and real‑world application scenarios across everyday office tasks, AI workloads, and large‑scale scientific research.

AI computing · Intelligent Computing · Supercomputing
8 min read
Architects' Tech Alliance
Sep 17, 2024 · Industry Insights

Why Intelligent Computing Centers Are the Backbone of China’s AI Boom

The article explains what an Intelligent Computing Center (智算中心) is, analyzes its extensive upstream and downstream industry chain, describes the cutting‑edge AI computing architecture that powers it, forecasts massive growth in AI compute capacity by 2028, and outlines regional deployment strategies and service models such as leasing, data, operation, and talent cultivation.

AI Infrastructure · AI computing · Intelligent Computing Center
11 min read
Architects' Tech Alliance
Sep 12, 2024 · Industry Insights

Managing and Optimizing Large‑Scale AI Compute Clusters: Practical Insights

This article examines the key pain points of massive AI compute clusters—including heterogeneous hardware compatibility, efficient scheduling, training and inference acceleration, and fault‑tolerant operations—while presenting practical management and performance‑tuning strategies, a cloud‑native AI platform implementation, and future directions for the ecosystem.

AI computing · Cluster Management · Operations
7 min read
Architects' Tech Alliance
Jul 6, 2024 · Industry Insights

Why Ethernet Struggles with AI Workloads and How Adaptive Routing Solves It

The article analyzes how AI‑driven elephant flows overload traditional Ethernet networks, causing long‑tail latency and victim‑flow congestion, and explains how adaptive routing, RDMA/RoCE features, advanced congestion‑control algorithms, and high‑capacity switch chips can mitigate these challenges.

AI computing · Adaptive routing · Elephant flow
7 min read
Architects' Tech Alliance
May 23, 2024 · Cloud Computing

Design and Comparison of High‑Performance Cloud Data Center Networks for AI Computing

This article analyzes traditional cloud data center network limitations for AI workloads and compares various high‑bandwidth, low‑latency architectures—including two‑layer and three‑layer fat‑tree designs, InfiniBand, and RoCE—providing best‑practice recommendations for building scalable, non‑blocking AI‑Pool networks.

AI computing · Fat-Tree · GPU clusters
12 min read
Tencent Cloud Developer
Apr 14, 2023 · Artificial Intelligence

Tencent Cloud's Next-Generation HCC High-Performance Computing Cluster for Large Model Training

Tencent Cloud's new HCC high‑performance computing cluster triples the performance of the previous generation. It offers 3.2 TB/s server bandwidth and uses Xingsha servers and NVIDIA H800 GPUs delivering up to 1979 TFLOPS, while its Xingmai 3.2 T ETH RDMA network, TB‑level storage via COS + GooseFS, and multiple access forms (bare metal, cloud servers, containers, functions) enable efficient large‑model training.

AI computing · GPU cluster · High‑performance computing
9 min read
Baidu Geek Talk
Jul 18, 2022 · Artificial Intelligence

GPU Container Virtualization for AI Heterogeneous Computing: Architecture and Best Practices

The article surveys GPU container virtualization for AI heterogeneous computing, detailing utilization challenges, historical architectures, various virtualization methods, Baidu's dual-engine user- and kernel-space design with isolation and scheduling features, performance benefits, best‑practice scenarios, and deployment guidance, concluding with a technical Q&A.

AI computing · GPU virtualization · MPS
30 min read
IT Architects Alliance
May 23, 2022 · Industry Insights

Why RDMA Is Replacing TCP/IP for AI and High‑Performance Storage

The article analyzes how the AI boom and high‑performance SSD storage demand sub‑microsecond latency, exposing TCP/IP’s inherent context‑switch and CPU overhead, and explains why RDMA’s kernel‑bypass, zero‑copy design and 1 µs latency make it the preferred network stack for modern data‑center workloads despite challenges in Ethernet deployment.

AI computing · Data Center Network · Low latency
11 min read