Pipeline-Aware Offloading & Balanced Checkpointing Accelerate LLM Training
Researchers on Kwai’s large-model team present a training system that combines pipeline-parallelism-aware activation offloading with a compute-memory balanced checkpointing strategy. The combination delivers lossless acceleration of large language model training, reaching up to 42.7% MFU on 256 NVIDIA H800 GPUs while reducing activation memory usage.
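The summary does not describe how the balanced checkpointing decisions are made. As an illustration of the underlying compute-memory trade-off only, here is a toy greedy sketch: each layer either stores its activations (costing memory) or recomputes them in the backward pass (costing compute), and layers are checkpointed until stored activations fit a budget. The layer costs, the budget, and the greedy ratio heuristic are all hypothetical, not Kwai's actual algorithm.

```python
# Toy sketch (assumption, not the paper's method): trade recompute
# FLOPs for activation memory under a fixed memory budget.

def select_checkpoints(mem, recompute_cost, budget):
    """Pick layers whose activations are dropped and recomputed.

    Greedily checkpoints the layer with the most memory saved per
    unit of recompute cost until total stored memory fits `budget`.
    Returns (checkpointed_layer_indices, extra_compute, final_memory).
    """
    stored = sum(mem)          # memory if every activation is kept
    chosen, extra = set(), 0.0
    # Rank layers by recompute cost per GB saved (cheapest trade first).
    order = sorted(range(len(mem)), key=lambda i: recompute_cost[i] / mem[i])
    for i in order:
        if stored <= budget:
            break
        chosen.add(i)          # recompute this layer in backward
        stored -= mem[i]
        extra += recompute_cost[i]
    return chosen, extra, stored

if __name__ == "__main__":
    mem = [4.0, 2.0, 6.0, 1.0]    # hypothetical GB of activations per layer
    cost = [3.0, 1.0, 2.0, 5.0]   # hypothetical relative recompute cost
    print(select_checkpoints(mem, cost, budget=6.0))
```

A real system would couple this choice with offloading (moving activations to host memory over the pipeline bubbles) rather than only recomputing, which is the direction the blurb's "pipeline-parallelism-aware offloading" suggests.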