Tag

Multi-GPU

1 view collected around this technical thread.

DataFunSummit
Feb 17, 2025 · Artificial Intelligence

NorthStar Large‑Model Training Framework: Architecture, APIs, Pipeline and Multi‑GPU Strategies

The article introduces the NorthStar large‑model training framework developed by DeWu, covering the challenges that motivated it, its pipeline architecture, API support, multi‑GPU training modes, multi‑level embedding storage, and hardware selection considerations, and closing with a brief Q&A on data versus model parallelism.

AI Framework · Embedding Storage · Large Model Training
0 likes · 9 min read
NetEase Media Technology Team
Aug 9, 2023 · Artificial Intelligence

GPU Model Inference Optimization Practices in NetEase News Recommendation System

The article outlines practical GPU inference optimizations for NetEase’s news recommendation system: model analysis with Netron, multi‑GPU parallelism, memory‑copy reduction, batch sizing, TensorRT conversion and tuning, custom plugins, and the GRPS serving framework, which together yield significant latency and utilization gains.

GPU inference · Multi-GPU · Performance
0 likes · 44 min read
Python Programming Learning Circle
Aug 23, 2021 · Artificial Intelligence

Efficient PyTorch Training Pipeline: Tips, Profiling, and Multi‑GPU Strategies

This article presents practical strategies for building high‑performance PyTorch training pipelines, covering bottleneck identification, efficient data loading, RAM‑based datasets, profiling tools, multi‑GPU training with DataParallel and DistributedDataParallel, custom loss implementation, and hardware‑vs‑software trade‑offs to accelerate deep‑learning workloads.

Custom Loss · DataLoader · Multi-GPU
0 likes · 13 min read
Liulishuo Tech Team
Mar 25, 2017 · Artificial Intelligence

Building a Student Model with TensorFlow: Deep Knowledge Tracing for Adaptive Learning

This article reviews how Liulishuo applied TensorFlow to implement a Deep Knowledge Tracing (DKT) student model for an adaptive learning system, covering the problem background, model architecture, TensorFlow implementation details, multi‑GPU training, and practical deployment considerations.

Deep Knowledge Tracing · Multi-GPU · RNN
0 likes · 12 min read