Tag

TorchRec

0 views collected around this technical thread.

DataFunSummit
DataFunSummit
Oct 5, 2024 · Artificial Intelligence

Optimizing TorchRec for Large‑Scale Recommendation Systems on PyTorch

This article details the performance‑focused optimizations applied to TorchRec, PyTorch's large‑scale recommendation system library, including CUDA graph capture, multithreaded kernel launches, pinned memory copies, and input‑distribution refinements that together achieve a 2.25× speedup on MLPerf DLRM‑DCNv2 across 16 DGX H100 nodes.

CUDA GraphGPU optimizationPyTorch
0 likes · 11 min read
Optimizing TorchRec for Large‑Scale Recommendation Systems on PyTorch
DataFunTalk
DataFunTalk
Apr 3, 2023 · Artificial Intelligence

Large‑Scale Recommendation System Training with TorchRec and Dynamic Embedding

This article explains how Tencent’s AI team leverages the PyTorch‑based TorchRec library and a custom dynamic embedding solution to train billion‑scale recommendation models efficiently, detailing the benefits of TorchRec, GPU embedding, optimized kernels, embedding partition strategies, experimental results, and practical deployment guidance.

Dynamic EmbeddingGPU EmbeddingLarge-Scale Training
0 likes · 15 min read
Large‑Scale Recommendation System Training with TorchRec and Dynamic Embedding