Tag

CUTLASS

0 views collected around this technical thread.

Alibaba Cloud Infrastructure
Alibaba Cloud Infrastructure
Mar 22, 2023 · Artificial Intelligence

CUTLASS Extreme Performance Optimization and Its Application in Alibaba's Recommendation System

At the GTC conference, the talk presents Alibaba Cloud’s heterogeneous computing platform and introduces the Open Deep Learning API (ODLA), then details how CUTLASS‑based operator fusion dramatically accelerates attention and MLP layers in large‑scale recommendation models, achieving multi‑fold performance gains in production.

CUTLASSDeep LearningGPU computing
0 likes · 5 min read
CUTLASS Extreme Performance Optimization and Its Application in Alibaba's Recommendation System