Tag

GPU performance

Alibaba Cloud Infrastructure
Apr 24, 2024 · Artificial Intelligence

Evolution and Challenges of AI Infrastructure: Scaling Large Models on Cloud GPUs

In this talk from the 2024 China Generative AI Conference, Li Peng outlines the escalating computational demands of large‑model training and inference, identifies power, memory and communication walls, and presents Alibaba Cloud’s DeepGPU solutions and best‑practice strategies for scaling AI workloads on cloud GPUs.

AI infrastructure · DeepGPU · GPU performance
Baidu Tech Salon
Sep 20, 2023 · Artificial Intelligence

Live Session: Introduction to NVIDIA Nsight Systems and Compute for AI Performance Analysis

In this live session, NVIDIA senior deep-learning solutions architect Zhai Jian demonstrates how to use Nsight Systems and Nsight Compute to analyze a simple neural-network training workload, accelerate BERT with mixed precision, and examine matrix-transpose kernels. The announcement includes a registration QR code and a detailed event schedule.

AI tools · BERT · GPU performance
DaTaobao Tech
Sep 7, 2022 · Artificial Intelligence

Online Deep Learning (ODL) Model Optimization for Real‑Time Recommendation

The team improved real-time recommendation by redesigning its TensorFlow graphs: applying constant folding, adding a custom CallGraphOP cache, simplifying the dense layer, and making the graphs compatible with CUDA Graph. These changes raised single-machine throughput by about 40%, lifted GPU utilization from 30% to 43%, cut latency, and saved roughly 30% of hardware resources.

CUDA Graph · GPU performance · Online Deep Learning