Tagged articles

CPU acceleration

3 articles

Feb 2, 2024 · Cloud Native

Deploying a CPU‑Accelerated Stable Diffusion Service on Alibaba Cloud ACK

This guide shows how to deploy a cost‑effective, secure Stable Diffusion XL Turbo text‑to‑image service on an Alibaba Cloud ACK cluster using CPU‑only instances, Helm charts, and optional confidential TDX VM pools for protected inference.

Alibaba Cloud · CPU acceleration · Confidential Computing

10 min read

Meituan Technology Team

Jul 6, 2022 · Artificial Intelligence

Engineering Practices for Large-Scale Deep Learning Models in Meituan Takeaway Advertising

The article details Meituan's engineering journey from small DNNs to hundred‑gigabyte deep learning models for food‑delivery ads. It analyzes online latency and offline efficiency challenges, then presents distributed storage, CPU/GPU acceleration, OpenVINO, TensorRT, CodeGen, and data‑pipeline optimizations that improve throughput, memory usage, and sample‑building speed.

CPU acceleration · Deep Learning · GPU Acceleration

45 min read

Code DAO

Jan 15, 2022 · Artificial Intelligence

How Intel BF16 with IPEX and oneDNN Boosts PyTorch Performance

This article explains how the BF16 support developed by Intel and Facebook, combined with the Intel Extension for PyTorch (IPEX) and oneDNN, automates type and layout conversions and adds graph‑fusion optimizations, delivering 1.4× to 4.3× inference speedups and up to 2.4× training speedups on Xeon CPUs for models such as DLRM, BERT‑Large, and ResNeXt‑101‑32x4d.

BF16 · CPU acceleration · Deep Learning

13 min read