Tagged articles
1 article
AI Frontier Lectures
Apr 12, 2025 · Artificial Intelligence

How ByteDance Scales Attn/MoE: Cost Models, Mesh Communication, and Network Hacks

The article analyzes ByteDance's MegaScale‑Infer paper, detailing micro‑batching, M:N Attn‑MoE ratios, cost‑driven constraint search, communication redesign with Mesh All‑2‑All, network latency challenges, and innovative NIC and routing solutions for large‑scale mixture‑of‑experts inference.

AI inference · ByteDance · Cost Optimization
7 min read