Wu Shixiong's Large Model Academy
Sep 26, 2025 · Artificial Intelligence

Crack Large-Model Interviews: Master Positional Encoding, Residuals, LayerNorm & FFN

Preparing for a large-model interview? This guide explains why interviewers probe seemingly minor components (positional encoding, residual connections, layer normalization, and feed-forward networks), covers each technique's purpose and variants, and shows how to answer confidently, with practical tips and a learning roadmap to boost your chances.

Artificial Intelligence · FFN · Interview Tips
0 likes · 8 min read
Baidu Intelligent Cloud Tech Hub
Jul 25, 2024 · Artificial Intelligence

How Transformers Work: From Tensor Basics to GPU Performance Analysis

This article gives a comprehensive, engineer-focused breakdown of the transformer architecture: tensor fundamentals, matrix multiplication, theoretical GPU compute, attention and FFN mechanics, quantitative parameter and FLOP analysis, performance metrics such as MFU, parallelism strategies, variant optimizations, and practice questions. Together these give clear insight into large-model efficiency and scaling.

Attention · FFN · GPU Performance
0 likes · 33 min read