AI Frontier Lectures
Mar 31, 2025 · Artificial Intelligence
How Anthropic’s Path Tracing Reveals the Inner Workings of Claude 3.5 Haiku
Anthropic’s recent paper introduces a path-tracing technique that uses cross-layer transcoders and attribution graphs to build sparse visualizations of the decision-making process of the Claude 3.5 Haiku large language model. It reports Pareto-optimal improvements and a four-stage reverse-engineering framework, while acknowledging the method's current limitations.
Anthropic · Attribution Graph · Claude 3.5
