Data Party THU
May 16, 2026 · Artificial Intelligence
SubQ Beats Transformers: 12‑Million‑Token Context Model at Only 5% of Opus Cost
This article analyzes SubQ, a new LLM architecture that uses Subquadratic Sparse Attention (SSA) to reach a 12‑million‑token context window with linear compute scaling. It reports up to a 52× speedup at just 5% of the cost of Opus while matching dense‑attention performance on long‑context benchmarks.
Benchmark · SSA · SubQ
14 min read
