AI Frontier Lectures
Jun 20, 2025 · Artificial Intelligence
How GCA Achieves 1000× Length Generalization in Large Language Models
Ant Research introduces GCA, a grouped cross‑attention mechanism with causal retrieval that learns end‑to‑end to fetch relevant past chunks. It dramatically reduces memory usage and achieves over 1000× length generalization on long‑context language modeling tasks, with near‑constant inference memory and linear training cost.
AI research · Grouped Cross Attention · LLM efficiency

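To make the mechanism in the abstract concrete, here is a minimal toy sketch of chunk‑level causal retrieval followed by cross‑attention. This is not the paper's implementation: the chunk size, the mean‑pooled "landmark" summaries, the top‑k scoring, and the `gca_sketch` helper are all illustrative assumptions; the actual GCA design and training objective are described in the article below.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gca_sketch(tokens, chunk_size=4, top_k=2):
    """Toy causal retrieval + cross-attention over past chunks.

    tokens: (seq_len, d) array of token embeddings.
    Each token in chunk i cross-attends only to the top-k most
    relevant *earlier* chunks, so memory per step stays bounded.
    Assumed simplifications: mean-pooled chunk summaries as
    retrieval keys, dot-product scoring, no learned projections.
    """
    seq_len, d = tokens.shape
    n_chunks = seq_len // chunk_size
    chunks = tokens[: n_chunks * chunk_size].reshape(n_chunks, chunk_size, d)
    landmarks = chunks.mean(axis=1)        # one summary vector per chunk

    out = np.zeros_like(chunks)
    for i in range(n_chunks):
        if i == 0:
            continue                       # first chunk has no past
        # retrieval: score current chunk's summary against past landmarks
        scores = landmarks[:i] @ landmarks[i] / np.sqrt(d)
        k = min(top_k, i)
        picked = np.argsort(scores)[-k:]   # indices of top-k past chunks
        mem = chunks[picked].reshape(k * chunk_size, d)
        # cross-attention from current tokens to the retrieved memory
        attn = softmax(chunks[i] @ mem.T / np.sqrt(d), axis=-1)
        out[i] = attn @ mem
    return out.reshape(n_chunks * chunk_size, d)
```

Because only `top_k` chunks are ever attended to, the attended memory per step is constant in sequence length, which is the intuition behind the near‑constant inference memory claimed above.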