How COMI Achieves 32× Compression and Boosts Performance by 25 Points
The COMI framework introduces a marginal information gain metric and a coarse‑to‑fine two‑stage compression strategy that preserves both relevance and diversity, enabling 32× context reduction while lifting Exact Match on NaturalQuestions by nearly 25 points over the next‑best baseline and more than doubling inference speed.
Problem
When compressing long contexts (e.g., 32 K tokens down to 1 K), many existing methods retain clusters of highly similar tokens. The resulting redundancy creates “information internal competition”: near‑duplicate tokens compete for the model's attention, confusing it and causing a sharp performance drop.
Marginal Information Gain (MIG)
MIG quantifies the marginal value of a candidate unit (a token or a segment) as its relevance to the query minus its maximum similarity to any unit already selected:
MIG(x) = relevance(x, query) − max over already‑selected y of similarity(x, y)
The metric rewards units that are both relevant and novel, while penalizing those that duplicate information already chosen.
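For intuition, here is a minimal sketch of greedy selection under this metric, assuming units are represented as embedding vectors and similarity is cosine; the function names and the greedy loop are illustrative, not the paper's exact procedure.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity with a small epsilon to avoid division by zero.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def greedy_mig_select(units, query, budget):
    """Greedily pick `budget` units by Marginal Information Gain:
    relevance to the query minus the max similarity to anything
    already selected. `units` is a list of embedding vectors."""
    selected, remaining = [], list(range(len(units)))
    while remaining and len(selected) < budget:
        def mig(i):
            rel = cosine(units[i], query)
            red = max((cosine(units[i], units[j]) for j in selected),
                      default=0.0)
            return rel - red
        best = max(remaining, key=mig)   # highest marginal gain wins
        selected.append(best)
        remaining.remove(best)
    return selected  # indices of retained units, in selection order
```

Because the redundancy term is recomputed against the growing selected set, a near‑duplicate of an already‑chosen unit loses its gain even if it is highly query‑relevant.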
Coarse‑to‑fine Adaptive Compression (COMI)
Stage 1 – Coarse‑grained group reallocation
The document is split into equal‑length segments. Instead of applying a uniform compression rate, COMI computes a segment‑level MIG and dynamically adjusts the compression budget per segment. Segments with high information density and low redundancy receive a looser compression rate, whereas sparse or highly repetitive segments are compressed more aggressively. This ensures that the limited budget is allocated to high‑value regions.
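A sketch of how a global token budget could be reallocated from segment‑level MIG scores; the proportional weighting rule and the per‑segment floor are assumptions for illustration, not the paper's exact formula.

```python
import numpy as np

def reallocate_budget(segment_migs, total_budget, floor=0.02):
    """Split a global token budget across segments in proportion to
    segment-level MIG: dense, non-redundant segments keep more tokens,
    redundant ones are compressed harder. The proportional rule and
    the per-segment floor are illustrative choices."""
    migs = np.clip(np.asarray(segment_migs, dtype=float), 0.0, None)
    if migs.sum() == 0:                       # degenerate case: uniform split
        weights = np.full(len(migs), 1.0 / len(migs))
    else:
        weights = migs / migs.sum()
    weights = np.maximum(weights, floor)      # every segment keeps something
    weights /= weights.sum()
    budgets = np.floor(weights * total_budget).astype(int)
    budgets[np.argmax(weights)] += total_budget - budgets.sum()  # absorb rounding
    return budgets

# e.g. reallocate_budget([0.9, 0.1, 0.5], total_budget=1024) -> [615, 68, 341]
```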
Stage 2 – Fine‑grained token fusion
Within each segment, tokens are weighted by their token‑level MIG. High‑MIG tokens dominate the weighted fusion, while low‑MIG (redundant) tokens are naturally diluted. This avoids the “information dilution” problem of simple averaging and preserves diverse, critical details.
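Within a segment, the fusion step might look like the following sketch, assuming per‑token embeddings and token‑level MIG scores are available; the softmax temperature and the contiguous grouping are illustrative simplifications.

```python
import numpy as np

def fuse_tokens(token_embs, token_migs, n_out, temperature=0.5):
    """Fuse a segment's tokens into `n_out` vectors. Within each group,
    tokens are combined with softmax(MIG / temperature) weights, so
    high-MIG tokens dominate and redundant ones are diluted rather
    than averaged in at full strength."""
    token_embs = np.asarray(token_embs, dtype=float)
    migs = np.asarray(token_migs, dtype=float)
    groups = [g for g in np.array_split(np.arange(len(migs)), n_out) if len(g)]
    fused = []
    for idx in groups:
        w = np.exp((migs[idx] - migs[idx].max()) / temperature)  # stable softmax
        w /= w.sum()
        fused.append(w @ token_embs[idx])  # MIG-weighted average, not a plain mean
    return np.stack(fused)
```

The weighting is what distinguishes this from simple average pooling: a plain mean lets redundant low‑value tokens wash out critical details, while MIG weighting keeps the fused vector anchored to the informative ones.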
Empirical Results
Downstream performance
Under a 32× compression ratio, COMI with Qwen2‑7B achieves an Exact Match of 49.15 on NaturalQuestions, nearly 25 points above the next‑best baseline. On NarrativeQA (32 K‑token inputs), COMI retains the key reasoning nodes, demonstrating robustness under extreme compression.
For a 256 K‑context model (Qwen3‑4B), COMI after 32× compression reaches an F1 of 28.89 on NaturalQuestions, far above the 16.90 obtained when feeding the model the full, uncompressed context.
Efficiency
Inference speed more than doubles under 32× compression. The compression step adds only lightweight overhead (e.g., 2.76 s compression and 0.50 s generation on NarrativeQA), making the approach suitable for industrial deployment.
Conclusion
By upgrading the compression objective from “retain relevant fragments” to “retain relevant and diverse information,” the MIG metric and the coarse‑to‑fine strategy overcome the performance bottleneck of high‑compression scenarios, delivering compact representations that remain rich in information for large‑model inference.
Paper: https://arxiv.org/abs/2602.01719
Code: https://github.com/Twilightaaa/COMI