Beyond Transformers: Exploring Post‑Transformer Architectures for Long‑Sequence Modeling
This article reviews the emerging post‑Transformer research landscape, covering linear state‑space models, efficient attention approximations, MLP/convolutional/RNN hybrids, and sparse and causal attention mechanisms, and it outlines future trends that may complement or replace the classic Transformer architecture for ultra‑long‑sequence modeling.
