A Curated Tour of Mamba Papers: 25 Cutting‑Edge State‑Space Model Innovations
This article presents a GitHub‑hosted collection of 25 recent research papers on Mamba and its variants, summarizing each work’s core contributions across sequence modeling, vision, medical imaging, graph analysis, and multimodal tasks, and highlighting their performance gains over prior methods.
GitHub repository: https://github.com/yyyujintang/Awesome-Mamba-Papers
Key Papers on Mamba and Its Variants
Mamba: Linear‑Time Sequence Modeling with Selective State Spaces – Introduces the Mamba architecture, whose selection mechanism makes the state‑space parameters functions of the input, letting the model selectively propagate or forget information along the sequence. Combined with a hardware‑aware parallel scan, this enables linear‑time processing of very long sequences while preserving long‑range dependencies, with strong results on language, audio, and genomics benchmarks.
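To make the selection mechanism concrete, here is a minimal NumPy sketch of a selective SSM recurrence. It is illustrative only, not the paper's fused CUDA kernel: the projection matrices `W_delta`, `W_B`, and `W_C` are hypothetical names, and the discretization is simplified.

```python
import numpy as np

def selective_scan(x, A, W_delta, W_B, W_C):
    """Toy selective SSM recurrence (illustrative sketch, not Mamba's kernel).

    x: (T, d) input sequence; A: (d, n) fixed state matrix.
    W_delta, W_B, W_C are hypothetical projections that make the step size
    and the B/C matrices input-dependent -- that input dependence is the
    "selective" part of the architecture.
    """
    T, d = x.shape
    n = A.shape[1]
    h = np.zeros((d, n))                              # one n-dim state per channel
    y = np.zeros((T, d))
    for t in range(T):
        delta = np.log1p(np.exp(x[t] @ W_delta))      # softplus step size, (d,)
        B = x[t] @ W_B                                # input-dependent input matrix, (n,)
        C = x[t] @ W_C                                # input-dependent readout, (n,)
        A_bar = np.exp(delta[:, None] * A)            # zero-order-hold discretization
        B_bar = delta[:, None] * B[None, :]           # simplified Euler discretization
        h = A_bar * h + B_bar * x[t][:, None]         # selective state update
        y[t] = h @ C                                  # readout
    return y

rng = np.random.default_rng(0)
T, d, n = 16, 4, 8
x = rng.standard_normal((T, d))
A = -np.abs(rng.standard_normal((d, n)))              # negative real part for stability
y = selective_scan(x, A,
                   rng.standard_normal((d, d)),
                   rng.standard_normal((d, n)),
                   rng.standard_normal((d, n)))
```

Because `A_bar` and `B_bar` depend on the current input, the model can drive the step size toward zero to ignore a token or make it large to reset the state, which a fixed (time-invariant) SSM cannot do.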
MoE‑Mamba: Efficient Selective State Space Models with Mixture of Experts – Interleaves Mamba layers with Mixture‑of‑Experts (MoE) feed‑forward layers. The hybrid matches the quality of vanilla Mamba in markedly fewer training steps while retaining Mamba's inference‑time efficiency on sequence‑modeling tasks.
U‑Mamba: Enhancing Long‑Range Dependency for Biomedical Image Segmentation – Embeds a Mamba module into a U‑shaped network for biomedical image segmentation. The selective state‑space component strengthens long‑range context, yielding higher segmentation accuracy and robustness compared with traditional U‑Nets.
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model – Employs two Mamba modules that process image patches left‑to‑right and right‑to‑left, achieving bidirectional context aggregation. Matches or exceeds Transformer‑based models on classification, detection, and segmentation while using less compute and memory.
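The bidirectional aggregation in Vision Mamba can be sketched in a few lines. This is an illustrative stand‑in: `scan_fn` represents a causal Mamba block (here replaced by a cumulative sum for demonstration), and summation is one simple way to fuse the two directions.

```python
import numpy as np

def bidirectional_scan(patches, scan_fn):
    """Sketch of bidirectional context aggregation over a patch sequence.

    scan_fn is a placeholder for a causal sequence model (e.g. a Mamba
    block): run it left-to-right and right-to-left, then fuse the results
    so every position sees context from both sides.
    """
    forward = scan_fn(patches)
    backward = scan_fn(patches[::-1])[::-1]   # reverse, scan, reverse back
    return forward + backward

# Toy demo: 6 "patches" with 1-dim features, cumulative sum as the stand-in scan.
patches = np.arange(6, dtype=float).reshape(6, 1)
fused = bidirectional_scan(patches, lambda s: np.cumsum(s, axis=0))
```

With a purely causal scan, the first patch would see no context at all; after the backward pass is added, every position aggregates information from the whole sequence, which is what classification and detection heads need.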
VMamba: Visual State Space Model – Builds a hierarchical visual backbone around a 2‑D selective‑scan module that traverses image patches along four scanning routes, bridging the gap between 1‑D sequence models and 2‑D images. Competitive with CNN and Transformer backbones on classification, detection, and segmentation at lower computational cost.
SegMamba: Long‑Range Sequential Modeling Mamba for 3D Medical Image Segmentation – Converts 3‑D volumes into 1‑D sequences and applies Mamba for segmentation, effectively capturing long‑range spatial dependencies and achieving superior performance on multiple 3‑D medical datasets.
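The "3‑D volume to 1‑D sequence" step underlying SegMamba‑style models can be illustrated with plain array operations. This is a simplified sketch under the assumption of non‑overlapping cubic patches in raster order; the actual model uses a learned embedding stem rather than raw reshaping.

```python
import numpy as np

def volume_to_sequence(vol, patch=2):
    """Split a (D, H, W) volume into non-overlapping patch**3 cubes and
    flatten them in raster order into a (num_tokens, patch**3) sequence.
    Illustrative preprocessing sketch, not the model's learned stem."""
    D, H, W = vol.shape
    v = vol.reshape(D // patch, patch, H // patch, patch, W // patch, patch)
    v = v.transpose(0, 2, 4, 1, 3, 5)     # bring the cube grid indices first
    return v.reshape(-1, patch ** 3)      # one token per cube

vol = np.arange(4 * 4 * 4).reshape(4, 4, 4)   # toy 4x4x4 volume
seq = volume_to_sequence(vol, patch=2)        # eight 2x2x2 cubes -> (8, 8)
```

Once the volume is a token sequence, Mamba's linear‑time scan can relate voxels that are far apart in the volume without the quadratic cost a 3‑D attention layer would incur.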
MambaByte: Token‑Free Selective State Space Model – Operates directly on byte‑level inputs, eliminating tokenization overhead. A byte‑level Mamba demonstrates strong generation capabilities for text, code, and images.
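"Token‑free" simply means the model's vocabulary is the 256 possible byte values, so any text (or binary data) maps directly to a model‑ready sequence with no learned tokenizer:

```python
# Token-free input: encode text straight to bytes -- no tokenizer,
# no vocabulary file, and any UTF-8 string round-trips losslessly.
text = "Mamba"
byte_tokens = list(text.encode("utf-8"))
print(byte_tokens)  # [77, 97, 109, 98, 97]
```

The trade‑off is much longer sequences than subword tokenization produces, which is exactly where Mamba's linear‑time scan pays off over quadratic attention.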
Vivim: Video Vision Mamba for Medical Video Object Segmentation – Transforms video frames into a 1‑D sequence and applies Mamba to segment objects, handling dynamic changes and occlusions. Achieves state‑of‑the‑art results on medical video segmentation benchmarks.
MambaMorph: Mamba‑Based Backbone with Contrastive Feature Learning for Deformable MR‑CT Registration – Uses Mamba as a feature extractor for MR and CT images, coupled with a contrastive loss to align modalities despite intensity differences and deformations, improving registration accuracy and efficiency.
LOCOST: State‑Space Models for Long Document Abstractive Summarization – Employs Mamba as an encoder for lengthy documents and a Transformer decoder for summarization, reducing redundancy and enhancing coherence, outperforming existing long‑document summarizers.
Graph‑Mamba: Towards Long‑Range Graph Sequence Modeling with Selective State Spaces – Integrates a Mamba block with input‑dependent node selection into the GraphGPS framework, replacing attention to capture long‑range dependencies in graphs, with strong accuracy and reduced cost on long‑range graph benchmarks.
VM‑UNet: Vision Mamba UNet for Medical Image Segmentation – Combines a Vision Mamba encoder with a U‑Net decoder, leveraging long‑range context to improve segmentation precision on various medical imaging datasets.
Swin‑UMamba: Mamba‑Based UNet with ImageNet Pre‑training – Introduces a Mamba‑based UNet pretrained on ImageNet, demonstrating that large‑scale pre‑training benefits medical image segmentation performance.
nnMamba: 3D Biomedical Image Segmentation, Classification and Landmark Detection with State Space Model – Integrates CNNs with state‑space models by inserting SSM blocks into residual layers, enabling both local feature extraction and modeling of complex dependencies.
U‑shaped Vision Mamba for Single Image Dehazing (UVM‑Net) – Proposes a lightweight U‑shaped Vision Mamba architecture tailored for efficient single‑image dehazing.
Can Mamba Learn How to Learn? A Comparative Study on In‑Context Learning Tasks – Benchmarks Mamba‑based SSMs on in‑context learning scenarios and compares them with Transformer models, analyzing strengths and limitations.
Mamba‑ND: Selective State Space Modeling for Multi‑Dimensional Data – Extends the Mamba design to handle arbitrary‑dimensional inputs, providing a unified framework for multi‑dimensional data modeling.
FD‑Vision Mamba for Endoscopic Exposure Correction (FDVM‑Net) – Introduces a frequency‑domain Vision Mamba that reconstructs endoscopic images in the frequency domain to correct exposure artifacts.
Semi‑Mamba‑UNet: Pixel‑Level Contrastive Cross‑Supervised Visual Mamba‑based UNet for Semi‑Supervised Medical Image Segmentation – Merges a visual Mamba UNet with a semi‑supervised learning framework, leveraging contrastive pixel‑level supervision.
P‑Mamba: Perona‑Malik Diffusion Combined with Mamba for Efficient Pediatric Echocardiographic Left Ventricular Segmentation – Combines diffusion‑based smoothing with Mamba to achieve fast and accurate segmentation of pediatric echocardiogram left ventricles.
Graph Mamba: Learning on Graphs with State Space Models – Proposes Graph Mamba Networks, a framework that serializes graph neighborhoods into sequences and applies selective SSMs, enabling effective and scalable learning on graph‑structured data.
Hierarchical State Space Models for Continuous Sequence‑to‑Sequence Modeling (HiSS) – Proposes a hierarchical state‑space architecture for continuous sequence prediction, advancing the state of the art in seq2seq tasks.
PointMamba: A Simple State Space Model for Point Cloud Analysis – Presents a lightweight SSM tailored for point‑cloud processing, achieving competitive results with minimal complexity.
Weak‑Mamba‑UNet: Visual Mamba Enhances CNN and ViT Backbones for Scribble‑Based Medical Image Segmentation – Shows that integrating visual Mamba modules improves both CNN and Vision‑Transformer backbones under scribble‑supervised segmentation.
Pan‑Mamba: Effective Pan‑Sharpening with State Space Model – Introduces a state‑space‑based method for high‑quality pan‑sharpening of remote‑sensing imagery.
