How MaIR Advances Image Restoration with a Locality‑Preserving Mamba Architecture
This article summarizes MaIR, a Mamba‑based image restoration model designed to preserve locality and continuity. It covers the architecture, scanning strategies, loss functions, super‑resolution and denoising experiments, and the ablation study, and links to the arXiv paper and the GitHub source code.
Paper title: MaIR: A Locality- and Continuity-Preserving Mamba for Image Restoration
Paper link: https://arxiv.org/pdf/2412.20066
Source code: https://github.com/XLearning-SCU/2025-CVPR-MaIR
Method
Scanning strategies: four visual Mamba scanning patterns are compared (horizontal, vertical, diagonal, and reverse‑diagonal).
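The article does not define the four patterns precisely, so the following is only a minimal sketch of one plausible reading: each pattern is an ordering of the (row, column) positions of an H×W feature map, which is then used to flatten the map into a 1‑D sequence. The function names and the exact traversal direction of each diagonal are assumptions made for illustration, not taken from the paper or its code.

```python
def horizontal_scan(h, w):
    """Row-major order: left to right, then top to bottom."""
    return [(r, c) for r in range(h) for c in range(w)]


def vertical_scan(h, w):
    """Column-major order: top to bottom, then left to right."""
    return [(r, c) for c in range(w) for r in range(h)]


def diagonal_scan(h, w):
    """Visit lines with constant r + c (each runs top-right to bottom-left)."""
    order = []
    for s in range(h + w - 1):
        for r in range(h):
            c = s - r
            if 0 <= c < w:
                order.append((r, c))
    return order


def reverse_diagonal_scan(h, w):
    """Visit lines with constant r - c, starting at the top-right corner."""
    order = []
    for d in range(-(w - 1), h):
        for r in range(h):
            c = r - d
            if 0 <= c < w:
                order.append((r, c))
    return order


def flatten(grid, order):
    """Flatten a 2-D nested list into a 1-D sequence following `order`."""
    return [grid[r][c] for r, c in order]


grid = [[0, 1, 2],
        [3, 4, 5]]
print(flatten(grid, horizontal_scan(2, 3)))
print(flatten(grid, vertical_scan(2, 3)))
print(flatten(grid, diagonal_scan(2, 3)))
print(flatten(grid, reverse_diagonal_scan(2, 3)))
```

In the row‑major and column‑major orders, the end of one line and the start of the next are far apart in the image but adjacent in the sequence. This is the kind of discontinuity that a continuity‑preserving scan aims to avoid.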
Overall architecture: three linear stages – shallow feature extraction (3×3 convolution), deep feature extraction (multiple RMG blocks), and reconstruction. For super‑resolution the reconstruction layer uses a 3×3 convolution followed by pixel‑shuffle; for denoising it uses a 3×3 convolution with a residual connection. Each RMG block contains several RMB+Conv modules; RMB is a transformer‑style block whose core is a VMM (dual‑branch Mamba). The VMM embeds the proposed MaIRM module.
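To make the pixel‑shuffle step in the super‑resolution reconstruction layer concrete, here is a minimal pure‑Python sketch of the rearrangement on nested lists. It follows the common channel‑to‑space convention (the one used by torch.nn.PixelShuffle); whether MaIR's code uses that layer exactly is an assumption, and the function name is illustrative only.

```python
def pixel_shuffle(x, r):
    """Rearrange a [C*r*r][H][W] nested list into [C][H*r][W*r].

    Each group of r*r input channels supplies the r x r sub-pixel grid
    of one output channel, so spatial resolution grows by a factor of r.
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    c_out = c_in // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # vertical sub-pixel offset
            for j in range(r):      # horizontal sub-pixel offset
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


# Four 1x1 channels become one 2x2 channel for an upscale factor of 2.
features = [[[1]], [[2]], [[3]], [[4]]]
print(pixel_shuffle(features, 2))
```

The preceding 3×3 convolution is what produces the C·r² channels, so the upsampling itself involves no interpolation: the network learns the sub‑pixel values directly.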
MaIRM module: (a) NSS (Nested S‑shaped Scanning) flattens 2‑D features into four 1‑D sequences by scanning in four directions along a nested S‑shaped path; (b) SSO (Self‑Similarity Operator) captures long‑range dependencies across the sequences; (c) SSA (Sequence Shuffle Attention) aggregates the four sequences through pooling, channel shuffle, group convolution, unshuffle, chunked attention weighting, and weighted summation to produce the final output.
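The nested S‑shaped order can be illustrated with a small sketch. This is a simplified interpretation, not the authors' implementation: the feature map is split into horizontal stripes, each stripe is traversed column by column in a snake pattern, and the horizontal direction alternates from stripe to stripe. The stripe height, the way the four directions are derived (forward, reversed, and their transposed counterparts), and all function names are assumptions.

```python
def nested_s_scan(h, w, stripe=2):
    """Snake through each horizontal stripe, alternating stripe direction."""
    order = []
    for k, top in enumerate(range(0, h, stripe)):
        rows = list(range(top, min(top + stripe, h)))
        # Even stripes run left to right, odd stripes right to left.
        cols = range(w) if k % 2 == 0 else range(w - 1, -1, -1)
        for i, c in enumerate(cols):
            # Alternate down/up inside the stripe so columns connect.
            col_rows = rows if i % 2 == 0 else rows[::-1]
            order.extend((r, c) for r in col_rows)
    return order


def four_direction_orders(h, w, stripe=2):
    """Assumed four directions: row-wise, column-wise, and both reversed."""
    row_wise = nested_s_scan(h, w, stripe)
    col_wise = [(r, c) for c, r in nested_s_scan(w, h, stripe)]
    return [row_wise, row_wise[::-1], col_wise, col_wise[::-1]]


def is_continuous(order):
    """True if every step moves to a directly adjacent pixel."""
    return all(abs(r1 - r2) + abs(c1 - c2) == 1
               for (r1, c1), (r2, c2) in zip(order, order[1:]))


order = nested_s_scan(4, 3, stripe=2)
print(order)
print(is_continuous(order))
```

Within a stripe, consecutive sequence elements are always neighbouring pixels, which is the locality and continuity property the method name refers to. In this simplified version the hand‑off between stripes is also adjacent only when the number of columns is odd; the paper's handling of stripe boundaries is not described in this summary.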
Loss functions: L1 loss for super‑resolution tasks; Charbonnier loss (a smooth L1 variant) for denoising tasks.
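For reference, the two losses can be written in a few lines. This is a minimal sketch over flat lists of pixel values; the Charbonnier loss is taken in its usual form sqrt((x - y)^2 + eps^2), and the value of eps below is an assumption, since the summary does not state the one used in the paper.

```python
import math


def l1_loss(pred, target):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def charbonnier_loss(pred, target, eps=1e-3):
    """Smooth L1 variant: mean of sqrt(diff^2 + eps^2).

    eps is an assumed default; it keeps the gradient finite at diff = 0.
    """
    return sum(math.sqrt((p - t) ** 2 + eps ** 2)
               for p, t in zip(pred, target)) / len(pred)


pred = [0.2, 0.5, 0.9]
target = [0.0, 0.5, 1.0]
print(l1_loss(pred, target))
print(charbonnier_loss(pred, target))
```

For errors much larger than eps the Charbonnier loss behaves like L1, while near zero it is smooth and approaches eps rather than zero.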
Experiments
Quantitative evaluation
Super‑resolution: reported PSNR/SSIM improvements on standard benchmarks; visual comparison images illustrate sharper textures.
Denoising: quantitative gains across multiple noise levels; visual results show reduced artifacts.
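Since the quantitative results are reported in PSNR, a short sketch of how that metric is computed may help. It uses the standard definition 10·log10(MAX² / MSE) on flat lists of pixel values. The benchmark protocol (for example, which colour channel is evaluated or whether borders are cropped) is not given in this summary, so the sketch makes no attempt to reproduce it.

```python
import math


def psnr(pred, target, max_val=255.0):
    """Peak signal-to-noise ratio in dB for two equal-length sequences."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)


clean = [100, 120, 140, 160]
restored = [101, 119, 141, 159]  # every pixel off by one grey level
print(round(psnr(restored, clean), 2))
```

Because the scale is logarithmic, halving the mean squared error raises PSNR by about 3 dB, which is why gains of a few tenths of a dB between restoration models are usually treated as meaningful.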
Quality assessment
Super‑resolution visual quality: subjective evaluation demonstrates clearer edges and better texture fidelity.
Ablation study
NSS: removing the nested S‑shaped scanning reduces performance, confirming its role in preserving locality.
SSA: disabling the SSA block harms the model’s ability to fuse long‑range information across the four scanned sequences.
Bandwidth: varying the channel bandwidth of the VMM shows a trade‑off between accuracy and computational cost.
Summary and Insights
Refining scanning strategies and introducing dedicated modules (NSS, SSO, SSA) in Mamba‑based image restoration models effectively preserves locality and continuity, leading to higher restoration quality.
Compared with the earlier MambaIR model, MaIR achieves superior PSNR/SSIM on both super‑resolution and denoising benchmarks, as illustrated in the comparative figure.
Reference
@inproceedings{MaIR,
title={MaIR: A Locality- and Continuity-Preserving Mamba for Image Restoration},
author={Li, Boyun and Zhao, Haiyu and Wang, Wenxin and Hu, Peng and Gou, Yuanbiao and Peng, Xi},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
year={2025},
address={Nashville, TN},
month={jun}
}
