Dimension Stretching Lets One CNN Tackle Diverse Degradations in Image Super‑Resolution
Recent advances in CNN‑based single image super‑resolution assume bicubic down‑sampling, limiting real‑world performance; this paper introduces a dimension‑stretching strategy that feeds blur kernels and noise levels into a CNN, enabling a single model (SRMD) to efficiently handle multiple, even spatially varying, degradation types with strong quantitative and visual results.
1. Introduction
Single image super‑resolution (SISR) aims to recover a high‑resolution (HR) image from a low‑resolution (LR) input. Conventional CNN‑based SISR methods assume that LR images are generated by bicubic down‑sampling of HR images, which leads to poor results when real degradations differ from this assumption and prevents a single model from handling multiple degradation types.
Recent works have shown that accurate modeling of blur kernels is crucial for SISR, yet few CNN approaches incorporate kernel information. This motivates the problem of designing a non‑blind SISR model capable of handling various degradation scenarios.
2. Method
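We first analyze SISR under a maximum‑a‑posteriori (MAP) framework, which makes the roles of the data‑fidelity and prior terms explicit. Assuming the degradation model \(y = (x \otimes k)\downarrow_s + n\), where \(x\) is the HR image, \(y\) the LR image, \(k\) the blur kernel, \(\downarrow_s\) down‑sampling by scale factor \(s\), and \(n\) Gaussian noise with level \(\sigma\), the MAP estimate can be expressed as:
\[ \hat{x} = \arg\min_x \frac{1}{2\sigma^2} \left\| (x \otimes k)\downarrow_s - y \right\|^2 + \lambda \Phi(x) \]
where the first (likelihood, or data‑fidelity) term models the degradation process and the second (prior) term \(\Phi(x)\), weighted by \(\lambda\), encodes image priors.
The estimated HR image must therefore satisfy both the degradation model and the image prior.
For non‑blind SISR, the solution depends on the LR image, the blur kernel, the noise level, and the weighting parameter \(\lambda\). To enable a CNN to handle multiple degradations, we therefore propose feeding the blur kernel \(k\) and noise level \(\sigma\) as additional inputs.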
Because the dimensions of LR images, kernels, and noise levels differ, we introduce a dimension‑stretching strategy. The blur kernel is vectorized and projected onto a \(t\)‑dimensional space by PCA, then concatenated with the noise level to form a \((t+1)\)‑dimensional vector \(v\). This vector is stretched into a tensor of size \(W \times H \times (t+1)\), called a Degradation Map, where \(W \times H\) is the spatial size of the LR image and every element of the \(i\)‑th channel equals \(v_i\).
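The stretching step can be sketched in NumPy as follows. This is a minimal illustration, not the authors' code: the function names, the isotropic Gaussian kernels used to fit the PCA basis, the 15×15 kernel size, the choice t = 15, and the mean‑centering before projection are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(size, width):
    """Isotropic Gaussian kernel of shape (size, size), normalized to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * width ** 2))
    return k / k.sum()

def fit_pca_basis(kernels, t):
    """Learn a t-dimensional PCA projection from kernels of shape (n, p, p).

    Returns (mean, basis); basis has shape (t, p*p) with orthonormal rows.
    Mean-centering is an assumption of this sketch.
    """
    flat = kernels.reshape(len(kernels), -1)
    mean = flat.mean(axis=0)
    # Right singular vectors of the centered data are the principal directions.
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, vt[:t]

def degradation_map(kernel, sigma, mean, basis, height, width):
    """Stretch (PCA-projected kernel, noise level) into an H x W x (t+1) tensor."""
    code = basis @ (kernel.reshape(-1) - mean)   # shape (t,)
    v = np.concatenate([code, [sigma]])          # shape (t+1,)
    # Every spatial position receives the same vector v.
    return np.broadcast_to(v, (height, width, v.size)).copy()

# Illustrative values only (kernel size, widths, t, and sigma are assumptions).
kernels = np.stack([gaussian_kernel(15, w) for w in np.linspace(0.2, 3.0, 50)])
mean, basis = fit_pca_basis(kernels, t=15)
dmap = degradation_map(kernels[10], 15 / 255.0, mean, basis, height=32, width=48)
print(dmap.shape)
```

Each of the t+1 channels of `dmap` is a constant plane, so the map has the same spatial size as the LR image and can be stacked with it channel‑wise.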
We concatenate the Degradation Map with the LR image and feed them into a CNN. Using the efficient ESPCN architecture with added Batch Normalization, we build the SRMD network (12 convolutional layers, 128 channels each).
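The two data‑layout operations around the convolutional layers can be sketched in NumPy as below: the channel‑wise concatenation at the input, and an ESPCN‑style sub‑pixel rearrangement at the output. The convolutional layers themselves are omitted. The helper names, the channel‑last layout, and the channel ordering inside `pixel_shuffle` (chosen to match common deep learning implementations) are assumptions of this sketch rather than details confirmed by the text.

```python
import numpy as np

def concat_input(lr_image, dmap):
    """Concatenate an H x W x C LR image with an H x W x (t+1) degradation map."""
    if lr_image.shape[:2] != dmap.shape[:2]:
        raise ValueError("LR image and degradation map must share spatial size")
    return np.concatenate([lr_image, dmap], axis=-1)

def pixel_shuffle(features, scale):
    """Rearrange H x W x (C*scale^2) features into (H*scale) x (W*scale) x C."""
    h, w, c = features.shape
    if c % (scale * scale) != 0:
        raise ValueError("channel count must be divisible by scale**2")
    out_c = c // (scale * scale)
    x = features.reshape(h, w, out_c, scale, scale)
    x = x.transpose(0, 3, 1, 4, 2)               # (h, scale, w, scale, out_c)
    return x.reshape(h * scale, w * scale, out_c)

# Illustrative shapes: a 3-channel LR image and a 16-channel map (t = 15 assumed).
lr = np.zeros((8, 8, 3))
dmap = np.ones((8, 8, 16))
net_input = concat_input(lr, dmap)               # shape (8, 8, 19)

# A network output with 3 * 2**2 channels becomes a 2x larger 3-channel image.
hr = pixel_shuffle(np.zeros((8, 8, 12)), scale=2)  # shape (16, 16, 3)
```

Because all convolutions run at LR resolution and the up‑scaling happens only in the final rearrangement, this layout is consistent with the efficiency the article attributes to the ESPCN design.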
3. Experiments
During training, SRMD is exposed to isotropic and anisotropic Gaussian blur kernels, Gaussian noise levels in [0, 75], and bicubic down‑sampling. The model can also be extended to other down‑sampling operators and degradation models.
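A training pair under such degradations might be synthesized roughly as follows. This is a hedged sketch: the kernel parameterization (two covariance eigenvalues plus a rotation angle), the function names, and the sample values are assumptions, and direct decimation (keeping every s‑th pixel) is swapped in as a simple stand‑in for bicubic down‑sampling, which NumPy does not provide.

```python
import numpy as np

def anisotropic_gaussian_kernel(size, lambda1, lambda2, theta):
    """Gaussian kernel whose covariance has eigenvalues lambda1, lambda2, rotated by theta."""
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    cov = rot @ np.diag([lambda1, lambda2]) @ rot.T
    inv_cov = np.linalg.inv(cov)
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    pts = np.stack([xx, yy], axis=-1)            # (size, size, 2)
    expo = np.einsum("...i,ij,...j->...", pts, inv_cov, pts)
    k = np.exp(-0.5 * expo)
    return k / k.sum()

def blur(image, kernel):
    """Filter a 2-D image with an odd-sized kernel, using edge padding.

    This is a correlation; for point-symmetric kernels such as Gaussians
    it coincides with convolution.
    """
    p = kernel.shape[0] // 2
    padded = np.pad(image, p, mode="edge")
    out = np.zeros_like(image, dtype=np.float64)
    for i in range(kernel.shape[0]):
        for j in range(kernel.shape[1]):
            out += kernel[i, j] * padded[i:i + image.shape[0], j:j + image.shape[1]]
    return out

def degrade(hr, kernel, scale, sigma, rng):
    """Blur, down-sample by direct decimation (stand-in for bicubic), add Gaussian noise."""
    lr = blur(hr, kernel)[::scale, ::scale]
    return lr + rng.normal(0.0, sigma, size=lr.shape)

rng = np.random.default_rng(0)
hr = rng.random((32, 32))                        # placeholder grayscale image in [0, 1]
k = anisotropic_gaussian_kernel(15, lambda1=4.0, lambda2=1.0, theta=np.pi / 4)
# Noise level 25 on the 0-255 scale, rescaled for an image in [0, 1].
lr = degrade(hr, k, scale=2, sigma=25 / 255.0, rng=rng)
print(lr.shape)
```

Setting `lambda1 == lambda2` yields an isotropic kernel, so one function covers both kernel families named above.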
In testing, SRMD achieves competitive PSNR/SSIM scores under bicubic degradation and demonstrates a speed advantage: it processes a 512×512 LR image in 0.084 s on a Titan Xp GPU, about half the time of VDSR.
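For reference, PSNR is defined as \(10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})\), where MAX is the peak pixel value. A minimal sketch is given below; the function name and the default peak of 1.0 are assumptions of this example, and the paper's exact evaluation protocol (color channel, border cropping) is not reproduced here.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")                      # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

# A uniform error of 0.1 on a [0, 1] image gives an MSE of 0.01.
value = psnr(np.zeros((4, 4)), np.full((4, 4), 0.1))
```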
Results on various degradation types show that SRMD maintains strong performance. Visual examples illustrate that SRMD can handle spatially non‑uniform degradation maps and produce superior HR reconstructions on real images compared to other methods.
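Since the Degradation Map has one vector per pixel, a spatially non‑uniform degradation can be described by letting that vector vary across the image. The sketch below is illustrative only: the helper name, the three‑dimensional kernel codes, and the left/right split are hypothetical values chosen to keep the example small.

```python
import numpy as np

def spatially_variant_map(kernel_codes, sigma_map):
    """Stack per-pixel kernel codes (H x W x t) and noise levels (H x W) into H x W x (t+1)."""
    if kernel_codes.shape[:2] != sigma_map.shape:
        raise ValueError("kernel codes and noise map must share spatial size")
    return np.concatenate([kernel_codes, sigma_map[..., None]], axis=-1)

h, w, t = 4, 6, 3
code_left = np.array([0.9, 0.1, 0.0])            # hypothetical PCA code, left half
code_right = np.array([0.3, 0.5, 0.2])           # hypothetical PCA code, right half

codes = np.empty((h, w, t))
codes[:, : w // 2] = code_left
codes[:, w // 2:] = code_right

# Noise level increasing linearly from left to right.
sigma_map = np.tile(np.linspace(0.0, 0.1, w), (h, 1))

varying_dmap = spatially_variant_map(codes, sigma_map)
print(varying_dmap.shape)
```

The resulting tensor has the same shape as a uniform Degradation Map, so no change to the network input is needed.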
4. Conclusion
The main contributions are:
A simple, effective, and extensible SR model that handles bicubic as well as multiple and spatially non‑uniform degradations.
A dimension‑stretching strategy that allows convolutional networks to process inputs of differing dimensions, applicable to other tasks.
Demonstration that a model trained on synthetic data can robustly restore real‑world images with complex degradations.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
