How to Implement SRCNN for Image Super‑Resolution in PyTorch
This article walks through a complete PyTorch implementation of the SRCNN model for image super‑resolution, covering dataset preparation, patch extraction, model architecture, training on a GTX 770 GPU for 2500 epochs, PSNR evaluation, and visual comparisons with bicubic up‑sampling.
Overview
The guide demonstrates a full PyTorch implementation of the SRCNN (Super‑Resolution Convolutional Neural Network) model for image super‑resolution tasks, including data preparation, model definition, training, validation, and result visualization.
Model Architecture and Differences
Compared with the original paper, this implementation adds padding so that the output image has the same spatial dimensions as the input, simplifying comparison. The original Caffe model has 8,032 parameters, whereas the PyTorch version contains just over 20,000. The difference comes not from the padding, which adds no learnable weights, but mainly from operating on all three RGB channels rather than only the Y (luminance) channel used in the paper.
Optimizer Choice
The original SRCNN uses layer‑wise learning rates with SGD. For simplicity, the PyTorch version employs a single‑rate Adam optimizer for the entire network.
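Putting the two choices above together, the network and optimizer setup might look like the following sketch. The 9-1-5 kernel sizes and 64/32 filter counts are taken from the original paper, and the learning rate of 1e-3 is an assumption; the source only confirms same-size padding, RGB input, and a single-rate Adam optimizer.

```python
import torch
import torch.nn as nn

class SRCNN(nn.Module):
    """Three-layer SRCNN; 'same'-style padding keeps output size equal to input size."""
    def __init__(self):
        super().__init__()
        # 9-1-5 kernel sizes from the paper; 3 in/out channels for RGB images.
        self.conv1 = nn.Conv2d(3, 64, kernel_size=9, padding=4)
        self.conv2 = nn.Conv2d(64, 32, kernel_size=1, padding=0)
        self.conv3 = nn.Conv2d(32, 3, kernel_size=5, padding=2)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.conv1(x))
        x = self.relu(self.conv2(x))
        return self.conv3(x)

model = SRCNN()
# One learning rate for the whole network, instead of the paper's
# layer-wise SGD rates (the 1e-3 value is an assumption).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```

With these choices the model has 20,099 trainable parameters (consistent with the "over 20,000" above), and a 32×32 input produces a 32×32 output.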
Dataset Preparation
Three datasets are used: T91 (training), Set5 and Set14 (validation and final testing). Image patches of size 32×32 are extracted from T91 with a stride of 14, yielding 22,227 patches. The patchify library creates the patches, and OpenCV saves both high‑resolution and low‑resolution (down‑sampled by 0.5 and up‑scaled with bicubic) versions.
Patch Extraction Code
```python
from PIL import Image
from tqdm import tqdm
import patchify
import numpy as np
import glob, os, cv2

STRIDE = 14
SIZE = 32

def create_patches(input_paths, out_hr_path, out_lr_path):
    os.makedirs(out_hr_path, exist_ok=True)
    os.makedirs(out_lr_path, exist_ok=True)
    all_paths = []
    for input_path in input_paths:
        all_paths.extend(glob.glob(f"{input_path}/*"))
    print(f"Creating patches for {len(all_paths)} images")
    for image_path in tqdm(all_paths, total=len(all_paths)):
        image = Image.open(image_path).convert("RGB")
        image_name = os.path.splitext(os.path.basename(image_path))[0]
        # Extract overlapping SIZE x SIZE RGB patches with the given stride.
        patches = patchify.patchify(np.array(image), (SIZE, SIZE, 3), STRIDE)
        for i in range(patches.shape[0]):
            for j in range(patches.shape[1]):
                patch = patches[i, j, 0]
                patch = cv2.cvtColor(patch, cv2.COLOR_RGB2BGR)
                cv2.imwrite(f"{out_hr_path}/{image_name}_{i}_{j}.png", patch)
                # Down-sample the patch by 0.5, then up-scale it back with
                # bicubic interpolation to produce the blurred low-res input.
                low_res = cv2.resize(patch, (SIZE // 2, SIZE // 2),
                                     interpolation=cv2.INTER_CUBIC)
                high_res_up = cv2.resize(low_res, (SIZE, SIZE),
                                         interpolation=cv2.INTER_CUBIC)
                cv2.imwrite(f"{out_lr_path}/{image_name}_{i}_{j}.png", high_res_up)
```

Note that the resize operates on the patch dimensions (SIZE), not the dimensions of the whole source image.

Training Pipeline
Training runs on a GTX 770 GPU for roughly three days (≈2500 epochs, batch size 128). The training script logs loss and PSNR for both training and validation sets, saving model checkpoints every 100 epochs and the model state after each epoch. PSNR is computed with the following function:
```python
import math
import numpy as np

def psnr(label, outputs, max_val=1.0):
    """Peak signal-to-noise ratio (dB) between output and ground-truth batches."""
    label = label.cpu().detach().numpy()
    outputs = outputs.cpu().detach().numpy()
    diff = outputs - label
    rmse = math.sqrt(np.mean(diff ** 2))
    if rmse == 0:
        # Identical images: return a large finite value instead of infinity.
        return 100
    return 20 * math.log10(max_val / rmse)
```

After 2500 epochs the final training PSNR reaches 29.85 dB and the validation PSNR 29.61 dB. Although the validation set is a combined Set5 + Set14 collection, these values are slightly lower than those reported in the original paper.
Result Visualization
Loss and PSNR curves are saved as PNG files. Sample reconstructed images from the final epoch are compared against bicubic up‑sampling and the ground‑truth high‑resolution images. Across various scenes (comic, butterfly wing, zebra), SRCNN consistently produces sharper details than bicubic, though improvement varies with image content.
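Saving the per-epoch curves as PNGs might look like the following sketch with matplotlib; the helper name is an assumption, while the output filenames match the `outputs` directory listed below.

```python
import matplotlib
matplotlib.use("Agg")  # render to file without a display
import matplotlib.pyplot as plt

def save_plot(values, ylabel, out_path):
    """Save a per-epoch curve (e.g. loss or PSNR) as a PNG file."""
    plt.figure(figsize=(8, 5))
    plt.plot(values, color="orange", label=ylabel)
    plt.xlabel("Epochs")
    plt.ylabel(ylabel)
    plt.legend()
    plt.savefig(out_path)
    plt.close()

# e.g. save_plot(train_loss, "Loss", "outputs/loss.png")
#      save_plot(train_psnr, "PSNR (dB)", "outputs/psnr.png")
```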
Code Organization
The project follows a clear directory layout:
```
├── input
│   ├── Set14
│   ├── Set5
│   ├── T91
│   ├── t91_hr_patches
│   ├── t91_lr_patches
│   ├── test_bicubic_rgb_2x
│   └── test_hr
├── outputs
│   ├── valid_results
│   ├── loss.png
│   ├── model_ckpt.pth
│   ├── model.pth
│   └── psnr.png
├── src
│   ├── bicubic.py
│   ├── datasets.py
│   ├── patchify_image.py
│   ├── srcnn.py
│   ├── train.py
│   └── utils.py
└── NOTES.md
```

Key scripts:
- utils.py – PSNR calculation, plot saving, and model checkpoint utilities.
- patchify_image.py – Generates high- and low-resolution patches.
- bicubic.py – Prepares validation images by down-sampling with bicubic interpolation.
- datasets.py – Defines SRCNNDataset and data loader helpers.
- srcnn.py – Implements the three-layer convolutional network.
- train.py – Orchestrates training, validation, logging, and checkpointing.
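The checkpoint utilities in utils.py might be sketched as follows. The function names and the checkpoint dictionary layout are assumptions; only the output filenames come from the directory listing above.

```python
import torch

def save_model_state(model, path="outputs/model.pth"):
    # Persist only the weights; lighter and safer than pickling the module.
    torch.save(model.state_dict(), path)

def save_checkpoint(epoch, model, optimizer, path="outputs/model_ckpt.pth"):
    # Full checkpoint so training can resume from the saved epoch.
    torch.save({
        "epoch": epoch,
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
    }, path)

def load_checkpoint(model, optimizer, path="outputs/model_ckpt.pth"):
    # Restore weights and optimizer state in place; return the saved epoch.
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model_state_dict"])
    optimizer.load_state_dict(ckpt["optimizer_state_dict"])
    return ckpt["epoch"]
```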
Conclusion
The article provides a reproducible end‑to‑end pipeline for training SRCNN on standard super‑resolution benchmarks, demonstrates practical choices (padding, Adam optimizer), and presents quantitative (PSNR) and qualitative (visual) evidence that the learned model outperforms simple bicubic up‑sampling.