
Understanding Convolution, Convolutional Neural Networks, and Their Implementation in Image Processing

This article explains the mathematical concept of 2‑D convolution, demonstrates its use for image filtering with examples such as blurring and Sobel edge detection, introduces artificial neural networks and back‑propagation, and details the design, training, and performance of convolutional neural networks for tasks like Sobel filter learning and MNIST digit recognition, including full Python code examples.

360 Smart Cloud

Convolution in two dimensions combines two functions f(x,y) and g(x,y) to produce a new function c(x,y) by integrating the product of f(s,t) and g(x‑s,y‑t) over all s and t.
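In symbols, that description corresponds to the standard 2‑D convolution integral:

```latex
c(x, y) = (f * g)(x, y)
        = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(s, t)\, g(x - s,\ y - t)\, \mathrm{d}s\, \mathrm{d}t
```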

When applied to digital images, the continuous integral becomes a discrete sum, where a kernel F (e.g., a 3×3 matrix) slides over a grayscale image G, multiplying overlapping values and summing them to obtain the filtered image C.
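The discrete sum can be sketched in a few lines of NumPy. This is an illustrative implementation, not the article's code: the function name `convolve2d` and the choice of a "valid" output region (no padding) are assumptions, and the kernel is flipped because true convolution, unlike the cross-correlation most CNN libraries compute, reverses the kernel.

```python
import numpy as np

def convolve2d(image, kernel):
    """Discrete 2-D convolution over the 'valid' region: slide the flipped
    kernel across the image, multiply overlapping values, and sum them."""
    kernel = np.flipud(np.fliplr(kernel))  # true convolution flips the kernel
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out
```

For a symmetric kernel, such as the averaging kernel below, the flip has no effect and the result matches a plain sliding-window sum.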

Examples illustrate how a 3×3 averaging kernel produces a mild blur, while a Sobel operator detects edges; the effect of kernel size and normalization on output pixel ranges is discussed.
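Both effects can be demonstrated on a tiny synthetic image. This sketch uses SciPy's `convolve2d` rather than the article's own code, and the 8×8 two-tone image is an illustrative stand-in for the article's example pictures:

```python
import numpy as np
from scipy.signal import convolve2d

# Synthetic 8x8 grayscale image: dark left half, bright right half (a vertical edge).
img = np.zeros((8, 8))
img[:, 4:] = 255.0

# 3x3 averaging kernel; dividing by 9 normalizes the sum so the blurred
# output stays within the input's 0-255 range.
blur = convolve2d(img, np.ones((3, 3)) / 9.0, mode="same")

# Horizontal Sobel kernel: responds strongly to vertical intensity changes.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
edges = convolve2d(img, sobel_x, mode="same")
# In flat regions the Sobel response is zero; near the edge it is large, so
# unnormalized edge outputs can fall outside 0-255 and may need rescaling.
```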

The article then introduces artificial neural networks (NNs), describing the historical development of perceptrons, the back‑propagation algorithm, and the mathematical model of a neuron: a weighted sum plus bias passed through an activation function.
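That neuron model fits in a few lines. The sigmoid activation here is one common choice for illustration; the article's exact activation function is not specified in this summary:

```python
import numpy as np

def neuron(x, w, b):
    """One artificial neuron: weighted sum of the inputs plus a bias,
    passed through a sigmoid activation."""
    z = np.dot(w, x) + b             # weighted sum plus bias
    return 1.0 / (1.0 + np.exp(-z))  # sigmoid squashes the result into (0, 1)
```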

It shows how a convolutional layer can be viewed as a NN layer with shared weights (the kernel) and no bias, and explains the architecture of a simple CNN that learns a Sobel filter by training on a single input‑output image pair.
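A minimal sketch of that idea in plain NumPy (the article itself uses Keras): a "layer" of 9 shared weights and no bias is trained by gradient descent on a single input/output pair. The random 16×16 image stands in for the Lena image, and the learning rate and step count are illustrative choices.

```python
import numpy as np

def conv_valid(img, k):
    """Cross-correlation over the 'valid' region -- what a CNN layer computes."""
    kh, kw = k.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(img[y:y + kh, x:x + kw] * k)
    return out

rng = np.random.default_rng(0)
img = rng.random((16, 16))                        # stands in for the input image
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
target = conv_valid(img, sobel_x)                 # the "ground truth" output image

# The convolutional "layer": 9 shared weights, no bias, trained by gradient
# descent on the mean-squared error between its output and the target.
kernel = rng.normal(scale=0.1, size=(3, 3))
lr = 0.5
for _ in range(2000):
    err = conv_valid(img, kernel) - target
    grad = np.zeros_like(kernel)
    for i in range(3):
        for j in range(3):
            # dMSE/dk[i,j]: average error times the input pixel each weight saw
            grad[i, j] = np.mean(err * img[i:i + err.shape[0], j:j + err.shape[1]])
    kernel -= lr * grad
# After training, `kernel` closely approximates sobel_x.
```

Because the weights are shared across every output position, the single 3×3 kernel is fully determined by one image pair, which is why this toy training setup works at all.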

Further, a more complex CNN for handwritten digit recognition on the MNIST dataset is presented: the network consists of reshaping, two convolution‑pooling blocks, a flatten layer, two fully‑connected layers of 1000 units each, and a linear output layer for ten classes.

Training details include using mean‑squared error loss, stochastic gradient descent with momentum, learning‑rate decay, and ten epochs; the resulting model achieves about 96% accuracy, with per‑class precision, recall, and F1‑score reported.

Finally, the full Python implementation using Keras is provided, including data loading, model definition, training loop, and evaluation code, as well as a separate script that trains a single‑filter CNN to reproduce the Sobel operator on the Lena image.

import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense, Flatten, Reshape, AveragePooling2D, Conv2D
from keras.utils import to_categorical, plot_model
from keras.callbacks import Callback
from keras.optimizers import SGD
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, accuracy_score, confusion_matrix

class LossHistory(Callback):
    """Records the loss and accuracy of every training batch."""
    def __init__(self):
        super().__init__()
        self.losses = []
        self.accuracies = []

    def on_batch_end(self, batch, logs=None):
        logs = logs or {}
        self.losses.append(logs.get('loss'))
        self.accuracies.append(logs.get('accuracy'))  # the key was 'acc' in older Keras

history = LossHistory()

# Load the MNIST training CSV and draw a 10,000-image sample without replacement.
data = pd.read_csv("train.csv")
data = data.sample(n=10000, replace=False)
digits = data[data.columns.values[1:]].values
labels = data.label.values

train_digits, test_digits, train_labels, test_labels = train_test_split(digits, labels)
train_labels_one_hot = to_categorical(train_labels)
test_labels_one_hot = to_categorical(test_labels)

model = Sequential()
# Flat 784-pixel rows are reshaped to 1x28x28 (channels-first) images.
model.add(Reshape(target_shape=(1, 28, 28), input_shape=(784,)))
model.add(Conv2D(32, (3, 3), padding="same", data_format="channels_first",
                 use_bias=False, kernel_initializer="uniform"))
model.add(AveragePooling2D(pool_size=(2, 2), data_format="channels_first"))
model.add(Conv2D(64, (3, 3), padding="same", data_format="channels_first",
                 use_bias=False, kernel_initializer="uniform"))
model.add(AveragePooling2D(pool_size=(2, 2), data_format="channels_first"))
model.add(Flatten())
model.add(Dense(1000, activation="sigmoid"))
model.add(Dense(1000, activation="sigmoid"))
model.add(Dense(10, activation="linear"))  # one linear output per digit class

with open("digits_model.json", "w") as f:
    f.write(model.to_json())
plot_model(model, to_file="digits_model.png", show_shapes=True)

opt = SGD(learning_rate=0.01, momentum=0.9, nesterov=True)  # the article also used a 1e-6 learning-rate decay
model.compile(loss="mse", optimizer=opt, metrics=["accuracy"])
model.fit(train_digits, train_labels_one_hot, batch_size=32, epochs=10, callbacks=[history])
model.save_weights("digits_model.weights.h5")

# predict_classes() was removed from newer Keras; take the argmax instead.
predict_labels = np.argmax(model.predict(test_digits), axis=1)
print(classification_report(test_labels, predict_labels))
print(accuracy_score(test_labels, predict_labels))
print(confusion_matrix(test_labels, predict_labels))
Tags: CNN, Python, deep learning, image-processing, Neural Networks, Convolution
Written by 360 Smart Cloud

Official service account of 360 Smart Cloud, dedicated to building a high-quality, secure, highly available, convenient, and stable one‑stop cloud service platform.