Visualizing Convolutional Neural Networks: Methods and Practical Examples
This article explains why visualizing CNN models is crucial for understanding and debugging deep learning systems, outlines three main visualization approaches—basic architecture, activation‑based, and gradient‑based methods—and provides step‑by‑step Keras code examples, including model summary, filter visualization, occlusion mapping, saliency maps, and class activation maps.
One of the most debated topics in deep learning is how to interpret and understand a trained model, especially in high‑risk domains such as medical diagnosis where a model’s "black‑box" nature raises trust issues. The article uses a cancer detection CNN as a motivating example to illustrate the need for model transparency.
It then emphasizes the importance of visualizing CNNs, listing four key reasons for practitioners: understanding model mechanics, tuning hyper‑parameters, identifying and fixing failure modes, and communicating decisions to stakeholders.
An illustrative case study describes a historical experiment where a neural network was trained to detect camouflaged tanks, but it actually learned to distinguish sunny from overcast images, highlighting the necessity of visualization to uncover such hidden biases.
The article categorizes CNN visualization techniques into three groups:
Basic methods that display the overall architecture.
Activation‑based methods that decode the behavior of individual neurons or groups of neurons.
Gradient‑based methods that use the gradients from a forward and backward pass through a trained network.
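Before turning to the library-based examples, the gradient idea can be sketched without any deep learning framework: for a toy linear classifier, the gradient of a class score with respect to the input is exactly that class's weight row, so per-pixel saliency is just the absolute weight. All names below are illustrative, not from the article's code.

```python
import numpy as np

def saliency_linear(W, b, x, class_idx):
    """Saliency for a linear model where score_c = W[c] @ x + b[c].

    The gradient d(score_c)/dx is simply W[c]; its absolute value
    ranks how strongly each input feature influences the class score.
    """
    grad = W[class_idx]          # exact gradient for a linear score
    return np.abs(grad)

# Toy example: 2 classes, 4 "pixels"
W = np.array([[0.5, -2.0, 0.0, 1.0],
              [1.0,  0.3, 0.2, -0.1]])
b = np.zeros(2)
x = np.ones(4)

sal = saliency_linear(W, b, x, class_idx=0)
print(sal)  # feature 1 dominates the class-0 score
```

For a deep network the gradient is no longer constant in the input, so it must be computed by backpropagation at the specific image of interest, which is what the saliency examples later in the article do.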
All examples are implemented with Keras and the keras‑vis library. Before running the code, the required packages must be installed.
1. Basic Methods
1.1 Plotting the model architecture can be done simply by printing the model summary:
model.summary()

For a more visual representation, the plot_model utility in keras.utils.vis_utils can render the network as a diagram.
1.2 Visualizing filters shows what each convolutional filter has learned. Example code:
import matplotlib.pyplot as plt

# Weights of the first convolutional layer: shape (h, w, in_channels, n_filters)
top_layer = model.layers[0]
plt.imshow(top_layer.get_weights()[0][:, :, :, 0].squeeze(), cmap='gray')

Low‑level filters often act as edge detectors, while deeper layers capture higher‑level concepts such as objects or faces.
2. Activation‑Based Methods
2.1 Maximal activation visualizes the input pattern that maximally activates a specific neuron. Example:
from vis.visualization import visualize_activation
from vis.utils import utils
from keras import activations

# Swap the softmax on the output layer for a linear activation,
# which gives better-behaved gradients for the optimization
layer_idx = utils.find_layer_idx(model, 'preds')
model.layers[layer_idx].activation = activations.linear
model = utils.apply_modifications(model)

# Synthesize the input that maximally activates output neuron 0
filter_idx = 0
img = visualize_activation(model, layer_idx, filter_indices=filter_idx)
plt.imshow(img[..., 0])

Iterating over all output classes reveals what each class neuron responds to.
2.2 Occlusion mapping systematically masks parts of the input image with a gray square and observes the change in class probability, indicating which regions the model relies on.
def iter_occlusion(image, size=8):
    # Gray occlusion patch with a matching gray center block
    occlusion = np.full((size * 5, size * 5, 1), [0.5], np.float32)
    occlusion_center = np.full((size, size, 1), [0.5], np.float32)
    occlusion_padding = size * 2

    # Pad the image so the patch can slide over the borders as well
    image_padded = np.pad(image, (
        (occlusion_padding, occlusion_padding),
        (occlusion_padding, occlusion_padding),
        (0, 0)
    ), 'constant', constant_values=0.0)

    for y in range(occlusion_padding, image.shape[0] + occlusion_padding, size):
        for x in range(occlusion_padding, image.shape[1] + occlusion_padding, size):
            tmp = image_padded.copy()
            tmp[y - occlusion_padding:y + occlusion_center.shape[0] + occlusion_padding,
                x - occlusion_padding:x + occlusion_center.shape[1] + occlusion_padding] = occlusion
            tmp[y:y + occlusion_center.shape[0], x:x + occlusion_center.shape[1]] = occlusion_center
            yield x - occlusion_padding, y - occlusion_padding, \
                tmp[occlusion_padding:-occlusion_padding, occlusion_padding:-occlusion_padding]

# Example loop over occlusions: `data`, `occlusion_size` and `correct_class`
# come from the user's own dataset and model
heatmap = np.zeros((28, 28), np.float32)
class_pixels = np.zeros((28, 28), np.int16)
for n, (x, y, img_float) in enumerate(iter_occlusion(data, size=occlusion_size)):
    X = img_float.reshape(1, 28, 28, 1)
    out = model.predict(X)
    heatmap[y:y + occlusion_size, x:x + occlusion_size] = out[0][correct_class]
    class_pixels[y:y + occlusion_size, x:x + occlusion_size] = np.argmax(out)

Heatmaps generated this way highlight the image regions most influential for the predicted class.
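Before plotting, the probability heatmap is commonly min-max scaled to [0, 1] so the colormap uses its full range. A minimal numpy sketch (the array name mirrors the loop above and is illustrative):

```python
import numpy as np

def normalize_heatmap(heatmap):
    """Min-max scale a heatmap to [0, 1]; a constant map becomes all zeros."""
    lo, hi = heatmap.min(), heatmap.max()
    if hi - lo < 1e-12:
        return np.zeros_like(heatmap)
    return (heatmap - lo) / (hi - lo)

# Toy heatmap: probability drops where occlusion hurt the prediction
heatmap = np.array([[0.9, 0.9],
                    [0.2, 0.9]])
norm = normalize_heatmap(heatmap)
print(norm)
# plt.imshow(norm, cmap='jet') would then display it
```

After normalization, the most damaged region (lowest class probability under occlusion) maps to 0 and the least affected regions map to 1, which makes heatmaps from different images directly comparable.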
3. Gradient‑Based Methods
3.1 Saliency maps compute the gradient of the output class with respect to the input image, visualizing pixels that most affect the prediction.
from vis.visualization import visualize_saliency

layer_idx = utils.find_layer_idx(model, 'preds')
model.layers[layer_idx].activation = activations.linear
model = utils.apply_modifications(model)

# Gradient of the class score with respect to the input pixels
grads = visualize_saliency(model, layer_idx, filter_indices=class_idx, seed_input=val_x[idx])
plt.imshow(grads, cmap='jet')

3.2 Class Activation Maps (CAM) use the activations of the last convolutional layer instead of input gradients, preserving spatial information.
from vis.visualization import visualize_cam

for class_idx in np.arange(10):
    # Pick the first validation sample belonging to this class
    idx = np.where(val_y[:, class_idx] == 1.)[0][0]
    f, ax = plt.subplots(1, 4)
    ax[0].imshow(val_x[idx][..., 0])
    # Compare vanilla, guided, and rectified backprop modifiers
    for i, modifier in enumerate([None, 'guided', 'relu']):
        grads = visualize_cam(model, layer_idx, filter_indices=class_idx,
                              seed_input=val_x[idx], backprop_modifier=modifier)
        mod_name = 'vanilla' if modifier is None else modifier
        ax[i + 1].set_title(mod_name)
        ax[i + 1].imshow(grads, cmap='jet')

The article concludes that visualizing CNNs provides valuable insights for model debugging, performance improvement, and broader applications across domains.