Building and Training a Fully Connected Neural Network for Fashion-MNIST Classification with PyTorch
This tutorial demonstrates how to download the Fashion‑MNIST dataset, build a four‑layer fully connected neural network with PyTorch, and train it using loss functions, Adam optimizer, learning‑rate strategies, and Dropout to achieve high‑accuracy multi‑class image classification.
This article guides readers through the complete process of creating a multi‑class image classifier using the Fashion‑MNIST dataset and PyTorch. It starts by downloading the dataset directly via code, applying tensor conversion and normalization to prepare the images for training.
Next, a four‑layer fully connected network is defined, with input size 784 (28×28 pixels) and hidden layers of 256, 128, and 64 neurons, ending with a 10‑class output layer. The model class is implemented in PyTorch as follows:
import torch
from torch import nn, optim
import torch.nn.functional as F

class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 64)
        self.fc4 = nn.Linear(64, 10)
        self.dropout = nn.Dropout(p=0.2)  # randomly zeroes 20% of activations during training

    def forward(self, x):
        x = x.view(x.shape[0], -1)  # flatten each 28x28 image to a 784-vector
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))
        x = F.log_softmax(self.fc4(x), dim=1)  # log-probabilities for NLLLoss
        return x

The training loop runs for 15 epochs, using the negative log-likelihood loss (NLLLoss) and the Adam optimizer with a learning rate of 0.003. During each epoch, the code accumulates the training loss, evaluates on the test set without gradient tracking, records both training and validation losses, and prints progress.
# Instantiate model, loss function, and optimizer
model = Classifier()
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=0.003)

epochs = 15
train_losses, test_losses = [], []

for e in range(epochs):
    running_loss = 0.0
    for images, labels in trainloader:
        optimizer.zero_grad()
        log_ps = model(images)
        loss = criterion(log_ps, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()

    # Validation pass: no gradient tracking, dropout disabled via eval()
    test_loss, accuracy = 0.0, 0.0
    with torch.no_grad():
        model.eval()
        for images, labels in testloader:
            log_ps = model(images)
            test_loss += criterion(log_ps, labels).item()
            ps = torch.exp(log_ps)
            top_p, top_class = ps.topk(1, dim=1)
            equals = top_class == labels.view(*top_class.shape)
            accuracy += torch.mean(equals.type(torch.FloatTensor)).item()
    model.train()

    train_losses.append(running_loss / len(trainloader))
    test_losses.append(test_loss / len(testloader))
    print(f"Epoch {e+1}/{epochs}.. "
          f"Train loss: {train_losses[-1]:.3f}.. "
          f"Test loss: {test_losses[-1]:.3f}.. "
          f"Test accuracy: {accuracy / len(testloader):.3f}")

The article also explains key concepts such as loss functions (MSE, cross-entropy, log loss), gradient-descent variants (batch, stochastic, mini-batch), learning-rate strategies (fixed, decay, momentum, and adaptive methods like Adam), and the back-propagation algorithm.
To combat over‑fitting, Dropout is introduced and integrated into the network architecture, randomly deactivating 20% of neurons during training. After applying Dropout, training and validation losses both decrease steadily, and the model reaches an accuracy close to 99% on the test set.
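The train/eval distinction is what makes Dropout safe at test time, which is why the validation pass calls `model.eval()`. A small sketch of the behavior on a toy tensor:

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the dropout mask is reproducible
dropout = nn.Dropout(p=0.2)
x = torch.ones(1000)

dropout.train()        # training mode: ~20% of values are zeroed,
y_train = dropout(x)   # survivors are scaled by 1/(1-p)
frac_zero = (y_train == 0).float().mean().item()

dropout.eval()         # eval mode: identity, no units dropped
y_eval = dropout(x)

print(round(frac_zero, 2))     # roughly 0.2
print(torch.equal(y_eval, x))  # True
```

The 1/(1-p) rescaling during training keeps the expected activation magnitude the same in both modes, so no extra correction is needed at inference.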
Finally, the article visualizes loss curves using Matplotlib and encourages readers to explore further deep‑learning topics.
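The loss-curve plot takes only a few lines of Matplotlib. In the tutorial, `train_losses` and `test_losses` come from the training loop; the values below are placeholder numbers so the sketch runs standalone:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, safe for scripts
import matplotlib.pyplot as plt

# Placeholder values; in the tutorial these are recorded per epoch
train_losses = [0.60, 0.45, 0.40, 0.37, 0.35]
test_losses = [0.48, 0.42, 0.40, 0.39, 0.38]

plt.plot(train_losses, label='Training loss')
plt.plot(test_losses, label='Validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend(frameon=False)
plt.savefig('loss_curves.png')
```

A widening gap between the two curves is the visual signature of over-fitting that Dropout is meant to suppress.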