TensorFlow MNIST Tutorial: Environment Setup, Softmax Regression, and CNN Implementation
This article, authored by Chen Yidong, a Tencent mobile client development engineer, provides a comprehensive beginner-friendly guide to machine learning using TensorFlow. It covers the entire workflow from environment preparation to running MNIST demo programs, including both softmax regression and deep convolutional neural network (CNN) models.
Content Outline
Environment Setup
Understanding TensorFlow Execution Mechanism
MNIST Softmax Linear Regression
MNIST Deep CNN
Tool Utilities
CPU, GPU, and Multi‑GPU Usage
1. Environment Setup (Windows)
Install Anaconda (e.g., Anaconda3 4.2) to manage Python packages and isolate environments.
Create an isolated environment for TensorFlow via the Anaconda Prompt.
Install TensorFlow:
pip install tensorflow # install via package manager
pip install tensorflow-cpu-1.2.1-cp35-cp35m-win_amd64.whl # install CPU version from a local .whl file

For GPU support, install compatible CUDA and cuDNN versions (e.g., CUDA 8.0 with cuDNN 6) and add the cuDNN bin directory to the system PATH.
2. Understanding TensorFlow Execution Mechanism
TensorFlow represents the output of every operation as a symbolic Tensor, a handle rather than a value. A computation is first defined as a dataflow graph, which is then executed inside a Session to produce concrete values.
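TensorFlow itself is needed to run a real graph; the build-then-run idea can be sketched in plain Python instead (the `Node` class, `constant`, `add`, and `run` below are illustrative stand-ins, not TensorFlow API):

```python
# Build-then-run: describe a computation first, evaluate it later.
# This mimics TensorFlow 1.x, where tf.add(a, b) returns a symbolic
# Tensor and only sess.run(...) produces a concrete value.

class Node:
    """A symbolic handle to an operation's output (like a tf.Tensor)."""
    def __init__(self, op, inputs):
        self.op = op          # function to apply
        self.inputs = inputs  # upstream Nodes

def constant(value):
    return Node(lambda: value, [])

def add(a, b):
    return Node(lambda x, y: x + y, [a, b])

def run(node):
    """Walk the graph and compute a concrete value (like sess.run)."""
    args = [run(i) for i in node.inputs]
    return node.op(*args)

a = constant(2)
b = constant(3)
c = add(a, b)   # c is a description of an addition, not the number 5
print(run(c))   # only now is the addition actually executed -> 5
```

The key point the analogy preserves: `c` carries no value until the whole graph is explicitly run.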
3. MNIST Softmax Linear Regression
The MNIST dataset consists of 28×28 grayscale images of handwritten digits (0‑9) and their corresponding labels.
Softmax regression models the probability of each digit by applying a linear transformation followed by a softmax normalization:
logits = tf.matmul(X, W) + b
probabilities = tf.nn.softmax(logits)

The loss function is the cross-entropy between the predicted probabilities and the true labels.
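The same computation can be sketched in NumPy to make the math concrete (shapes and names mirror the TensorFlow snippet above; the zero-initialized `W` and `b` and the sample labels are illustrative):

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max before exponentiating, for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    # labels are one-hot rows; average negative log-likelihood over the batch.
    return -np.mean(np.sum(labels * np.log(probs + 1e-12), axis=1))

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 784))       # a batch of 4 flattened 28x28 images
W = np.zeros((784, 10))             # weights initialized to zero
b = np.zeros(10)                    # biases

probs = softmax(X @ W + b)          # with zero weights, every class gets 0.1
labels = np.eye(10)[[3, 1, 4, 1]]   # one-hot encodings of the true digits
loss = cross_entropy(probs, labels) # -log(0.1), about 2.303, before training
```

With untrained (all-zero) weights the predicted distribution is uniform, so the cross-entropy starts at -log(1/10); training drives it down by shifting probability mass onto the correct digits.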
4. MNIST Deep Convolutional Neural Network (CNN)
The CNN architecture adds convolution, pooling, dropout, and fully‑connected layers to capture hierarchical features:
Reshape: [batch, 784] → [batch, 28, 28, 1]
Conv2D: learnable filters (e.g., 32 kernels of size 5×5)
Pooling: max‑pooling reduces spatial dimensions
Dropout: randomly disables neurons to prevent over‑fitting
Fully Connected (FC): dense layers ending with a softmax output
Typical data flow:
Input → Conv → Pool → Conv → Pool → FC → FC → Softmax

Training uses the Adam optimizer (or alternatives such as GradientDescentOptimizer, RMSPropOptimizer, etc.) to minimize the cross-entropy loss.
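A minimal NumPy sketch of the 2×2 max-pooling step, which halves the spatial dimensions between convolution stages (a hand-rolled illustration of the operation, not TensorFlow's own tf.nn.max_pool):

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max-pooling with stride 2 on a [batch, height, width, channels] array."""
    n, h, w, c = x.shape
    # Group pixels into non-overlapping 2x2 windows, then take the max of each.
    return x.reshape(n, h // 2, 2, w // 2, 2, c).max(axis=(2, 4))

# Two fake 28x28 single-channel "images", as after the Reshape step.
images = np.arange(2 * 28 * 28 * 1, dtype=float).reshape(2, 28, 28, 1)
pooled = max_pool_2x2(images)
print(pooled.shape)  # (2, 14, 14, 1) -- spatial dims halved, channels unchanged
```

Each Pool stage in the data flow above applies exactly this shape transformation, so two Conv→Pool stages take 28×28 down to 7×7 before the fully connected layers.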
5. Tool Utilities
A custom tool.py wraps common TensorFlow operations for easier reuse. Open‑source alternatives include TensorLayer, Keras, and tflearn.
Example checkpoint handling:
# checkpoint
saver = tf.train.Saver(max_to_keep=3, write_version=2)
model_file = tf.train.latest_checkpoint(FLAGS.log_dir)
if model_file:
    saver.restore(sess, model_file)

# npz format
tools.load_and_assign_npz_dict(name=FLAGS.log_dir + '/model.npz', sess=sess)

TensorBoard visualization can be launched with:
tensorboard --logdir=your-log-path

This starts a web server (by default at http://localhost:6006) that displays the computation graph, loss curves, and accuracy trends.
6. CPU, GPU, and Multi‑GPU
TensorFlow defaults to using all available CPUs (/cpu:0) and the first GPU (/gpu:0). Specific devices can be selected with tf.device or through environment variables (e.g., CUDA_VISIBLE_DEVICES).
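For example, GPU visibility can be restricted through the environment variable before TensorFlow is imported (the device numbering follows the /gpu:N convention mentioned above):

```python
import os

# Expose only the first physical GPU; inside TensorFlow it appears as /gpu:0.
# This must be set before `import tensorflow` to take effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Hiding all GPUs (forcing CPU-only execution) uses an empty value:
# os.environ["CUDA_VISIBLE_DEVICES"] = ""
```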
Multi‑GPU training requires manual aggregation of gradients and losses; TensorFlow provides examples, but the process is more involved than in frameworks like Caffe.
Tencent Cloud Developer