White Screen Detection with TensorFlow: Data Preparation, Model Building, and Deployment
The article details a TensorFlow pipeline for detecting white‑screen screenshots in a WebView, covering data preparation from labeled image folders, a CNN architecture with preprocessing, training and validation steps, model saving, inference usage, and strategies to mitigate over‑fitting.
Background: After entering a WebView, the client captures a screenshot, uploads it to OSS, and logs the event. In Flink, the logs are consumed, images are classified as white screen or not, and stored.
The initial heuristic, which classified a screenshot by the proportion of white-colored pixels, produced many false positives, so a machine-learning model was adopted instead.
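For context, that replaced heuristic can be sketched as follows. This is a minimal illustration, assuming screenshots decoded to RGB NumPy arrays; the `threshold` and `ratio` cutoffs are illustrative, not the production values:

```python
import numpy as np

def white_ratio(pixels, threshold=245):
    # pixels: H x W x 3 uint8 array of the screenshot.
    # A pixel counts as "white" when all three channels exceed the threshold.
    near_white = np.all(pixels >= threshold, axis=-1)
    return near_white.mean()

def is_white_screen(pixels, ratio=0.95):
    # Flag the screenshot when almost every pixel is near-white.
    return white_ratio(pixels) >= ratio
```

A rule like this cannot distinguish a genuinely blank WebView from a legitimately white page design, which is the kind of false positive that motivated the model.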
Data and environment: Python 3.9, TensorFlow 2.6.0. Over 2200 labeled images are organized into folders (white, white_loading, white_error, network_error, not_white).
Dataset loading:
import os, pathlib, tensorflow as tf
data_dir = pathlib.Path(os.path.dirname(__file__) + '/../train_data')
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(IMAGE_HEIGHT, IMAGE_WIDTH),
    batch_size=batch_size)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(IMAGE_HEIGHT, IMAGE_WIDTH),
    batch_size=batch_size)

Class names are saved for later use, so predictions can be mapped back to labels at inference time:
class_names = train_ds.class_names
save_data_to_file(list2LineData(class_names), 'white_screen_model/labels.txt')

Performance optimizations: cache, shuffle, and prefetch overlap data loading with training:
AUTOTUNE = tf.data.AUTOTUNE
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

An image-repair function re-saves corrupted or truncated files so they can be decoded:
from PIL import Image, ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True
def repairImg(imgPath):
    try:
        img = Image.open(imgPath)
        # Image.ANTIALIAS is deprecated (removed in Pillow 10); Image.LANCZOS is the same filter.
        img = img.resize((IMAGE_WIDTH, IMAGE_HEIGHT), Image.LANCZOS)
        img.save(imgPath)
        return img
    except Exception:
        print("repairImg error:", imgPath)

Pixel values are normalized to the [0, 1] range with a Rescaling layer:

layers.experimental.preprocessing.Rescaling(1./255, input_shape=(IMAGE_HEIGHT, IMAGE_WIDTH, 3))

Model architecture: a CNN with three Conv2D/MaxPooling blocks, a flatten layer, and dense layers:
model = Sequential([
    layers.experimental.preprocessing.Rescaling(1./255, input_shape=(IMAGE_HEIGHT, IMAGE_WIDTH, 3)),
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(num_classes)
])

The model is compiled with the Adam optimizer and SparseCategoricalCrossentropy loss; from_logits=True matches the final Dense layer, which has no softmax activation:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

Training runs for 15 epochs, after which the model is saved:
epochs = 15
history = model.fit(train_ds, validation_data=val_ds, epochs=epochs)
model.save('saved_model/white_screen_model')

The training curves show training loss decreasing while validation loss rises, a classic sign of over-fitting. Remedies include data augmentation, weight regularization, reducing model complexity, and dropout.
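Of those remedies, data augmentation and dropout are the simplest to bolt onto the model above. A minimal sketch using standard Keras layers; the image sizes and rates here are illustrative placeholders for the article's actual constants and settings:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Illustrative sizes standing in for the article's IMAGE_HEIGHT / IMAGE_WIDTH constants.
IMAGE_HEIGHT, IMAGE_WIDTH = 180, 180

# Random flips, rotations, and zooms synthesize new training views of each
# screenshot; these layers are active only when called with training=True.
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip("horizontal", input_shape=(IMAGE_HEIGHT, IMAGE_WIDTH, 3)),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])

# A Dropout layer randomly zeroes activations during training; placing one
# between Flatten and the Dense head is a common way to curb over-fitting.
dropout = layers.Dropout(0.2)
```

One plausible wiring is to put `data_augmentation` ahead of the Rescaling layer and `dropout` before the 128-unit Dense layer; the exact placement and rates would need to be validated against the validation curves.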
Inference example: load the saved model, preprocess an image, and obtain the predicted class and confidence.
import numpy as np
from tensorflow import keras

class_names = read_file_line(pathlib.Path('./saved_model/white_screen_model/labels.txt'))
model = tf.keras.models.load_model('./saved_model/white_screen_model')
img_path = tf.keras.utils.get_file(origin=img_url)
img = keras.preprocessing.image.load_img(img_path, target_size=(IMAGE_HEIGHT, IMAGE_WIDTH))
img_array = keras.preprocessing.image.img_to_array(img)
img_array = tf.expand_dims(img_array, 0)  # add a batch dimension
predictions = model.predict(img_array)
score = tf.nn.softmax(predictions[0])  # convert logits to probabilities
label = class_names[np.argmax(score)]
confidence = 100 * np.max(score)

The article concludes with a brief overview of the white-screen monitoring platform and references.
DeWu Technology