FireEye AI-Powered Automated Testing Framework: Architecture, Model Selection, and Retraining

FireEye is an AI-driven automated UI testing framework. It ingests simulated and real screenshots, preprocesses the images and OCR text, and applies three models: a CNN for page anomalies, an SSD detector for control anomalies, and an LSTM-based classifier for text anomalies. Retraining is triggered through Jenkins, and models are stored in the cloud and served via API. The project aims to simplify testing and enable future AutoML enhancements.

Xianyu Technology

Nov 2, 2018

FireEye is an AI‑driven toolset that automates UI testing, reduces the barrier to using AI models, and allows model updates without script changes.

Usage effect: FireEye detects and visualizes three anomaly types: page anomalies (a blank page or a large error image), control anomalies (loading indicators, error HUDs), and text anomalies (images omitted).

Engineering structure: training data come from three sources: simulated samples, real user screenshots collected via the Xianyu portal, and manually labeled retraining data. The data are converted to JPG/PNG/TXT and uploaded to OSS, after which a Jenkins job triggers the retraining scripts. The new model is stored in the cloud and served via API.

Model selection:

Page anomaly – simple visual pattern, classified with a CNN.

Control anomaly – multiple objects per page, detected with an SSD object detector.

Text anomaly – OCR‑extracted lines fed to an LSTM (RNN) to judge semantic correctness.
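
As a rough illustration of how the three checks could sit side by side, the sketch below runs one detector per anomaly type over a screenshot and collects the findings. It is not FireEye's actual code: `Screenshot`, `run_checks`, and the three stub detectors are hypothetical names, and the stubs only stand in for the CNN, SSD, and LSTM models.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Screenshot:
    """One captured page: raw image bytes plus OCR-extracted text lines."""
    image: bytes
    text_lines: List[str] = field(default_factory=list)


# A detector takes a screenshot and returns a list of human-readable findings.
Detector = Callable[[Screenshot], List[str]]


def run_checks(shot: Screenshot, detectors: Dict[str, Detector]) -> Dict[str, List[str]]:
    """Run every registered detector; keep only the types that reported something."""
    report = {}
    for anomaly_type, detect in detectors.items():
        findings = detect(shot)
        if findings:
            report[anomaly_type] = findings
    return report


# Hypothetical stubs standing in for the trained models.
def page_stub(shot: Screenshot) -> List[str]:
    # Placeholder for the CNN page classifier.
    return ["blank page"] if not shot.image else []


def control_stub(shot: Screenshot) -> List[str]:
    # Placeholder for the SSD control detector; always reports nothing here.
    return []


def text_stub(shot: Screenshot) -> List[str]:
    # Placeholder for the LSTM text classifier: flags the Unicode replacement char.
    return [line for line in shot.text_lines if "\ufffd" in line]


DETECTORS: Dict[str, Detector] = {
    "page": page_stub,
    "control": control_stub,
    "text": text_stub,
}
```

Keeping the detectors behind a common callable signature means a retrained model can replace a stub without changing the code that calls `run_checks`, which matches the stated goal of updating models without script changes.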

Data preprocessing (Python):

import collections
import nltk  # word_tokenize requires the NLTK 'punkt' tokenizer data

maxlen = 0  # length of the longest sentence, in tokens
word_freqs = collections.Counter()  # token -> frequency
num_recs = 0  # number of samples
# Each line of train.txt holds "<label>\t<sentence>"; the file is only read.
with open('./train.txt', 'r') as f:
    for line in f:
        label, sentence = line.strip().split("\t")
        words = nltk.word_tokenize(sentence.lower())
        maxlen = max(maxlen, len(words))
        word_freqs.update(words)
        num_recs += 1

Two lookup tables, word2index and index2word, are built with index 0 reserved for PAD and index 1 for UNK, and each sentence is padded to MAX_SENTENCE_LENGTH. The data are then split with train_test_split and fed to a Keras model:

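from keras.models import Sequential
from keras.layers import Activation, Dense, Embedding, LSTM

# Embedding -> LSTM -> one sigmoid unit, i.e. a binary classifier over sentences
model = Sequential()
model.add(Embedding(vocab_size, EMBEDDING_SIZE, input_length=MAX_SENTENCE_LENGTH))
model.add(LSTM(HIDDEN_LAYER_SIZE, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1))
model.add(Activation("sigmoid"))
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
# Xtest/ytest serve as the validation set during training
model.fit(Xtrain, ytrain, batch_size=BATCH_SIZE, epochs=NUM_EPOCHS,
          validation_data=(Xtest, ytest))
model.save("garbled.h5")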
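
The lookup tables and padding step described before the model are not shown in the article. A minimal sketch, under stated assumptions, might look like the following. The helpers `build_vocab` and `encode` and the `max_features` cap are hypothetical. Left-padding and keeping the last tokens of an over-long sentence are assumptions that mirror the defaults of Keras `pad_sequences`. `max_len` is assumed to be positive.

```python
import collections

PAD, UNK = 0, 1  # reserved indices: padding and out-of-vocabulary words


def build_vocab(word_freqs, max_features):
    """Map the most frequent words to indices 2, 3, ...; 0 and 1 stay reserved."""
    word2index = {"PAD": PAD, "UNK": UNK}
    for i, (word, _) in enumerate(word_freqs.most_common(max_features)):
        word2index[word] = i + 2
    index2word = {index: word for word, index in word2index.items()}
    return word2index, index2word


def encode(words, word2index, max_len):
    """Convert tokens to indices, then left-pad or truncate to max_len (> 0)."""
    ids = [word2index.get(word, UNK) for word in words]
    ids = ids[-max_len:]  # keep the last max_len tokens when the sentence is too long
    return [PAD] * (max_len - len(ids)) + ids


# Usage with a toy frequency table (tokens are lowercase, as in the preprocessing loop)
freqs = collections.Counter("the page is blank the page".split())
word2index, index2word = build_vocab(freqs, max_features=2)
print(encode(["the", "page", "is"], word2index, 5))  # [0, 0, 2, 3, 1]
```

Because the preprocessing loop lowercases every token, the uppercase keys "PAD" and "UNK" cannot collide with real vocabulary entries.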

OCR and image processing: Google OCR extracts the page text. For object labeling, a Python script uses OpenCV grabCut to obtain a foreground mask, then finds the contours, selects the one with the largest area, and fills the smaller regions with fillConvexPoly. The result is a clean binary mask that is exported to CSV.
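
The mask clean-up idea (keep the largest foreground region, clear the rest) can be illustrated without OpenCV. The sketch below is a dependency-free stand-in, not the article's script: it uses a breadth-first flood fill over a list-of-lists mask instead of OpenCV contour functions, and the helper names `largest_region` and `bounding_box` are hypothetical. The bounding box is only one plausible value to export; the article does not specify the CSV columns.

```python
from collections import deque


def largest_region(mask):
    """Return a copy of a 0/1 mask that keeps only the largest 4-connected region."""
    height = len(mask)
    width = len(mask[0]) if mask else 0
    seen = [[False] * width for _ in range(height)]
    best = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood-fill one connected region starting from (r, c).
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    clean = [[0] * width for _ in range(height)]
    for y, x in best:
        clean[y][x] = 1
    return clean


def bounding_box(mask):
    """Return (xmin, ymin, xmax, ymax) of the foreground pixels, or None if empty."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        return None
    xs, ys = zip(*points)
    return min(xs), min(ys), max(xs), max(ys)
```

For real screenshots the OpenCV route described in the text will be far faster; this version only makes the selection logic explicit.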

FireEye aims to provide a lightweight, easily deployable automated testing suite, with future work focusing on AutoML hyper‑parameter tuning, layout anomaly detection, and holistic analysis of page elements to understand user flows.
