Artificial Intelligence · 4 min read

Machine Learning-Based Test Case Step Recommendation: Data Preprocessing, N‑gram, CBOW, and RNN/LSTM Model Construction

This article explains how to use machine‑learning techniques—including data preprocessing, N‑gram, CBOW, and various RNN/LSTM models—to automatically recommend the next function in a test‑case step sequence, improving writing speed and efficiency for developers.

360 Quality & Efficiency

Background and significance: Test case steps are treated as a time series, and the goal is to apply machine‑learning algorithms to recommend the next function while a test case is being written, typically presenting four candidate functions to speed up and improve the writing process.

Algorithm composition: The solution consists of three main stages: data preprocessing, model construction, and model testing.

Data preprocessing: Each function ID represents a test‑case step, analogous to a word in a language model. IDs are remapped to a contiguous range to reduce the dimensionality of the embedding layer; one‑hot encoding specifies num_classes explicitly so that training and test labels share the same dimensions; and the dataset is randomly shuffled to mitigate over‑fitting.
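The three preprocessing steps can be sketched as follows; the variable names and example IDs are illustrative, not taken from the article:

```python
import numpy as np

raw_ids = [103, 7, 450, 103, 7, 9, 450, 103]  # example raw function IDs

# 1. Remap sparse IDs to a contiguous range to shrink the embedding table.
id_map = {fid: i for i, fid in enumerate(sorted(set(raw_ids)))}
dense_ids = np.array([id_map[f] for f in raw_ids])

# 2. One-hot encode with an explicit class count so that training and
#    test labels share the same dimensionality.
num_classes = len(id_map)
one_hot = np.eye(num_classes, dtype=np.float32)[dense_ids]

# 3. Shuffle samples to reduce ordering-related over-fitting.
rng = np.random.default_rng(seed=42)
perm = rng.permutation(len(dense_ids))
shuffled = one_hot[perm]
```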

N‑gram model construction: A vocabulary is built assuming each step depends on the two preceding steps. Frequency counts are recorded as A (individual function occurrence), B (pair occurrence), and C (triple occurrence). Smoothing techniques are applied to handle unseen n‑grams.
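A minimal sketch of the trigram counting described above, with add‑one (Laplace) smoothing standing in for whatever smoothing the original implementation used; the counter names A, B, and C follow the article's naming, while the step names are illustrative:

```python
from collections import Counter

steps = ["open", "login", "search", "open", "login", "add", "search"]

A = Counter(steps)                             # individual function counts
B = Counter(zip(steps, steps[1:]))             # pair counts
C = Counter(zip(steps, steps[1:], steps[2:]))  # triple counts
V = len(A)                                     # vocabulary size

def next_step_prob(w1, w2, w3):
    """P(w3 | w1, w2) with add-one smoothing for unseen triples."""
    return (C[(w1, w2, w3)] + 1) / (B[(w1, w2)] + V)

# Rank candidate next steps given the two preceding ones.
candidates = sorted(A, key=lambda w: next_step_prob("open", "login", w),
                    reverse=True)
```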

CBOW model construction and testing: A Continuous Bag‑of‑Words model is trained on the same vocabulary to predict target functions from surrounding context, followed by evaluation on a held‑out test set.
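A rough numpy sketch of the CBOW idea under the article's setup: average the embeddings of the surrounding steps and predict the target via a softmax. The window size, dimensions, and sequence are illustrative assumptions, not the article's actual configuration:

```python
import numpy as np

sequence = [2, 0, 3, 2, 0, 1, 3]   # remapped function IDs
vocab, window, dim = 4, 1, 8

# Build (context, target) pairs with a symmetric window of 1.
pairs = [([sequence[i - 1], sequence[i + 1]], sequence[i])
         for i in range(window, len(sequence) - window)]

rng = np.random.default_rng(0)
E = rng.normal(scale=0.1, size=(vocab, dim))   # input embeddings
W = rng.normal(scale=0.1, size=(dim, vocab))   # output projection

def cbow_forward(context):
    """Average context embeddings, project to vocab, softmax to probabilities."""
    h = E[context].mean(axis=0)
    logits = h @ W
    p = np.exp(logits - logits.max())
    return p / p.sum()

probs = cbow_forward(pairs[0][0])
```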

RNN model construction with Keras: The network architecture is defined, selecting an appropriate loss function: categorical_crossentropy for multi‑class classification, mean squared error for regression, or binary_crossentropy for binary tasks. Several recurrent structures are built, including a basic RNN, a bidirectional RNN, a deep RNN, and a bidirectional LSTM, each illustrated with corresponding diagrams and tested for performance.
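The recurrent step underlying these architectures can be sketched in plain numpy; this single‑layer forward pass is only illustrative and is not the article's Keras code (which would stack SimpleRNN/Bidirectional/LSTM layers). All weight shapes here are assumptions:

```python
import numpy as np

vocab, hidden = 4, 6
rng = np.random.default_rng(1)
Wx = rng.normal(scale=0.1, size=(vocab, hidden))   # input-to-hidden weights
Wh = rng.normal(scale=0.1, size=(hidden, hidden))  # hidden-to-hidden weights
Wy = rng.normal(scale=0.1, size=(hidden, vocab))   # hidden-to-output weights

def rnn_forward(one_hot_steps):
    """h_t = tanh(x_t Wx + h_{t-1} Wh); the final softmax gives the class
    distribution paired with categorical_crossentropy at training time."""
    h = np.zeros(hidden)
    for x in one_hot_steps:
        h = np.tanh(x @ Wx + h @ Wh)
    logits = h @ Wy
    p = np.exp(logits - logits.max())
    return p / p.sum()

x_seq = np.eye(vocab)[[2, 0, 3]]   # three one-hot encoded steps
probs = rnn_forward(x_seq)
loss = -np.log(probs[1])           # categorical cross-entropy vs. target class 1
```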

Dataset types and validation: The workflow distinguishes three supervised learning datasets: training set (used to fit the model), validation set (a subset of training data to monitor over‑fitting and tune hyper‑parameters), and test set (used to assess final model accuracy). Cross‑validation is also introduced as a technique to obtain more reliable performance estimates.
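The three‑way split and k‑fold cross‑validation can be sketched as index bookkeeping; the 80/10/10 ratio and the fold count below are illustrative, not stated in the article:

```python
import numpy as np

n = 100
rng = np.random.default_rng(7)
idx = rng.permutation(n)

# Training / validation / test split (80/10/10 as an example).
train, val, test = idx[:80], idx[80:90], idx[90:]

def kfold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    folds = np.array_split(np.arange(n), k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, val_idx

splits = list(kfold_indices(n, 5))
```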

Conclusion: Proper preprocessing, careful construction of n‑gram/CBOW/RNN models, and rigorous evaluation using appropriate datasets collectively enable effective recommendation of subsequent test‑case functions, thereby improving development efficiency.

machine learning · data preprocessing · LSTM · test case recommendation · RNN · N-gram · CBOW
Written by

360 Quality & Efficiency

360 Quality & Efficiency focuses on seamlessly integrating quality and efficiency in R&D, sharing 360’s internal best practices with industry peers to foster collaboration among Chinese enterprises and drive greater efficiency value.
