Detect Front‑End UI Components with Pipcook: A Complete Object‑Detection Guide
This tutorial walks you through using Pipcook to train an object‑detection model that automatically identifies and locates front‑end UI components in screenshots, covering data preparation in Pascal VOC format, pipeline configuration, model training, and inference with sample code.
Background
In front‑end development you may have many UI screenshots and need an automatic way to recognize which components (buttons, switches, inputs, etc.) appear and where they are located. This task is known as object detection in deep learning.
Scenario Example
An example image contains multiple components such as buttons, switches, and input fields. After training, the model predicts the following JSON:
```json
{
  "boxes": [
    [83, 31, 146, 71],
    [210, 48, 256, 78],
    [403, 30, 653, 72],
    [717, 41, 966, 83]
  ],
  "classes": [0, 1, 2, 2],
  "scores": [0.95, 0.93, 0.96, 0.99]
}
```

The corresponding label map is:

```json
{
  "button": 0,
  "switch": 1,
  "input": 2
}
```

Explanation:

- `boxes`: the coordinates of each detected component, as (xmin, ymin, xmax, ymax).
- `classes`: numeric class IDs that map to component types via the label map.
- `scores`: confidence scores; only results above a chosen threshold are kept.
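Putting the pieces together, decoding a prediction amounts to inverting the label map and filtering by score. A minimal sketch, using the prediction and label map above (the 0.9 threshold is an arbitrary choice, not a Pipcook default):

```javascript
// Sample prediction and label map, mirroring the examples above.
const prediction = {
  boxes: [
    [83, 31, 146, 71],
    [210, 48, 256, 78],
    [403, 30, 653, 72],
    [717, 41, 966, 83]
  ],
  classes: [0, 1, 2, 2],
  scores: [0.95, 0.93, 0.96, 0.99]
};

const labelMap = { button: 0, switch: 1, input: 2 };

// Invert the label map so numeric class IDs resolve to component names.
const idToName = Object.fromEntries(
  Object.entries(labelMap).map(([name, id]) => [id, name])
);

// Keep only detections at or above the confidence threshold,
// pairing each box with its human-readable label.
function decodeDetections(pred, threshold = 0.9) {
  return pred.scores
    .map((score, i) => ({
      label: idToName[pred.classes[i]],
      box: pred.boxes[i], // [xmin, ymin, xmax, ymax]
      score
    }))
    .filter((d) => d.score >= threshold);
}

console.log(decodeDetections(prediction));
// → [ { label: 'button', box: [83, 31, 146, 71], score: 0.95 }, ... ]
```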
Data Preparation
Object‑detection models require datasets in a standard format. This tutorial uses the Pascal VOC format, which stores each image together with an XML annotation file. A typical directory layout is:
```
train/
  1.jpg
  1.xml
  2.jpg
  2.xml
  ...
validation/
  1.jpg
  1.xml
  ...
test/
  1.jpg
  1.xml
  ...
```

Each XML file contains fields such as `<folder>`, `<filename>`, `<size>`, and one or more `<object>` entries with `<name>` (the component type) and `<bndbox>` (its position).
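For reference, an annotation file for an image containing a single button might look like the following. This is a hand-written sketch of the Pascal VOC schema; the image dimensions and coordinates are illustrative:

```xml
<annotation>
  <folder>train</folder>
  <filename>1.jpg</filename>
  <size>
    <width>1024</width>
    <height>768</height>
    <depth>3</depth>
  </size>
  <object>
    <name>button</name>
    <bndbox>
      <xmin>83</xmin>
      <ymin>31</ymin>
      <xmax>146</xmax>
      <ymax>71</ymax>
    </bndbox>
  </object>
</annotation>
```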
Start Training
With the dataset ready, create a Pipcook pipeline JSON that strings together the required plugins:
```json
{
  "plugins": {
    "dataCollect": {
      "package": "@pipcook/plugins-object-detection-pascalvoc-data-collect",
      "params": {
        "url": "http://ai-sample.oss-cn-hangzhou.aliyuncs.com/pipcook/datasets/component-recognition-detection/component-recognition-detection.zip"
      }
    },
    "dataAccess": {
      "package": "@pipcook/plugins-coco-data-access"
    },
    "modelDefine": {
      "package": "@pipcook/plugins-detectron-fasterrcnn-model-define"
    },
    "modelTrain": {
      "package": "@pipcook/plugins-detectron-model-train",
      "params": { "steps": 100000 }
    },
    "modelEvaluate": {
      "package": "@pipcook/plugins-detectron-model-evaluate"
    }
  }
}
```

Run the pipeline on a machine with an NVIDIA GPU and CUDA 10.2:

```bash
pipcook run object-detection.json --verbose --tuna
```

Training logs show the loss decreasing, e.g.:
```
[06/28 12:28:32] iter: 100000 total_loss: 0.032 loss_cls: 0.122 ...
```

After training, Pipcook generates an installable npm package in the output directory. Install its dependencies:

```bash
cd output
BOA_TUNA=1 npm install
```

Then run inference from a Node.js script:

```js
const predict = require('./output');

(async () => {
  const result = await predict('./test.jpg');
  console.log(result);
})();
```

The prediction result contains boxes, classes, and scores as described earlier.
Creating Your Own Dataset
To build a custom dataset, follow three steps:
1. Collect images: gather raw UI screenshots without annotations.
2. Annotate: use a tool such as labelImg to draw bounding boxes around components and assign each one a label.
3. Train: organize the annotated files into the Pascal VOC folder structure shown earlier and run the Pipcook pipeline.
Summary
You have learned how to detect multiple front‑end components in an image using Pipcook, from data preparation to model training and inference. The next tutorial will explore image style transfer with Pipcook, such as converting photos to oil‑painting style.