How AI Can Auto‑Detect UI Components for Seamless Front‑End Code Generation
This article explains how to use deep‑learning object detection to automatically recognize UI components in design drafts, generate a smart JSON description, and convert it into component‑based front‑end code, covering problem analysis, dataset preparation, algorithm selection, model training, evaluation, and deployment.
Introduction
Smart code generation platforms such as imgcook can convert Sketch, PSD, or static images into maintainable front‑end code, but the output consists only of generic tags like div, img, and span. To support component‑based development, the platform must also identify UI components (e.g., SearchBar, Button, Tab) directly from design drafts.
Problem Definition
Design drafts contain many reusable components, while some native elements (StatusBar, Navbar, Keyboard) do not need code generation. The goal is to detect component elements, determine their type, position, and size, and fill the smart field in the JSON description so that downstream DSL conversion can produce component‑based code.
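To make the idea concrete, the sketch below shows how a detection result might be merged into a node's smart field. The field names here are illustrative assumptions, not the actual imgcook D2C schema.

```python
# Illustrative only: field names are assumptions, not the real imgcook schema.
import json

# One detection from the component-recognition model.
detection = {"category": "searchbar", "score": 0.98, "bbox": [0, 88, 750, 84]}

node = {
    "componentType": "div",
    "rect": {"x": 0, "y": 88, "width": 750, "height": 84},
    # The detector's result goes into a "smart" field so the DSL layer
    # can emit a <SearchBar/> component instead of a generic <div>.
    "smart": {
        "layerProtocol": {
            "component": {
                "type": detection["category"],
                "confidence": detection["score"],
            }
        }
    },
}

print(json.dumps(node["smart"], indent=2))
```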
Solution Overview
We treat UI component detection as a typical object‑detection problem and apply deep‑learning methods. The workflow includes:
Design‑draft acquisition (Sketch, PSD, images)
Sample preparation and annotation
Model selection and training
Model evaluation
Model service deployment
Application of the model to generate the smart JSON and final code
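The stages above can be sketched as a small pipeline. All function and field names below are hypothetical stubs standing in for the real detection model and DSL layer.

```python
# Stub pipeline for the workflow above; names and schema are illustrative.
def detect_components(image_path):
    """Run the trained detector on a design draft; stubbed with a fixed result."""
    return [{"category": "button", "bbox": [10, 20, 200, 64], "score": 0.97}]

def to_smart_json(detections):
    """Attach each detection to a node's 'smart' field (schema is illustrative)."""
    return [{"smart": {"component": d["category"], "rect": d["bbox"]}}
            for d in detections]

def generate_code(nodes):
    """Downstream DSL step: emit one component tag per recognized node."""
    return "\n".join(f"<{n['smart']['component'].capitalize()} />" for n in nodes)

nodes = to_smart_json(detect_components("draft.png"))
print(generate_code(nodes))  # -> <Button />
```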
Dataset Preparation
Two sources of samples are used:
Alibaba‑internal UI screenshots (≈25 000 images, 10 categories, 49 120 components)
Automatically generated pages with random components (10 categories)
Samples are filtered by size, de‑duplicated, and annotated using VOC or COCO formats. Semi‑automatic labeling is applied for components like StatusBar and Navbar.
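For the COCO route, each screenshot's labels end up in a single JSON file. A minimal sketch of that structure (file names and category list here are illustrative):

```python
# Minimal COCO-format annotation file for one screenshot.
# File names and categories are illustrative examples.
import json

categories = [{"id": 1, "name": "statusbar"}, {"id": 2, "name": "searchbar"}]

coco = {
    "images": [
        {"id": 1, "file_name": "screen_0001.png", "width": 750, "height": 1334}
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 2,
            # COCO boxes are [x, y, width, height] in pixels.
            "bbox": [0, 88, 750, 84],
            "area": 750 * 84,
            "iscrowd": 0,
        }
    ],
    "categories": categories,
}

with open("annotations.json", "w") as f:
    json.dump(coco, f)
```

Detectron2 can consume a file like this directly when the dataset is registered for training.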
Algorithm Selection
Object‑detection methods are divided into traditional machine‑learning approaches (e.g., Haar, SIFT, HOG) and deep‑learning approaches (one‑stage: SSD, YOLO, RetinaNet; two‑stage: R‑CNN, Fast R‑CNN, Faster R‑CNN). Considering accuracy requirements, Faster R‑CNN from Detectron2 is chosen.
Model Training with Detectron2
<code>import os

from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
# Faster R-CNN with a ResNet-50 C4 backbone, 3x training schedule
cfg.merge_from_file("./configs/faster_rcnn_R_50_C4_3x.yaml")
cfg.DATASETS.TRAIN = ("train_dataset",)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 10  # one class per UI component category

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=True)  # resume from the last checkpoint if present
trainer.train()
</code>
Evaluation
Model performance is measured by mean Average Precision (mAP) and inference speed (FPS). On the combined Alibaba UI and synthetic dataset, the model reaches an mAP of 0.772 at IoU=0.5:0.95 and 0.951 at IoU=0.5.
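A prediction counts as correct at a given threshold when its intersection‑over‑union (IoU) with a ground‑truth box meets that threshold; AP@0.5:0.95 averages AP over thresholds 0.50, 0.55, …, 0.95. A minimal IoU computation, as a pure‑Python sketch:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# Two 100x100 boxes with 50% horizontal overlap share 1/3 of their union.
print(iou([0, 0, 100, 100], [50, 0, 150, 100]))  # ~0.333
```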
<code>Average Precision (AP) @[ IoU=0.50:0.95 ] = 0.772
Average Precision (AP) @[ IoU=0.50 ] = 0.951
Average Precision (AP) @[ IoU=0.75 ] = 0.915
</code>
Model Service Deployment
A prediction service is built around Detectron2's DefaultPredictor. The service receives an image, runs inference, and returns component type and bounding‑box information in JSON format, which can be merged into the D2C schema.
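Shaping the raw detector output into that JSON payload might look like the sketch below. The outputs are mocked as plain lists here; with Detectron2 they would come from `predictor(im)["instances"]` (boxes, class indices, scores). The class list and response fields are illustrative assumptions.

```python
# Shape detector output into the JSON payload the service returns.
# Inputs are mocked plain lists; class names and fields are illustrative.
CLASS_NAMES = ["statusbar", "navbar", "searchbar", "button", "tab"]

def to_response(boxes, classes, scores, score_thresh=0.5):
    results = []
    for box, cls, score in zip(boxes, classes, scores):
        if score < score_thresh:
            continue  # drop low-confidence detections
        x1, y1, x2, y2 = box
        results.append({
            "category": CLASS_NAMES[cls],
            "score": round(score, 3),
            # Convert corner coordinates to the x/y/width/height rect
            # used by the D2C schema.
            "rect": {"x": x1, "y": y1, "width": x2 - x1, "height": y2 - y1},
        })
    return {"components": results}

resp = to_response([[0, 88, 750, 172]], [2], [0.97])
print(resp)
```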
<code>import cv2
from detectron2.engine.defaults import DefaultPredictor

cfg.MODEL.WEIGHTS = "model_final.pth"  # weights produced by the training run
predictor = DefaultPredictor(cfg)

def predict(image_path):
    im = cv2.imread(image_path)  # BGR image, as Detectron2 expects
    outputs = predictor(im)
    return outputs
</code>
Future Work
To improve generalization beyond Alibaba‑specific designs, we plan to incorporate larger public datasets such as Rico and ReDraw, enhance synthetic sample generation, and develop metrics for evaluating synthetic sample quality.
Taobao Frontend Technology