From UI Sketch to Code: Frontend Intelligence Generates 79% of Double‑11 Modules

This article explains how Alibaba's Front‑End Intelligent project automatically converts UI design images into production‑ready code, covering layout analysis, background and foreground processing, a fusion of traditional image algorithms with deep‑learning detection, GAN‑based complex‑background extraction, experimental results and real‑world deployment.

Taobao Frontend Technology

Overview

The Front‑End Intelligent project, one of the four technical directions of Alibaba's Front‑End Committee, proved its value during the 2019 Double‑11 event, automatically generating 79.34% of the online code for new modules on Tmall and Taobao. This series shares the techniques and thoughts behind that achievement.

Why Use Images as Input

Images are the final deliverable: intuitive, deterministic, and free of upstream constraints.

Layout differences (e.g., listview, gridview) do not exist in visual drafts.

Image‑based pipelines support broader scenarios such as automated testing and competitor‑image reuse.

Layer stacking issues in design drafts are easier to handle when starting from images.

Layer Processing

In the D2C (design‑to‑code) stack, the layer‑processing layer identifies element categories and extracts styles, providing data for the subsequent layout‑algorithm layer.

Layer processing diagram

Layout Analysis

Layout analysis splits UI images into foreground and background. Background analysis uses machine‑vision algorithms to detect color, gradient direction, and connected regions, while foreground analysis employs deep‑learning models to merge and recognize GUI fragments.

Background Analysis

Step 1: Detect background blocks with edge detectors (Sobel, Laplacian, Canny) to separate solid‑color and gradient regions. The Laplacian template is illustrated below.

Laplacian template
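
As a rough illustration of step 1, the sketch below applies a 3x3 Laplacian template to a small grayscale grid and reports the columns where the response is strong. It assumes the common 4‑neighbour template, which may differ from the one in the figure; `laplacian_response`, `edge_columns`, and the threshold of 20 are hypothetical names and values chosen for the example.

```python
# Common 4-neighbour Laplacian template (an assumption; the article's own
# template is only shown as an image).
LAPLACIAN_4 = [
    [0, 1, 0],
    [1, -4, 1],
    [0, 1, 0],
]


def laplacian_response(gray):
    """Apply the 3x3 template to every interior pixel of a 2D grayscale grid."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    acc += LAPLACIAN_4[dy + 1][dx + 1] * gray[y + dy][x + dx]
            out[y][x] = acc
    return out


def edge_columns(gray, threshold=20):
    """Return the sorted column indices whose absolute response exceeds the threshold."""
    cols = set()
    for row in laplacian_response(gray):
        for x, value in enumerate(row):
            if abs(value) > threshold:
                cols.add(x)
    return sorted(cols)


# Two solid blocks side by side: the response is non-zero only at the boundary.
step = [[10] * 4 + [200] * 4 for _ in range(5)]
print(edge_columns(step))
```

A solid block and a perfectly linear ramp both give a zero Laplacian response, because the template measures the second derivative. Telling a gradient region apart from a solid one therefore needs a first‑derivative operator such as Sobel in addition.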

Step 2: If a gradient background is detected, apply a flood‑fill algorithm to refine the region.

<code>import os

import cv2
import numpy as np

def fill_color_diffuse_water_from_img(task_out_dir, image, x, y, thres_up=(10,10,10), thres_down=(10,10,10), fill_color=(255,255,255)):
    # Obtain image height and width
    h, w = image.shape[:2]
    # OpenCV requires a single-channel uint8 mask 2 pixels larger than the image
    mask = np.zeros((h + 2, w + 2), np.uint8)
    # Flood fill from seed (x, y). In fixed-range mode each pixel is compared
    # with the seed pixel, not with its neighbours. `image` is modified in place.
    cv2.floodFill(image, mask, (x, y), fill_color, thres_down, thres_up, cv2.FLOODFILL_FIXED_RANGE)
    # Save the intermediate result; the "ui" subdirectory must already exist
    cv2.imwrite(os.path.join(task_out_dir, "ui", "tmp2.png"), image)
    return image, mask</code>

The figure below shows the original image and the processed output.

Background analysis result
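
To make the fixed‑range behaviour of the flood fill concrete, here is a dependency‑free sketch on a grayscale grid. It is not OpenCV's implementation: `flood_fill_fixed_range` is a hypothetical helper that uses 4‑connectivity and accepts a pixel only if its value lies within the tolerance band around the seed value.

```python
from collections import deque


def flood_fill_fixed_range(gray, seed, lo_diff=10, up_diff=10):
    """Return the set of (x, y) pixels reachable from the seed whose value
    lies in [seed_value - lo_diff, seed_value + up_diff]."""
    h, w = len(gray), len(gray[0])
    sx, sy = seed
    seed_value = gray[sy][sx]
    filled = {(sx, sy)}
    queue = deque([(sx, sy)])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= nx < w and 0 <= ny < h) or (nx, ny) in filled:
                continue
            # Fixed range: compare with the seed value, not with the neighbour.
            if seed_value - lo_diff <= gray[ny][nx] <= seed_value + up_diff:
                filled.add((nx, ny))
                queue.append((nx, ny))
    return filled


# A slow ramp followed by a hard edge; the fill stops once the ramp
# drifts more than 10 levels away from the seed value of 100.
row = [[100, 104, 108, 112, 116, 200]]
print(sorted(flood_fill_fixed_range(row, (0, 0))))
```

With a fixed range, the fill covers only the part of a gradient that stays close to the seed colour, so the tolerances control how much of the gradient is absorbed into one background block.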

Foreground Analysis

Foreground processing focuses on component integrity: connected‑component analysis keeps a component from being split into fragments, then machine‑learning classification and merging repeat until no residual fragments remain. The figure below shows a complete item recognized in a waterfall‑flow layout.

Foreground detection example
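
The connected‑component step can be sketched as follows. This is a minimal illustration on a binary mask, assuming 4‑connectivity; `component_boxes` is a hypothetical helper, and the classification and merging stages that follow in the article are not shown.

```python
from collections import deque


def component_boxes(mask):
    """Label 4-connected foreground regions of a binary mask and return one
    inclusive bounding box (x1, y1, x2, y2) per region, in scan order."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            # Breadth-first search over one region, tracking its extent.
            seen[sy][sx] = True
            queue = deque([(sx, sy)])
            x1 = x2 = sx
            y1 = y2 = sy
            while queue:
                x, y = queue.popleft()
                x1, x2 = min(x1, x), max(x2, x)
                y1, y2 = min(y1, y), max(y2, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x1, y1, x2, y2))
    return boxes


mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
print(component_boxes(mask))
```

Each box covers every pixel of its region, so a component that is connected in the mask is reported once instead of as several fragments.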

Traditional edge‑gradient methods (CLAHE, Canny, morphological dilation, Douglas‑Peucker) are compared with deep‑learning detectors (Faster‑RCNN, YOLO, SSD). Fusing the two approaches yields high precision, high recall, and accurate localization as measured by IOU (intersection over union).

Fusion Process

Run traditional image processing and deep‑learning detection in parallel, obtaining trbox and dlbox.

Filter trbox: keep boxes whose IOU with dlbox exceeds a threshold (e.g., 0.8).

Filter dlbox: discard boxes whose IOU with the retained trbox exceeds the threshold.

Adjust remaining dlbox edges to the nearest straight line within a pixel limit, without crossing trbox boundaries.

Output the union of filtered trbox and adjusted dlbox as the final result.
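
The filtering and union steps above can be sketched as follows. This is a minimal illustration, not the project's code: boxes are assumed to be `(x1, y1, x2, y2)` tuples, `iou` and `fuse_boxes` are hypothetical helper names, and the edge‑snapping adjustment of the remaining dlbox is left out.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_boxes(trboxes, dlboxes, threshold=0.8):
    """Combine traditional (trbox) and deep-learning (dlbox) detections."""
    # Keep a trbox only if some dlbox confirms it.
    kept_tr = [t for t in trboxes if any(iou(t, d) > threshold for d in dlboxes)]
    # Drop a dlbox if a retained trbox already covers the same element.
    kept_dl = [d for d in dlboxes if not any(iou(d, t) > threshold for t in kept_tr)]
    # Edge snapping of kept_dl would happen here; it is omitted in this sketch.
    return kept_tr + kept_dl


trboxes = [(0, 0, 100, 100), (300, 300, 310, 310)]
dlboxes = [(2, 2, 100, 100), (200, 0, 300, 100)]
print(fuse_boxes(trboxes, dlboxes))
```

In this example the first trbox is confirmed by an overlapping dlbox and replaces it, the second trbox has no deep‑learning support and is dropped, and the unmatched dlbox is kept as is.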

Metrics

True Positive (TP), True Negative (TN), False Positive (FP), False Negative (FN) are defined, and the standard formulas for Precision = TP/(TP+FP), Recall = TP/(TP+FN), and IOU = intersection/union are used to evaluate the methods.
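
For reference, the precision and recall formulas translate directly into code. The counts in the example are hypothetical and are not the article's experimental results.

```python
def precision(tp, fp):
    """Fraction of detections that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0


def recall(tp, fn):
    """Fraction of ground-truth items that were found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0


# Hypothetical counts: 90 correct detections, 10 spurious ones, 6 missed items.
print(precision(tp=90, fp=10), recall(tp=90, fn=6))
```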

Experimental Results

On 50 randomly sampled Xianyu waterfall‑flow images (96 cards total), traditional methods detected 65 cards, deep learning detected 97, and the fused approach detected 98. Detection counts can exceed the 96 ground‑truth cards because they include false positives. The fused approach achieved the best precision, recall, and IOU of the three. Detailed tables and charts illustrate the comparison.

Method comparison chart

Complex Background Content Extraction

Extracting specific content from complex backgrounds is challenging for both traditional image processing (low recall) and semantic segmentation (no pixel‑level restoration). The proposed pipeline uses a detection network for content recall, gradient‑based region judgment, and an SR‑GAN to restore elements in complex regions.

Detection network flow

Why GAN?

The SR‑GAN incorporates a feature‑map loss to preserve high‑frequency details, an adversarial loss to reduce false detections, and can reconstruct pixel values behind semi‑transparent overlays—something pure segmentation cannot achieve.

Feature‑map loss illustration
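
The article does not give the loss formula, so the following is only a sketch of how a feature‑map loss and an adversarial loss are commonly combined in SRGAN‑style generator training. Feature maps are represented as flat lists of floats, and the function names and the default `adv_weight` are assumptions made for the example.

```python
import math


def feature_map_loss(feat_target, feat_generated):
    """Mean squared error between two flattened feature maps of equal length."""
    if len(feat_target) != len(feat_generated):
        raise ValueError("feature maps must have the same length")
    squared = [(t - g) ** 2 for t, g in zip(feat_target, feat_generated)]
    return sum(squared) / len(squared)


def adversarial_loss(disc_scores, eps=1e-12):
    """Mean of -log D(G(x)); small when the discriminator rates the
    generated samples as real (scores close to 1)."""
    return sum(-math.log(max(s, eps)) for s in disc_scores) / len(disc_scores)


def generator_loss(feat_target, feat_generated, disc_scores, adv_weight=1e-3):
    """Weighted sum of the feature-map term and the adversarial term."""
    return (feature_map_loss(feat_target, feat_generated)
            + adv_weight * adversarial_loss(disc_scores))


print(generator_loss([1.0, 2.0, 3.0], [1.0, 2.0, 5.0], [0.5]))
```

The feature‑map term compares the two images in a feature space, and the adversarial term pushes the generator toward outputs the discriminator accepts as real.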

Training Flow

SR‑GAN training pipeline

Business Deployments

The solution is already used in the imgcook image pipeline (≈73% accuracy for generic scenes, >92% for specific card layouts) and in Taobao’s automated testing for major promotions, achieving >97% precision and recall.

Future Work

Planned improvements include richer layout identification (listview, gridview, waterfall), higher accuracy for small objects via Feature Pyramid Networks and Cascade R‑CNN, broader page coverage beyond Xianyu and Taobao, and an image‑sample generator to lower onboarding effort.
