
Applying Self-Attention Based Machine Learning Model to Design-to-Code Layout Prediction

vivo's frontend team built a machine-learning model based on self-attention that predicts a web page's layout type (column, row, or absolute) from node dimensions and positions. The model addresses the parent-child and sibling relationships needed for design-to-code conversion and reaches 99.4% accuracy using more than 20,000 manually labeled, crawled, and generated samples. The article also outlines further enhancements.

vivo Internet Technology

This article discusses how vivo's frontend team applied machine learning with a self-attention mechanism to the design-to-code (D2C) conversion problem, specifically to web page layout prediction.

Background: Traditional D2C tools can export styles from design mockups but cannot determine web page layout. The team developed a D2C tool that uses ML to automatically predict layout patterns.

Problem Definition: Web layout prediction requires solving two problems: (1) parent-child relationships between nodes, and (2) positional relationships between sibling nodes (vertical, horizontal, or absolute positioning).
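To make the first sub-problem concrete, here is a minimal sketch of recovering parent-child relationships from bounding boxes alone. It is a plain geometric heuristic (the parent is the smallest box that fully contains a node), not the team's method; the `Node` class and the `contains` and `find_parent` helpers are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    """A design node described only by its position and size (hypothetical type)."""
    name: str
    x: float
    y: float
    width: float
    height: float


def contains(outer: Node, inner: Node) -> bool:
    """True if inner's box lies inside outer's box and outer is strictly larger."""
    return (
        outer.width * outer.height > inner.width * inner.height
        and outer.x <= inner.x
        and outer.y <= inner.y
        and inner.x + inner.width <= outer.x + outer.width
        and inner.y + inner.height <= outer.y + outer.height
    )


def find_parent(node: Node, nodes: List[Node]) -> Optional[Node]:
    """Return the smallest-area box that contains node, or None for a root node."""
    candidates = [n for n in nodes if contains(n, node)]
    return min(candidates, key=lambda n: n.width * n.height, default=None)


page = Node("page", 0, 0, 375, 800)
card = Node("card", 10, 10, 355, 200)
title = Node("title", 20, 20, 100, 30)
nodes = [page, card, title]

for n in nodes:
    parent = find_parent(n, nodes)
    print(n.name, "->", parent.name if parent else None)
```

A rule like this only covers cleanly nested boxes. The second sub-problem, deciding how siblings relate to each other, is the part the article assigns to the learned model.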

Why Self-Attention: Unlike RNNs and LSTMs, which process a sequence one step at a time, self-attention computes over all nodes in a sequence in parallel, which significantly improves training efficiency. Each node obtains its contextual information by attending to every other node through global attention weights.
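The idea of every node attending to every other node can be sketched with scaled dot-product self-attention in plain Python. This is a simplified illustration, not the article's implementation: it assumes identity query, key, and value projections (a real model would learn separate weight matrices), and the function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: List[Vector]) -> List[Vector]:
    """Scaled dot-product self-attention with identity Q/K/V projections (assumption)."""
    dim = len(x[0])
    out = []
    for query in x:
        # Score this node against every node in the sequence, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in x
        ]
        weights = softmax(scores)
        # The contextual vector is the attention-weighted average of all nodes.
        out.append(
            [sum(w * value[j] for w, value in zip(weights, x)) for j in range(dim)]
        )
    return out


x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = self_attention(x)
```

The Python loop runs one row at a time, but no row depends on the result of another. That independence is what allows a framework to compute all rows as one batched matrix product, in contrast to the step-by-step dependency in an RNN.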

Model Design: The input features for each node are its width, height, and x and y coordinates. The output is a layout type: 'col' (vertical), 'row' (horizontal), or 'absolute'. The model uses self-attention to generate contextual embeddings, then feed-forward neural networks for the final layout classification.
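The pipeline described above (node features, then self-attention, then a feed-forward classifier) might be sketched as the forward pass below. Several details are assumptions rather than facts from the article: features are ordered `[width, height, x, y]` and normalized by the parent's size, one label is predicted per sibling group via mean pooling, attention uses identity projections, and the weights are random and untrained. The predicted label is therefore arbitrary; only the data flow is illustrated.

```python
import math
import random
from typing import List, Tuple

LABELS = ["col", "row", "absolute"]
Vector = List[float]
Matrix = List[Vector]


def softmax(scores: Vector) -> Vector:
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(weights: Matrix, vec: Vector) -> Vector:
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def attend(nodes: Matrix) -> Matrix:
    """Self-attention with identity Q/K/V projections (simplifying assumption)."""
    dim = len(nodes[0])
    out = []
    for query in nodes:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in nodes
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * node[j] for w, node in zip(weights, nodes)) for j in range(dim)]
        )
    return out


def predict_layout(
    nodes: Matrix, w_hidden: Matrix, w_out: Matrix
) -> Tuple[str, Vector]:
    """Contextualize the nodes, mean-pool them, then classify with a small MLP."""
    context = attend(nodes)
    pooled = [sum(column) / len(context) for column in zip(*context)]
    hidden = [max(0.0, h) for h in matvec(w_hidden, pooled)]  # ReLU
    probs = softmax(matvec(w_out, hidden))
    return LABELS[probs.index(max(probs))], probs


# Untrained, randomly initialized weights: 4 features -> 8 hidden units -> 3 classes.
rng = random.Random(0)
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(8)]
w_out = [[rng.uniform(-0.5, 0.5) for _ in range(8)] for _ in range(3)]

# Three full-width siblings stacked vertically: [width, height, x, y].
siblings = [
    [1.0, 0.2, 0.0, 0.0],
    [1.0, 0.3, 0.0, 0.2],
    [1.0, 0.5, 0.0, 0.5],
]
label, probs = predict_layout(siblings, w_hidden, w_out)
```

In practice the weights would be fitted to labeled samples with a deep-learning framework; the article does not say which one was used.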

Data Preparation: Three data sources were used: (1) manually labeled design mockups (the highest-quality source), (2) real web pages that were crawled and labeled by analyzing their CSS, and (3) data generated automatically by a web generator. More than 20,000 samples were collected in total.
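The article does not describe how its web generator works, so the following is only a toy illustration of the third data source: producing labeled sibling groups programmatically. The `generate_sample` function, the size ranges, and the rule that "absolute" samples overlap are all assumptions made for this sketch.

```python
import random
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def generate_sample(layout: str, count: int, rng: random.Random) -> Dict:
    """Generate `count` sibling boxes arranged according to `layout`."""
    if layout not in ("col", "row", "absolute"):
        raise ValueError(f"unknown layout: {layout}")
    boxes: List[Box] = []
    cursor = 0.0
    for _ in range(count):
        width = rng.uniform(20, 120)
        height = rng.uniform(20, 120)
        gap = rng.uniform(0, 16)
        if layout == "col":
            # Stack boxes top to bottom with a non-negative gap.
            boxes.append((0.0, cursor, width, height))
            cursor += height + gap
        elif layout == "row":
            # Place boxes left to right with a non-negative gap.
            boxes.append((cursor, 0.0, width, height))
            cursor += width + gap
        else:
            # Origins within 10 units and sizes of at least 20 force overlap.
            boxes.append((rng.uniform(0, 10), rng.uniform(0, 10), width, height))
    return {"boxes": boxes, "label": layout}


rng = random.Random(42)
dataset = [
    generate_sample(layout, rng.randint(2, 5), rng)
    for layout in ("col", "row", "absolute")
    for _ in range(100)
]
```

Generated data of this kind is cheap to produce but only as varied as the rules behind it, which is consistent with the article ranking manually labeled mockups as the highest-quality source.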

Results: The model achieved 99.4% accuracy in layout prediction.

Optimization Directions: (1) Handling element wrapping in lists, (2) Improving grouping for non-intersecting nodes like icons and text in grids, (3) General layout recognition for functional components, (4) Using reinforcement learning for better data generation.
