
How to Build Front‑End AI Experiments with Pipcook: From Setup to Real‑World Image Classification

This comprehensive guide walks front‑end developers through preparing hardware and OS, installing Python and Node environments, launching Pipcook's visual board, running handwritten digit and image classification experiments, creating and augmenting training samples, configuring pipelines, training models, and understanding deployment, all using the Pipcook framework.

Taobao Frontend Technology

Environment Preparation

For beginners, a laptop is recommended for its portability; modern thin laptops such as the Xiaomi Pro with an NVIDIA MX150 GPU can handle typical machine-learning experiments. MacBook Pro users may consider AMD GPUs for PlaidML support, though PlaidML's RNN acceleration is limited. For more demanding tasks, a desktop with at least a 6-core AMD CPU, 32 GB of RAM, a capable GPU with large VRAM, and a fast 512 GB SSD is advisable.

Desktop and Laptop

Desktops provide better cooling for long‑running complex models that may train for days.

GPU Selection

AMD GPUs work well with ROCm; otherwise, Nvidia GPUs are a safe choice. Larger VRAM and bandwidth improve training of large models.

Storage

Use a high‑speed SSD for the system and a secondary HDD for data and model parameters. Choose a power supply that can handle peak consumption and potential multi‑GPU setups.

Operating System

Windows works fine with Anaconda; however, Ubuntu Linux offers native support for most machine‑learning ecosystems and avoids compatibility issues.

Python Environment

Python tutorial: https://docs.python.org/zh-cn/3.8/tutorial/index.html

Installation packages:

MacOS: python‑3.7.7‑macosx10.9.pkg

Windows: python‑3.7.7‑embed‑amd64.zip

Modules installation guide: https://docs.python.org/zh-cn/3.8/installing/index.html

Node Environment

Node tutorial: https://nodejs.org/zh-cn/

Installation packages:

MacOS: node‑v12.16.2.pkg

Windows 64‑bit: node‑v12.16.2‑x64.msi

Windows 32‑bit: node‑v12.16.2‑x86.msi

Linux 64‑bit: node‑v12.16.2‑linux‑x64.tar.xz

Install Pipcook CLI:

```shell
$ npm install -g @pipcook/pipcook-cli
```

Ensure Python 3.6 or later and Node.js 12 or later are installed; the command above then sets up the full Pipcook development environment.
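As a quick sanity check before initializing a project, the Node.js requirement can be verified programmatically. This is a minimal sketch; the helper names are illustrative, not part of Pipcook:

```javascript
// Sketch: check that the running Node.js version satisfies Pipcook's
// minimum major-version requirement (>= 12). Helper names are illustrative.
const majorOf = (version) => parseInt(version.replace(/^v/, "").split(".")[0], 10);

function meetsRequirement(version, minMajor) {
  return majorOf(version) >= minMajor;
}

console.log(meetsRequirement(process.version, 12)); // true on Node 12 or later
console.log(meetsRequirement("v10.24.1", 12));      // false
```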

Quick Experiment

Launch Visual Experiment Environment: Pipboard

Run the following commands to create a project, initialize Pipcook, and start the visual board:

```shell
$ mkdir pipcook-example && cd pipcook-example
$ pipcook init
$ pipcook board
```

Sample output shows the Egg.js server starting on http://127.0.0.1:7001.

Handwritten Digit Recognition

In the browser, select the "MNIST Handwritten Digit Recognition" experiment and click "Try Here". Draw a digit, click the "Predict" button, and the model will output the predicted number (e.g., "7").

Image Classification

Select the "Image Classification for Front‑end Assets" experiment, upload an image (e.g., a brand logo), and after processing, the result panel shows a JSON with the predicted class "brandLogo".

Practice Method

Problem Definition

The goal is to recognize UI widgets (buttons, inputs, etc.) in design mockups using computer vision, a simplified version of the broader imgcook.com objective of reconstructing front-end code from designs.

Problem Analysis

Widget detection can be treated as an object-detection task. Mask R-CNN is suggested: it generates candidate regions and performs instance segmentation, followed by classification of each region.
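To make the task concrete, a detector of this kind typically emits one record per widget: a class label, a confidence score, and a bounding box. The shape below is a hypothetical illustration, not Pipcook's actual output format:

```javascript
// Hypothetical widget-detection output: class label, confidence, bounding box.
const detections = [
  { label: "button", score: 0.97, box: { x: 24, y: 310, width: 327, height: 44 } },
  { label: "input", score: 0.91, box: { x: 24, y: 120, width: 327, height: 40 } },
  { label: "button", score: 0.42, box: { x: 0, y: 0, width: 10, height: 10 } }
];

// A downstream code generator would keep only confident detections and map
// each one to a front-end component.
const widgets = detections.filter((d) => d.score >= 0.5).map((d) => d.label);
console.log(widgets); // ["button", "input"]
```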

Data Organization

High‑quality labeled data is essential. The MNIST dataset format is used as a reference. Example JSON configuration for Pipcook data collection:

```json
{
  "plugins": {
    "dataCollect": {
      "package": "@pipcook/plugins-mnist-data-collect",
      "params": {
        "trainCount": 8000,
        "testCount": 2000
      }
    }
  }
}
```

Relevant npm packages for the data-collect plugin include @tensorflow/tfjs-node-gpu, jimp, and mnist. The mnist npm package provides 10,000 digit samples.
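The trainCount and testCount parameters above simply partition the available samples into training and test sets. A minimal stdlib sketch of that split (the helper name is mine, not Pipcook's):

```javascript
// Hypothetical helper mirroring the trainCount/testCount split used by the
// mnist-data-collect plugin: partition samples into train and test sets.
function splitDataset(samples, trainCount, testCount) {
  if (trainCount + testCount > samples.length) {
    throw new Error("not enough samples for the requested split");
  }
  return {
    train: samples.slice(0, trainCount),
    test: samples.slice(trainCount, trainCount + testCount),
  };
}

// The mnist npm package ships 10,000 samples; an 8000/2000 split uses them all.
const samples = Array.from({ length: 10000 }, (_, i) => i);
const { train, test } = splitDataset(samples, 8000, 2000);
console.log(train.length, test.length); // 8000 2000
```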

Sample Creation

Manual labeling is labor‑intensive; instead, use Puppeteer to render HTML buttons and capture screenshots automatically.

```javascript
const puppeteer = require("puppeteer");
const fs = require("fs");

// Simple promise-based delay helper.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

const urls = [
  "file:///.../page1.html",
  "file:///.../page2.html",
  "file:///.../page3.html",
  "file:///.../page4.html",
  "file:///.../page5.html"
];

(async () => {
  const browser = await puppeteer.launch({ headless: true, args: ["--no-sandbox", "--disable-gpu"] });
  const page = await browser.newPage();
  await page.setViewport({ width: 375, height: 812, isMobile: true });
  let counter = 0;
  for (const url of urls) {
    await page.goto(url, { timeout: 0, waitUntil: "networkidle0" });
    await delay(100); // let the page settle before screenshotting
    const btnElements = await page.$$("button");
    for (const btn of btnElements) {
      const btnData = await btn.screenshot({ encoding: "binary", type: "jpeg", quality: 90 });
      fs.writeFileSync("data/btn" + counter + ".jpg", btnData);
      counter++;
    }
  }
  await page.close();
  await browser.close();
})();
```

After capturing button images, use the gm (GraphicsMagick) library to resize them to 28 × 28 pixels, convert them to grayscale, and optionally overlay random characters for data augmentation.

```javascript
const gm = require("gm");
const fs = require("fs");
const path = require("path");

const basePath = "./data/";
const chars = ["0","1","2","3","4","5","6","7","8","9","A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"];

const randomRange = (min, max) => Math.random() * (max - min) + min;
// Build a string of n characters drawn uniformly from the chars array.
const randomChars = (n) => {
  let s = "";
  for (let i = 0; i < n; i++) s += chars[Math.floor(Math.random() * chars.length)];
  return s;
};

const files = fs.readdirSync(basePath);
for (const file of files) {
  const filePath = path.join(basePath, file);
  gm(filePath)
    .quality(100)
    .gravity("Center")
    .drawText(randomRange(-5, 5), 0, randomChars(5)) // overlay random characters
    .channel("Gray")                                 // convert to grayscale
    .resize(28)
    .extent(28, 28)                                  // pad/crop to 28x28
    .write(filePath, (err) => {
      if (err) console.error(err);
      else console.log("At " + filePath + " done!");
    });
}
```

Further augmentation can be done by generating multiple variants per image:

```javascript
for (const file of files) {
  for (let i = 0; i < 3; i++) {
    const rawPath = path.join(basePath, file);
    const newPath = path.join(basePath, i + file); // one variant per iteration
    gm(rawPath)
      .quality(100)
      .gravity("Center")
      .drawText(randomRange(-5, 5), 0, randomChars(5))
      .channel("Gray")
      .resize(28)
      .extent(28, 28)
      .write(newPath, (err) => {
        if (err) console.error(err);
        else console.log("At " + newPath + " done!");
      });
  }
}
```

Feature Engineering

Keypoint detectors, SIFT, and other descriptors can be applied to capture invariant visual features and improve model robustness.
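For the 28 × 28 grayscale samples produced above, even a simple hand-built preprocessing step helps: scaling raw pixel intensities from 0–255 into [0, 1] before training. A minimal sketch (the helper name is mine):

```javascript
// Sketch: normalise raw grayscale pixel intensities (0-255) to [0, 1],
// a common preprocessing step before feeding images to a model.
function normalizePixels(pixels) {
  return pixels.map((p) => p / 255);
}

const sample = [0, 51, 255];          // three example intensities
console.log(normalizePixels(sample)); // [0, 0.2, 1]
```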

Model Training

Pipcook pipeline configuration (JSON) for MNIST image classification:

```json
{
  "plugins": {
    "dataCollect": {
      "package": "@pipcook/plugins-mnist-data-collect",
      "params": { "trainCount": 8000, "testCount": 2000 }
    },
    "dataAccess": { "package": "@pipcook/plugins-pascalvoc-data-access" },
    "dataProcess": {
      "package": "@pipcook/plugins-image-data-process",
      "params": { "resize": [28, 28] }
    },
    "modelDefine": { "package": "@pipcook/plugins-tfjs-simplecnn-model-define" },
    "modelTrain": {
      "package": "@pipcook/plugins-image-classification-tfjs-model-train",
      "params": { "epochs": 15 }
    },
    "modelEvaluate": { "package": "@pipcook/plugins-image-classification-tfjs-model-evaluate" }
  }
}
```

Run training:

```shell
$ pipcook run examples/pipelines/mnist-image-classification.json
```

Start the visual board to test predictions:

```shell
$ pipcook board
```

Principle Analysis

The pipeline consists of seven plugin types that together form the machine-learning workflow: data collection, data access, data processing, model definition, model load, model training, and model evaluation. Understanding each plugin's role allows developers to replace or extend functionality for specific front-end problems. Pipcook supports the PASCAL VOC format for vision tasks and CSV for NLP. Data quality, proper labeling, and augmentation directly affect model performance. For deployment, simple inference can run in CPU containers, while heavy models call for GPU or heterogeneous containers (e.g., NVIDIA CUDA).
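To illustrate the deployment choice, a minimal container for CPU-only inference can start from a plain Node base image, while a GPU deployment would start from an NVIDIA CUDA base image. This is a hypothetical sketch, not part of Pipcook: the image tags and the serve.js entry point are assumptions.

```dockerfile
# Hypothetical CPU-only inference container: a plain Node base image suffices.
FROM node:12
WORKDIR /app
COPY . .
RUN npm install --production
# serve.js is an assumed entry point that loads the trained model and serves predictions.
CMD ["node", "serve.js"]

# For heavy models, a GPU build would instead start from a CUDA base, e.g.:
# FROM nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04
```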

Conclusion

This article presented a step‑by‑step front‑end intelligent development workflow using Pipcook, covering environment setup, quick experiments, data preparation, sample generation, augmentation, feature engineering, model training, and deployment. Future posts will explore NLP pipelines in the same structured manner.


Tags: frontend, image classification, machine learning, data augmentation, Python, Node.js, Pipcook
Written by Taobao Frontend Technology