How to Build Front‑End AI Experiments with Pipcook: From Setup to Real‑World Image Classification
This guide walks front‑end developers through the full Pipcook workflow: preparing hardware and an operating system, installing the Python and Node environments, launching Pipcook's visual board, running handwritten‑digit and image‑classification experiments, creating and augmenting training samples, configuring pipelines, training models, and understanding deployment.
Environment Preparation
For beginners, a laptop is recommended for its portability; modern thin laptops such as the Xiaomi Pro with an Nvidia MX150 GPU can handle typical machine‑learning experiments. MacBook Pro users can lean on their AMD GPUs via PlaidML, though PlaidML offers limited acceleration for RNNs. For more demanding tasks, a desktop with at least a 6‑core AMD CPU, 32 GB RAM, a capable GPU with large VRAM, and a fast SSD (512 GB) is advisable.
Desktop and Laptop
Desktops provide better cooling for long‑running complex models that may train for days.
GPU Selection
AMD GPUs work well with ROCm; otherwise, Nvidia GPUs are a safe choice. Larger VRAM and bandwidth improve training of large models.
Storage
Use a high‑speed SSD for the system and a secondary HDD for data and model parameters. Choose a power supply that can handle peak consumption and potential multi‑GPU setups.
Operating System
Windows works fine with Anaconda; however, Ubuntu Linux offers native support for most machine‑learning ecosystems and avoids compatibility issues.
Python Environment
Python tutorial: https://docs.python.org/zh-cn/3.8/tutorial/index.html
Installation packages:
MacOS: python‑3.7.7‑macosx10.9.pkg
Windows: python‑3.7.7‑embed‑amd64.zip
Modules installation guide: https://docs.python.org/zh-cn/3.8/installing/index.html
Node Environment
Node tutorial: https://nodejs.org/zh-cn/
Installation packages:
MacOS: node‑v12.16.2.pkg
Windows 64‑bit: node‑v12.16.2‑x64.msi
Windows 32‑bit: node‑v12.16.2‑x86.msi
Linux 64‑bit: node‑v12.16.2‑linux‑x64.tar.xz
Install Pipcook CLI:

```shell
$ npm install -g @pipcook/pipcook-cli
```

Ensure Python ≥ 3.6 and Node.js ≥ 12 are installed, then run the command above to set up the full Pipcook development environment.
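The version requirements can be checked from a shell before installing; a quick sketch, assuming `python3` and `node` are the binaries on your PATH:

```shell
# Fail fast if the local toolchain is older than Pipcook expects.
python3 -c 'import sys; assert sys.version_info >= (3, 6), sys.version'
node -e 'if (parseInt(process.versions.node.split(".")[0], 10) < 12) { throw new Error("Node " + process.version + " is too old"); }'
echo "environment OK"
```

If either command exits with an error, upgrade that runtime before proceeding.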
Quick Experiment
Launch Visual Experiment Environment: Pipboard
Run the following commands to create a project, initialize Pipcook, and start the visual board:
```shell
$ mkdir pipcook-example && cd pipcook-example
$ pipcook init
$ pipcook board
```

Sample output shows the Egg.js server starting on http://127.0.0.1:7001.
Handwritten Digit Recognition
In the browser, select the "MNIST Handwritten Digit Recognition" experiment and click "Try Here". Draw a digit, click the "Predict" button, and the model will output the predicted number (e.g., "7").
Image Classification
Select the "Image Classification for Front‑end Assets" experiment, upload an image (e.g., a brand logo), and after processing, the result panel shows a JSON with the predicted class "brandLogo".
Practice Method
Problem Definition
The goal is to recognize UI widgets (buttons, inputs, etc.) in design mockups using computer vision; this is a simplified version of the broader imgcook.com objective of reconstructing front‑end code from designs.
Problem Analysis
Widget detection can be treated as an object‑detection task. Mask R‑CNN is suggested: it generates candidate regions, classifies each one, and additionally predicts an instance‑segmentation mask.
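Candidate regions from a detector are typically matched against ground‑truth boxes by intersection‑over‑union (IoU). A minimal sketch of that metric, assuming boxes are encoded as `[x, y, width, height]` (a convention chosen here for illustration):

```javascript
// Intersection-over-union of two axis-aligned boxes, each [x, y, width, height].
function iou([ax, ay, aw, ah], [bx, by, bw, bh]) {
  // Overlap along each axis, clamped at zero for disjoint boxes.
  const ix = Math.max(0, Math.min(ax + aw, bx + bw) - Math.max(ax, bx));
  const iy = Math.max(0, Math.min(ay + ah, by + bh) - Math.max(ay, by));
  const inter = ix * iy;
  const union = aw * ah + bw * bh - inter;
  return union === 0 ? 0 : inter / union;
}

console.log(iou([0, 0, 10, 10], [5, 0, 10, 10])); // half-overlapping boxes, ≈ 0.333
```

A common rule of thumb is to count a proposal as a correct detection when its IoU with a ground‑truth box exceeds 0.5.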
Data Organization
High‑quality labeled data is essential. The MNIST dataset format is used as a reference. Example JSON configuration for Pipcook data collection:
```json
{
  "plugins": {
    "dataCollect": {
      "package": "@pipcook/plugins-mnist-data-collect",
      "params": {
        "trainCount": 8000,
        "testCount": 2000
      }
    }
  }
}
```

Relevant npm packages for the data‑collect plugin include @tensorflow/tfjs-node-gpu, jimp, and mnist. The mnist npm package provides 10,000 digit samples.
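Each MNIST sample is a flat vector of 784 grayscale values that represents a 28 × 28 image (the standard MNIST layout). A small helper to view such a vector as rows, using a synthetic all‑zero sample in place of real package output:

```javascript
// Reshape a flat MNIST pixel vector (length 784) into 28 rows of 28 values.
function toRows(flat, width = 28) {
  if (flat.length % width !== 0) throw new Error("length must be a multiple of width");
  const rows = [];
  for (let i = 0; i < flat.length; i += width) {
    rows.push(flat.slice(i, i + width));
  }
  return rows;
}

// A synthetic all-zero "image" stands in for a real sample here.
const sample = new Array(784).fill(0);
const rows = toRows(sample);
console.log(rows.length, rows[0].length); // 28 28
```

Viewing the data in its 2‑D shape makes it easier to sanity‑check samples before feeding them to a pipeline.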
Sample Creation
Manual labeling is labor‑intensive; instead, use Puppeteer to render HTML buttons and capture screenshots automatically.
```javascript
const puppeteer = require("puppeteer");
const fs = require("fs");

// Simple promise-based delay (replaces the original Q deferred helper).
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const urls = [
  "file:///.../page1.html",
  "file:///.../page2.html",
  "file:///.../page3.html",
  "file:///.../page4.html",
  "file:///.../page5.html"
];

(async () => {
  const browser = await puppeteer.launch({ headless: true, args: ["--no-sandbox", "--disable-gpu"] });
  const page = await browser.newPage();
  await page.setViewport({ width: 375, height: 812, isMobile: true });
  let counter = 0;
  for (const url of urls) {
    await page.goto(url, { timeout: 0, waitUntil: "networkidle0" });
    await delay(100); // let the page settle after network idle
    const btnElements = await page.$$("button");
    for (const btn of btnElements) {
      // Screenshot each <button> element individually as a JPEG sample.
      const btnData = await btn.screenshot({ type: "jpeg", quality: 90 });
      fs.writeFileSync("data/btn" + counter + ".jpg", btnData);
      counter++;
    }
  }
  await page.close();
  await browser.close();
})();
```

After capturing button images, use the gm (GraphicsMagick) library to resize them to 28 × 28 pixels, convert to grayscale, and optionally add random characters for data augmentation.
```javascript
const gm = require("gm");
const fs = require("fs");
const path = require("path");

const basePath = "./data/";
const chars = ["0","1","2","3","4","5","6","7","8","9","A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"];

const randomRange = (min, max) => Math.random() * (max - min) + min;
// Pick n random characters; Math.floor over the array length covers every
// index 0..35 (Math.ceil would never pick index 0).
const randomChars = (n) => {
  let s = "";
  for (let i = 0; i < n; i++) s += chars[Math.floor(Math.random() * chars.length)];
  return s;
};

const files = fs.readdirSync(basePath);
for (const file of files) {
  const filePath = path.join(basePath, file);
  gm(filePath)
    .quality(100)
    .gravity("Center")
    .drawText(randomRange(-5, 5), 0, randomChars(5)) // overlay random text as augmentation
    .channel("Gray")                                 // convert to grayscale
    .resize(28)                                      // scale down toward 28px
    .extent(28, 28)                                  // pad/crop to exactly 28 × 28
    .write(filePath, (err) => {
      if (err) console.error(err);
      else console.log("At " + filePath + " done!");
    });
}
```

Further augmentation can be done by generating multiple variants per image:
```javascript
for (const file of files) {
  for (let i = 0; i < 3; i++) {
    const rawPath = path.join(basePath, file);
    const newPath = path.join(basePath, i + file); // prefix the variant index to the file name
    gm(rawPath)
      .quality(100)
      .gravity("Center")
      .drawText(randomRange(-5, 5), 0, randomChars(5))
      .channel("Gray")
      .resize(28)
      .extent(28, 28)
      .write(newPath, (err) => {
        if (err) console.error(err);
        else console.log("At " + newPath + " done!");
      });
  }
}
```

Feature Engineering
Keypoint detectors, SIFT descriptors, and similar techniques can be applied to capture invariant visual features and improve model robustness.
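SIFT itself is involved; for intuition, here is a far simpler hand‑crafted feature, a grayscale intensity histogram, which is likewise insensitive to where pixels sit in the image (an illustrative stand‑in, not part of the Pipcook pipeline):

```javascript
// Bucket grayscale pixel values (0-255) into a fixed-size histogram feature.
function grayHistogram(pixels, bins = 8) {
  const hist = new Array(bins).fill(0);
  const binWidth = 256 / bins;
  for (const p of pixels) {
    hist[Math.min(bins - 1, Math.floor(p / binWidth))]++;
  }
  // Normalize so the feature does not depend on image size.
  return hist.map((c) => c / pixels.length);
}

const feature = grayHistogram([0, 0, 128, 255]);
console.log(feature); // [0.5, 0, 0, 0, 0.25, 0, 0, 0.25]
```

Two images of different sizes but similar tone produce similar feature vectors, which is the kind of invariance hand‑crafted descriptors aim for.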
Model Training
Pipcook pipeline configuration (JSON) for MNIST image classification:
```json
{
  "plugins": {
    "dataCollect": { "package": "@pipcook/plugins-mnist-data-collect", "params": { "trainCount": 8000, "testCount": 2000 } },
    "dataAccess": { "package": "@pipcook/plugins-pascalvoc-data-access" },
    "dataProcess": { "package": "@pipcook/plugins-image-data-process", "params": { "resize": [28, 28] } },
    "modelDefine": { "package": "@pipcook/plugins-tfjs-simplecnn-model-define" },
    "modelTrain": { "package": "@pipcook/plugins-image-classification-tfjs-model-train", "params": { "epochs": 15 } },
    "modelEvaluate": { "package": "@pipcook/plugins-image-classification-tfjs-model-evaluate" }
  }
}
```

Run training:

```shell
$ pipcook run examples/pipelines/mnist-image-classification.json
```

Start the visual board to test predictions:

```shell
$ pipcook board
```

Principle Analysis
The pipeline consists of seven plugin types that together form the machine‑learning workflow. Understanding each plugin's role (data collection, data access, data processing, model definition, training, evaluation, and deployment) allows developers to replace or extend functionality for specific front‑end problems. Pipcook uses the PASCAL VOC format for vision tasks and CSV for NLP. Data quality, proper labeling, and augmentation directly affect model performance. For deployment, simple inference can run in CPU containers, while heavy models call for GPU or heterogeneous containers (e.g., NVIDIA CUDA).
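For the CPU‑container case, a deployment image can be as small as a Node base plus the trained pipeline output. A hypothetical Dockerfile sketch; the `output/` directory and `serve.js` script are placeholders for your own artifacts, not Pipcook conventions:

```dockerfile
# Hypothetical CPU inference image; adjust paths to your pipeline output.
FROM node:12-slim
WORKDIR /app
# Copy the exported model artifacts and a small serving script.
COPY output/ ./output/
COPY serve.js package.json ./
RUN npm install --production
EXPOSE 3000
CMD ["node", "serve.js"]
```

A GPU variant would instead start from an NVIDIA CUDA base image and install the GPU build of the runtime.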
Conclusion
This article presented a step‑by‑step front‑end intelligent development workflow using Pipcook, covering environment setup, quick experiments, data preparation, sample generation, augmentation, feature engineering, model training, and deployment. Future posts will explore NLP pipelines in the same structured manner.