How Alipay’s xNN Engine Brings Deep Learning to Mobile Apps
This article explains how Alipay’s xNN deep‑learning engine tackles the challenges of deploying AI on billions of mobile devices by using aggressive model compression, a lightweight SDK, and joint algorithm‑ and instruction‑level optimizations to achieve high accuracy, tiny package size, and real‑time performance.
Deep Learning – Cloud or Mobile?
Recent breakthroughs in deep learning (DL) have driven major advances in image, speech, and language tasks, but deploying DL models on mobile devices faces challenges such as hardware diversity and strict app package-size limits.
Two Major Challenges
1. Wide device span – Alipay serves billions of users across many phone models, requiring DL solutions that run efficiently on low‑end devices.
2. Package size constraints – Adding a DL model can bloat the app; even with dynamic delivery, model size must stay minimal.
Five Technical Goals of xNN
Lightweight models via aggressive compression while preserving accuracy.
Small engine – a slim mobile SDK.
Speed – algorithm‑ and instruction‑level optimizations.
Universality – optimized for CPUs rather than GPUs, supporting CNN, DNN, RNN, LSTM, etc.
Ease of use – a toolchain that lets algorithm engineers convert cloud models to mobile without deep expertise.
Main Features
xNN provides end‑to‑end lifecycle support from model compression (xqueeze) to deployment and runtime monitoring. The backend toolchain compresses models using neuron pruning, synapse pruning, quantization, network transform, and adaptive Huffman coding, achieving up to 50× size reduction with negligible accuracy loss.
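The article does not publish xqueeze's internals, but two of the named techniques, synapse pruning and quantization, can be sketched in a minimal NumPy form. This is an illustration of the general methods, not xNN's actual implementation; the function names and the 8-bit linear scheme are assumptions.

```python
import numpy as np

def prune_synapses(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Synapse pruning: zero out the smallest-magnitude weights,
    keeping the fraction (1 - sparsity) with the largest magnitude."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def quantize_8bit(weights: np.ndarray):
    """Linear 8-bit quantization: store int8 codes plus one float scale,
    shrinking storage roughly 4x versus float32 before entropy coding."""
    scale = np.abs(weights).max() / 127.0
    codes = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return codes, scale
```

Pruning creates sparsity that an entropy coder (such as the adaptive Huffman stage mentioned above) can exploit, which is how the individual steps compound toward large overall size reductions.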
The compressed model can be packaged in the app or delivered on‑demand.
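On-demand delivery typically amounts to a cached, integrity-checked download. The sketch below is a generic illustration of that pattern, not Alipay's delivery mechanism; the URL, function name, and checksum scheme are assumptions.

```python
import hashlib
import os
import urllib.request

def fetch_model(url: str, expected_sha256: str, cache_path: str) -> str:
    """Download a compressed model on demand, verifying its integrity
    before use; skip the download if a valid copy is already cached."""
    if not os.path.exists(cache_path):
        urllib.request.urlretrieve(url, cache_path)
    with open(cache_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    if digest != expected_sha256:
        os.remove(cache_path)  # never load a corrupted model
        raise ValueError("model checksum mismatch")
    return cache_path
```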
Performance optimizations combine algorithmic sparsity with custom instruction kernels, auto‑combining layers, and fine‑grained thread scheduling, yielding high cache hit rates and low memory traffic.
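One widely used form of layer combining is folding a batch-normalization layer into the preceding convolution, so inference executes a single fused layer instead of two. A minimal sketch of that fold (the standard technique, not xNN's specific kernels):

```python
import numpy as np

def fold_batchnorm(conv_w, conv_b, gamma, beta, mean, var, eps=1e-5):
    """Fuse BatchNorm into the preceding conv's weights and bias.
    conv_w has shape (out_channels, ...); gamma/beta/mean/var are
    the BN parameters, one value per output channel."""
    scale = gamma / np.sqrt(var + eps)  # per-output-channel factor
    fused_w = conv_w * scale.reshape(-1, *([1] * (conv_w.ndim - 1)))
    fused_b = (conv_b - mean) * scale + beta
    return fused_w, fused_b
```

Because scale * (Wx + b - mean) + beta equals the conv-then-BN result exactly, the fusion changes no outputs while halving per-layer overhead (one memory pass instead of two), which is the kind of saving that raises cache hit rates.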
On a Qualcomm 820 CPU, xNN runs a SqueezeNet‑based classifier at 29.4 FPS; on an iPhone 7 A10 CPU it reaches 52.6 FPS, surpassing Core ML.
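Frame rate converts directly to per-frame latency, which is often the more intuitive budget for interactive features:

```python
def fps_to_latency_ms(fps: float) -> float:
    """Per-frame inference latency in milliseconds at a given throughput."""
    return 1000.0 / fps

# The reported throughputs imply roughly:
#   29.4 FPS -> ~34 ms/frame (Qualcomm 820 CPU)
#   52.6 FPS -> ~19 ms/frame (iPhone 7 A10 CPU)
```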
Business Deployment
Alipay has integrated xNN in features such as “AR Scan”, where over 90% of Android and iOS devices perform on‑device object classification, with model sizes under 100 KB and SDK overhead around 200 KB.
Since launch, xNN has spurred numerous mobile DL projects within Ant Group.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.