How AutoML Transformed AR Scanning: Faster, Smaller, More Accurate Models
In 2020, the AR “scan‑for‑fortune” feature completed a full AutoML rollout on the xNN‑Cloud platform, automating network architecture design and the rest of the model development pipeline. Android inference time fell by more than 50% and iOS inference time by more than 30%, the model shrank, and accuracy rose by 1.6%, all while serving billions of on‑device inferences.
Introduction
In 2020 the Chinese New Year "scan‑for‑fortune" activity completed a full AutoML deployment on the xNN‑Cloud platform. By automating network architecture design and the whole model‑development workflow, the solution dramatically reduced development effort while improving model accuracy by 1.6% and cutting inference time by more than 50% on Android and more than 30% on iOS.
Past Breakthroughs
Three years of visual algorithm work yielded two major milestones. The first came in 2018 when the xNN deep‑learning engine was introduced, enabling on‑device inference and relieving cloud‑service pressure while raising accuracy. The second breakthrough arrived in 2019 with the xNN‑x86 upgrade, achieving true end‑to‑end consistency between cloud and edge models.
This Year's Breakthroughs
To further improve user experience and development efficiency, AutoML was introduced. It automates both neural‑architecture search and the overall model‑R&D pipeline, addressing two key needs:
Achieving significant network‑performance gains that would otherwise require substantial manual tuning effort.
Responding rapidly to key‑account (KA) merchant demands and emerging public‑opinion issues.
Automated Network Architecture Design
The AutoML capability provided by xNN‑Cloud is plug‑and‑play: it requires no extensive code changes and evaluates candidate architectures against device‑side metrics such as FLOPs, model size, and latency. The search itself needs little supervision. Users only have to check that it is progressing as expected and compare the metrics of the resulting architectures.
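As a rough illustration of evaluating candidates against device-side metrics, the sketch below keeps only the candidates that fit a device budget and picks the most accurate of them. The class names, fields, and numbers are hypothetical and are not part of the xNN-Cloud API, which the article does not show.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CandidateMetrics:
    """Measured metrics for one candidate architecture (hypothetical fields)."""
    name: str
    flops_m: float      # millions of floating-point operations per inference
    size_kb: float      # serialized model size
    latency_ms: float   # measured or estimated on-device latency
    accuracy: float     # validation accuracy in [0, 1]


@dataclass(frozen=True)
class DeviceBudget:
    """Upper bounds a candidate must respect to be deployable (assumed)."""
    max_flops_m: float
    max_size_kb: float
    max_latency_ms: float


def within_budget(c: CandidateMetrics, b: DeviceBudget) -> bool:
    return (c.flops_m <= b.max_flops_m
            and c.size_kb <= b.max_size_kb
            and c.latency_ms <= b.max_latency_ms)


def best_feasible(candidates: List[CandidateMetrics],
                  budget: DeviceBudget) -> Optional[CandidateMetrics]:
    """Return the most accurate candidate that fits the budget, or None."""
    feasible = [c for c in candidates if within_budget(c, budget)]
    return max(feasible, key=lambda c: c.accuracy) if feasible else None
```

A hard budget like this is only one way to use device metrics; a soft, weighted penalty is another common option.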
Detection and Recognition Pipeline
The system adopts a two‑stage pipeline: a detection model first localizes candidate regions, then a recognition model determines whether the region contains a "fortune" character. Separating detection and recognition allows quick migration of the recognition model for custom classes without retraining the detector.
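The two-stage structure can be sketched as follows. This is a minimal illustration, assuming the detector returns bounding boxes and the recognizer returns a score for a cropped region; the `TwoStagePipeline` class, its field names, and the threshold are hypothetical, not the production interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]        # (x, y, width, height)
Image = Sequence[Sequence[float]]      # 2D grayscale image as rows of pixels


def crop_image(image: Image, box: Box) -> List[List[float]]:
    """Cut the region described by `box` out of `image`."""
    x, y, w, h = box
    return [list(row[x:x + w]) for row in image[y:y + h]]


@dataclass
class TwoStagePipeline:
    detect: Callable[[Image], List[Box]]     # stage 1: propose candidate regions
    recognize: Callable[[Image], float]      # stage 2: score one cropped region
    threshold: float = 0.5                   # assumed decision threshold

    def run(self, image: Image) -> List[Tuple[Box, float]]:
        """Return the boxes whose recognition score reaches the threshold."""
        hits = []
        for box in self.detect(image):
            score = self.recognize(crop_image(image, box))
            if score >= self.threshold:
                hits.append((box, score))
        return hits
```

Because the recognizer is just a field, supporting a custom class amounts to constructing a new pipeline with the same `detect` and a different `recognize`, which mirrors the migration benefit described above.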
Search Objectives
During AutoML search, multiple objectives are considered simultaneously: accuracy, computational cost (FLOPs), and model size. The trade‑off weights w_f (for FLOPs) and w_s (for model size) balance these costs against accuracy to select the most suitable architecture for the AR scanning task.
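The article does not give the exact form of the objective. One plausible reading, shown purely as an assumption, is a linear penalty in which w_f and w_s weight FLOPs and model size relative to reference values; the reference values and the function name are hypothetical.

```python
def search_score(accuracy: float, flops_m: float, size_kb: float, *,
                 w_f: float, w_s: float,
                 ref_flops_m: float, ref_size_kb: float) -> float:
    """Higher is better. Assumed form, not taken from the source:

        score = accuracy - w_f * (FLOPs / ref_FLOPs) - w_s * (size / ref_size)

    Dividing by reference values makes the two penalties unitless so that
    w_f and w_s are comparable.
    """
    flops_penalty = w_f * (flops_m / ref_flops_m)
    size_penalty = w_s * (size_kb / ref_size_kb)
    return accuracy - flops_penalty - size_penalty
```

Under these weights, a slightly less accurate but much cheaper candidate can outrank a larger one, which is the behavior a device-constrained search wants. Multiplicative forms (accuracy scaled by a power of the cost ratio) are also common in the NAS literature.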
Search Strategy
Given limited resources, the Hyperband algorithm is employed to allocate training steps efficiently. Early‑stage trials receive few steps, and only promising architectures proceed to longer training. The search is integrated with the ALPS‑AutoML Python SDK, supporting grid, random, Bayesian, and RACOS hyper‑parameter searches, as well as early‑stopping mechanisms such as Hyperband and MedianStop.
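The core idea of giving early trials few steps and promoting only the promising ones can be sketched with successive halving, the subroutine that Hyperband runs across several brackets. This is a generic illustration, not the ALPS-AutoML SDK; the function names, `min_steps`, `eta`, and the toy evaluator are all assumptions.

```python
import math
from typing import Any, Callable, Dict, List

Config = Dict[str, Any]


def successive_halving(configs: List[Config],
                       train_and_score: Callable[[Config, int], float],
                       min_steps: int = 100,
                       eta: int = 3) -> Config:
    """Train all configs briefly, keep the top 1/eta, multiply the step
    budget by eta, and repeat until a single config remains."""
    if not configs:
        raise ValueError("need at least one config")
    survivors = list(configs)
    steps = min_steps
    while len(survivors) > 1:
        # Score every survivor at the current (small) training budget.
        scored = [(train_and_score(cfg, steps), cfg) for cfg in survivors]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        keep = max(1, len(survivors) // eta)
        survivors = [cfg for _, cfg in scored[:keep]]
        steps *= eta  # promoted configs earn a longer training run
    return survivors[0]


def toy_evaluator(cfg: Config, steps: int) -> float:
    """Stand-in for real training: score saturates as steps grow."""
    return cfg["quality"] * (1.0 - math.exp(-steps / 200.0))
```

With nine configs and `eta=3`, the first round trains nine configs for 100 steps, the second trains three for 300 steps, and one survivor is returned. Full Hyperband repeats this with different trade-offs between the number of configs and the starting budget.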
Model Performance
On Android devices (top‑50 models of 2019) inference time dropped by more than 50% to under 100 ms, while iOS (top‑20 models) saw a >30% reduction to under 40 ms. Model size decreased by ~80 KB, and accuracy improved by 1.6% despite the lower latency and smaller footprint.
Recognition Model vs MobileNet‑V3
Compared with a MobileNet‑V3 baseline of similar FLOPs, the NAS‑derived model uses only one‑tenth of the parameters, making it far better suited to the strict resource constraints of the AR scanning scenario.
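Similar compute with far fewer parameters is possible because, for a convolution, the multiply-accumulate count scales with output resolution while the parameter count does not. The arithmetic below illustrates this with two made-up layers; it does not describe the searched architecture, which the article does not detail.

```python
def conv2d_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def conv2d_macs(k: int, c_in: int, c_out: int, out_h: int, out_w: int) -> int:
    """Multiply-accumulates: every weight is applied once per output position."""
    return conv2d_params(k, c_in, c_out) * out_h * out_w


# Two hypothetical layers with identical compute but different parameter counts.
wide_low_res = dict(k=3, c_in=64, c_out=64, out_h=8, out_w=8)
narrow_high_res = dict(k=3, c_in=16, c_out=16, out_h=32, out_w=32)

for name, layer in [("wide_low_res", wide_low_res),
                    ("narrow_high_res", narrow_high_res)]:
    params = conv2d_params(layer["k"], layer["c_in"], layer["c_out"])
    macs = conv2d_macs(**layer)
    print(f"{name}: params={params}, macs={macs}")
```

Both layers cost 2,359,296 multiply-accumulates, yet the narrow high-resolution layer has 2,304 weights against 36,864 for the wide one, a 16x difference.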
Fast Model Iteration (Model Quick‑Response)
The platform runs data preparation, training, and evaluation end to end in about 15 minutes, so custom class updates can be deployed in under an hour. This rapid turnaround was demonstrated for the special “Teacher Ma” fortune character and for merchant‑specific fortune characters.
Conclusion
Leveraging AutoML and automation on the xNN‑Cloud platform, the 2020 AR scanning models achieved superior speed, size, and accuracy while drastically reducing manual effort. Future work will tightly integrate real‑device testing into the search loop to further guide architecture decisions.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
