
SmileAR: iQIYI’s Mobile AR Solution Powered by TensorFlow Lite

SmileAR, iQIYI’s self‑developed mobile AR platform powered by TensorFlow Lite, delivers real‑time face, body and gesture recognition across iQIYI’s apps through MobileNet‑based models, quantization‑aware training, multi‑task learning and encrypted SDKs, achieving fast, lightweight, cross‑platform AR experiences for millions of users.

iQIYI Technical Product Team

SmileAR is iQIYI’s self‑developed mobile AR solution, built on TensorFlow Lite. It has been deployed in multiple iQIYI products, including the iQIYI app (over 100 million daily active users), the children’s app QibaBu, and the short‑video platform Gingerbread.

iQIYI, one of China’s largest online video companies, emphasizes technology‑driven innovation. Its AI/AR research aims to continuously improve user entertainment experiences.

SmileAR provides a suite of core algorithms such as face detection, facial key‑point detection, body key‑point detection, portrait segmentation, gesture recognition, and object recognition. These algorithms are packaged into AR applications like beauty filters, body shaping, dance‑battle, and scan‑to‑play features.

Facial Key‑Point Detection and Tracking – Accurate facial key‑points enable functions such as slimming, makeup, virtual accessories, and other beauty effects. These capabilities are already integrated into iQIYI’s short‑video shooting pipeline.

Gesture Tracking and Recognition – The gesture model combines an SSD detector with a MobileNet backbone, and quantization accelerates it enough to run real‑time gesture detection on mobile devices. The feature is used in iQIYI’s client, live‑streaming devices, and Gingerbread short videos.

Body Key‑Point Recognition – Deployed in the “Cute Baby Dance Studio” module of QibaBu, the algorithm evaluates children’s dance imitation, calculates similarity with a reference pose, and triggers visual effects when the performance meets a threshold.
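The article does not say how the similarity score is computed, so the following is only a minimal sketch of one plausible approach: normalize both sets of 2D key‑points for position and scale, then compare them with cosine similarity and a threshold. The function names, the `(x, y)` tuple layout, and the 0.9 threshold are all assumptions made for illustration; they are not taken from SmileAR.

```python
import math


def normalize_pose(points):
    """Center key-points on their centroid and scale them to unit norm.

    This removes the effect of where the person stands in the frame
    and how large they appear.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    scale = math.sqrt(sum(x * x + y * y for x, y in centered))
    if scale == 0.0:
        raise ValueError("degenerate pose: all key-points coincide")
    return [(x / scale, y / scale) for x, y in centered]


def pose_similarity(pose, reference):
    """Cosine similarity between two normalized poses, in [-1, 1]."""
    if len(pose) != len(reference):
        raise ValueError("poses must have the same number of key-points")
    a = normalize_pose(pose)
    b = normalize_pose(reference)
    return sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(a, b))


def should_trigger_effect(pose, reference, threshold=0.9):
    """Hypothetical rule: fire the visual effect above a fixed threshold."""
    return pose_similarity(pose, reference) >= threshold
```

With this normalization, a child standing farther from the camera (smaller, shifted key‑points) scores the same as one standing close, as long as the pose itself matches. A production scorer would likely also weight joints by detection confidence and compare over a time window rather than a single frame.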

Mobile‑Side Algorithm Optimization

To meet the high computational demand of deep‑learning inference on mobile devices, SmileAR adopts several acceleration strategies:

• Conventional Acceleration: Replace heavy backbones with MobileNet V2, reduce input resolution, and prune channels to boost speed.

• Quantization‑Aware Training: Adding two lines of code in TensorFlow enables quantization‑aware training, which yields significant CPU inference speedups on Snapdragon, Kirin, and Helio chips (see Figure 6 for the latency comparison).

• Key‑Point Jitter Reduction: Fuse multi‑frame information using a Gaussian Mixture Model to stabilize facial key‑points across frames (Figure 7).

• Multi‑Task Learning: Jointly train face detection, pose estimation, and expression recognition to improve accuracy (Figure 8). Similar multi‑task designs are applied to the body key‑point models, with an auxiliary heat‑map loss that helps training avoid local minima (Figure 9).

• Hard Mining: Focus training on difficult samples (large angles, closed eyes, wide‑open mouths) to reduce false detections and enhance stability (Figure 10).
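To make the quantization item above more concrete: quantization‑aware training simulates integer rounding in the forward pass, so the network learns weights that survive the conversion to 8‑bit inference. The sketch below shows that "fake quantization" arithmetic with a generic affine (scale and zero‑point) scheme in plain Python. It is not TensorFlow's implementation, and the function names are hypothetical.

```python
def choose_qparams(x_min, x_max, num_bits=8):
    """Return (scale, zero_point) mapping [x_min, x_max] onto unsigned ints."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Widen the range to include 0.0 so that zero is exactly representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    if x_max == x_min:
        return 1.0, 0
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(x, scale, zero_point, num_bits=8):
    """Map a real value to an integer code, clamped to the valid range."""
    qmax = 2 ** num_bits - 1
    q = int(round(x / scale)) + zero_point
    return max(0, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer code back to the real value it represents."""
    return scale * (q - zero_point)


def fake_quantize(x, scale, zero_point, num_bits=8):
    """Quantize then dequantize: the value the integer model would see."""
    return dequantize(quantize(x, scale, zero_point, num_bits), scale, zero_point)
```

During training, each weight or activation passes through something like `fake_quantize`, so the loss already accounts for the rounding error. Values inside the calibrated range move by at most half a quantization step, and values outside it are clamped.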

Cross‑Platform Deployment

TensorFlow Lite’s C++ API is used to compile static libraries for Android, iOS, and Windows. Platform‑specific wrappers (Java for Android, Objective‑C for iOS) expose a unified SDK, enabling native code reuse and simplifying integration for downstream product teams.

SDK Authentication and Model Protection

Model files are encrypted with iQIYI’s proprietary algorithm, and license verification is added to safeguard intellectual property.

Package Size Management

Only required TensorFlow Lite operators are linked by customizing the MutableOpResolver, reducing binary size without affecting functionality.

Conclusion and Outlook

This article presented SmileAR as a case study of TensorFlow Lite’s application at iQIYI. TFLite’s cross‑platform support, execution efficiency, and tooling (e.g., its benchmark tool) enable rapid deployment of AI‑driven AR features across multiple apps. Future work includes video‑level stabilization, further mobile acceleration, precision optimization, and GPU‑based inference on diverse platforms.
