How Alibaba’s ‘Guess‑Draw Treasure’ Game Powers Real‑Time Sketch AI

During the 2023 Lunar New Year, Taobao Live launched ‘Guess‑Draw Treasure’, a real‑time interactive game in which users sketch on their mobile devices and AI recognizes the drawings instantly, with cash rewards for correct ones. This article covers the underlying AI techniques, the challenges, the model choices, the datasets, and future plans.

Alibaba Cloud Developer

1. Project Background

Taobao users can join “Guess‑Draw Treasure” in two ways. In the daily‑sharing mode, users draw objects prompted on a board; the AI identifies the category and suggests sharing the recognized content. In the live‑broadcast mode, a host reads twelve sketch prompts; users draw individually or in teams, and AI verifies each drawing, awarding cash red packets for correct answers.

2. Challenges

Compared with Google’s “Quick, Draw!” mini‑program, our AI sketch recognition faces additional challenges:

A server‑side algorithm similar to Google’s adds latency to every drawing: the image must be uploaded, run through the model, and the result returned. With more than 100,000 concurrent users, that delay would make the mobile interaction feel sluggish.

Using a lightweight on‑device model ensures speed but may sacrifice recognition accuracy.

We therefore chose a mobile‑centric recognition solution based on Alibaba’s self‑developed AliNN inference framework, and devised techniques to mitigate the accuracy loss of small models.

3. Recognition Scheme

The goal is to classify any sketch drawing. Sketches lack texture and often contain many visually similar shapes, making classification harder than ordinary image classification.

Traditional approaches extract hand‑crafted features (e.g., SIFT, HOG) and train an SVM classifier, but their accuracy is limited compared with modern deep neural networks.

Recent deep‑learning methods use Convolutional Neural Networks (CNN) or Recurrent Neural Networks (RNN). The CNN approach treats sketches as normalized black‑and‑white images and selects a suitable CNN architecture (e.g., NASNet for high accuracy, MobileNet for mobile performance). The RNN approach incorporates temporal stroke information (x, y coordinates and timestamp) and processes the sequence with 1‑D convolutions, LSTM layers, and a final softmax classification.
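To make the RNN input concrete, here is a minimal sketch of how raw strokes might be flattened into a sequence before the 1‑D convolution and LSTM layers. It is an illustration under stated assumptions, not the authors' preprocessing: the function name `strokes_to_sequence`, the bounding‑box normalization, and the delta encoding with a pen‑lifted flag are all choices made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, t): position and timestamp
Stroke = List[Point]
Step = Tuple[float, float, float, int]  # (dx, dy, dt, pen_lifted)


def strokes_to_sequence(strokes: List[Stroke]) -> List[Step]:
    """Flatten strokes into (dx, dy, dt, pen_lifted) steps (hypothetical helper).

    Coordinates are scaled by the sketch's bounding box, then converted to
    deltas between consecutive points. pen_lifted is 1 on the last point of
    each stroke and 0 otherwise.
    """
    points = [p for stroke in strokes for p in stroke]
    if not points:
        return []
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    # One scale for both axes keeps the aspect ratio; "or 1.0" guards a single dot.
    scale = max(max(xs) - min_x, max(ys) - min_y) or 1.0

    sequence: List[Step] = []
    prev_x, prev_y, prev_t = 0.0, 0.0, points[0][2]
    for stroke in strokes:
        for i, (x, y, t) in enumerate(stroke):
            nx, ny = (x - min_x) / scale, (y - min_y) / scale
            pen_lifted = 1 if i == len(stroke) - 1 else 0
            sequence.append((nx - prev_x, ny - prev_y, t - prev_t, pen_lifted))
            prev_x, prev_y, prev_t = nx, ny, t
    return sequence


if __name__ == "__main__":
    demo = [[(0, 0, 0), (10, 0, 100)], [(10, 10, 300)]]
    for step in strokes_to_sequence(demo):
        print(step)
```

Each step is a fixed‑width vector, so a variable‑length sketch becomes a sequence that a recurrent model can consume point by point.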

Our solution combines the strengths of both: we embed stroke order and timing into RGB images (e.g., the first stroke colored red, later strokes colored with gradients). This enriches the CNN input with temporal cues while keeping inference fast on mobile devices.
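The color‑coding idea can be sketched in a few lines of plain Python. This is an illustration only, not the production renderer: the red‑to‑blue linear ramp, the white background, the one‑pixel Bresenham lines, and the names `stroke_color` and `render_strokes` are assumptions made for this example. The article states only that the first stroke is red and later strokes follow a gradient.

```python
from typing import List, Tuple

Stroke = List[Tuple[int, int]]  # pixel coordinates in drawing order
Color = Tuple[int, int, int]
Image = List[List[Color]]       # row-major grid: image[y][x]


def stroke_color(index: int, total: int) -> Color:
    """Red for the first stroke, fading linearly to blue for the last (assumed ramp)."""
    frac = index / (total - 1) if total > 1 else 0.0
    return (round(255 * (1 - frac)), 0, round(255 * frac))


def draw_line(img: Image, x0: int, y0: int, x1: int, y1: int, color: Color) -> None:
    """Rasterize a line with Bresenham's algorithm, skipping off-canvas pixels."""
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        if 0 <= y0 < len(img) and 0 <= x0 < len(img[0]):
            img[y0][x0] = color
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy


def render_strokes(strokes: List[Stroke], size: int = 28) -> Image:
    """Draw each stroke in a color that encodes its position in the drawing order."""
    img: Image = [[(255, 255, 255)] * size for _ in range(size)]
    for i, stroke in enumerate(strokes):
        color = stroke_color(i, len(strokes))
        if len(stroke) == 1:  # a single tap still leaves a dot
            x, y = stroke[0]
            draw_line(img, x, y, x, y, color)
        for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
            draw_line(img, x0, y0, x1, y1, color)
    return img


if __name__ == "__main__":
    image = render_strokes([[(0, 0), (3, 0)], [(0, 2), (0, 4)]], size=5)
    print(image[0][0], image[4][0])
```

Because the order information lives in the color channels, the classifier remains an ordinary image CNN; only the rasterization step changes.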

In our experiments, the RNN model was more accurate overall, while the CNN model ran faster on‑device. With temporal information merged into the RGB input, the time‑aware CNN attains >90% top‑3 accuracy on the test set, a ~5% gain over a CNN without the temporal encoding.
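For readers unfamiliar with the metric, top‑3 accuracy counts a prediction as correct when the true class is among the three highest‑scoring classes. A minimal sketch follows; the function name `top_k_accuracy` and the toy scores are made up for illustration and are not the article's evaluation code or data.

```python
from typing import List, Sequence


def top_k_accuracy(scores: List[Sequence[float]], labels: List[int], k: int = 3) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if not labels or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and the same length")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score, truncated to k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


if __name__ == "__main__":
    toy_scores = [
        [0.10, 0.50, 0.30, 0.05],
        [0.60, 0.20, 0.15, 0.05],
        [0.20, 0.25, 0.50, 0.05],
        [0.05, 0.10, 0.15, 0.70],
    ]
    toy_labels = [2, 3, 0, 3]
    print(top_k_accuracy(toy_scores, toy_labels, k=3))
    print(top_k_accuracy(toy_scores, toy_labels, k=1))
```

Top‑3 is a natural fit for a guessing game, where the recognizer can reasonably be credited if the intended object is among its best few guesses.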

4. Datasets

Public sketch datasets are limited. The most notable are Google’s Quick, Draw! (345 categories, ~50 million sketches) and the TU‑Berlin dataset (250 categories, 80 sketches per class). Quick, Draw! offers abstract, fast‑drawn sketches, whereas TU‑Berlin provides more realistic, time‑intensive drawings.
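As an illustration of what a Quick, Draw! sample looks like in practice, the sketch below parses one line of the dataset's NDJSON export. It assumes the publicly documented layout, in which each record carries a `word` label and a `drawing` list whose strokes are stored as parallel arrays (`[xs, ys]` in the simplified files, `[xs, ys, ts]` in the raw files). The helper name `parse_record` is hypothetical, and the layout should be checked against the dataset's own documentation before relying on it.

```python
import json
from typing import List, Optional, Tuple

Point = Tuple[float, float, Optional[float]]  # (x, y, t); t is None if absent


def parse_record(line: str) -> Tuple[str, List[List[Point]]]:
    """Turn one NDJSON line into (label, strokes), assuming the layout above."""
    record = json.loads(line)
    strokes: List[List[Point]] = []
    for stroke in record["drawing"]:
        xs, ys = stroke[0], stroke[1]
        # Simplified files carry no timestamps, so fill the slot with None.
        ts = stroke[2] if len(stroke) > 2 else [None] * len(xs)
        strokes.append(list(zip(xs, ys, ts)))
    return record["word"], strokes


if __name__ == "__main__":
    sample = '{"word": "cat", "drawing": [[[0, 10], [5, 5], [0, 40]], [[3], [4], [90]]]}'
    print(parse_record(sample))
```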

For the Taobao Live scenario we selected the Quick, Draw! dataset for training, randomly sampling a portion for testing and using the remainder to train our time‑aware CNN model.
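The random hold‑out described above might look like the following sketch. The 10% test fraction, the fixed seed, and the name `split_dataset` are assumptions for illustration; the article does not state the actual split ratio.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(
    samples: Sequence[T], test_fraction: float = 0.1, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle with a fixed seed and hold out a fraction for testing.

    Returns (train, test). The default fraction and seed are illustrative.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # local RNG keeps the split reproducible
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    train, test = split_dataset(list(range(100)), test_fraction=0.1, seed=42)
    print(len(train), len(test))
```

Seeding a local `random.Random` instance rather than the global generator keeps the split stable across runs, so test results stay comparable between model variants.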

5. Future Development

The “Guess‑Draw Treasure” activity attracted over 100 million interactions, demonstrating strong user interest in fun, AI‑driven mobile experiences. Going forward we will continue to improve the AI sketch recognizer to support more object categories and integrate it into our self‑developed PixelAI SDK, which provides on‑device capabilities such as face, gesture, segmentation, SLAM, and more.

PixelAI has already powered other interactive games like “Smile Red Packet” during Double 11 and the AR “World‑Goal” feature during the World Cup, and will serve as the foundation for many future intelligent mobile experiences.
