How Alibaba’s ‘Guess‑Draw Treasure’ Game Powers Real‑Time Sketch AI
During the 2023 Lunar New Year, Taobao Live launched the real‑time interactive game ‘Guess‑Draw Treasure’. Users sketch on their mobile devices, and an AI model recognizes the drawings instantly so that players can win cash rewards. This article covers the underlying AI techniques, the challenges, the model choices, the datasets, and future plans.
1. Project Background
Taobao users can join “Guess‑Draw Treasure” in two ways. In the daily‑sharing mode, users draw objects prompted on a board; the AI identifies the category and suggests sharing the recognized content. In the live‑broadcast mode, a host reads twelve sketch prompts; users draw individually or in teams, and AI verifies each drawing, awarding cash red packets for correct answers.
2. Challenges
Compared with Google’s “Quick, Draw!” game, our AI sketch recognition faces two additional challenges:
With a server‑side algorithm similar to Google’s, every recognition requires an image upload, model inference, and a result return. At over 100,000 concurrent users, that round trip adds latency and makes interaction on mobile less smooth.
A lightweight on‑device model avoids the round trip and keeps recognition fast, but may sacrifice accuracy.
We therefore chose a mobile‑centric recognition solution based on Alibaba’s self‑developed AliNN inference framework, and devised techniques to mitigate the accuracy loss of small models.
3. Recognition Scheme
The goal is to classify any sketch drawing. Sketches lack texture and often contain many visually similar shapes, making classification harder than ordinary image classification.
Traditional approaches extract hand‑crafted features (e.g., SIFT, HOG) and train an SVM classifier, but their accuracy is limited compared with modern deep neural networks.
Recent deep‑learning methods use Convolutional Neural Networks (CNNs) or Recurrent Neural Networks (RNNs). The CNN approach treats each sketch as a normalized black‑and‑white image and picks a CNN architecture suited to the target (e.g., NASNet for high accuracy, MobileNet for mobile performance). The RNN approach uses the temporal stroke information (x, y coordinates and timestamps), processing the point sequence with 1‑D convolutions and LSTM layers followed by a softmax classifier.
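To make the sequence input of the RNN approach concrete, here is a minimal sketch of one way to turn raw strokes into per‑point features. It assumes each stroke is a list of (x, y, t) points. The function name strokes_to_sequence, the delta encoding, and the end‑of‑stroke flag are illustrative assumptions, not the preprocessing used in the game.

```python
def strokes_to_sequence(strokes):
    """Flatten strokes into per-point features (dx, dy, dt, end_of_stroke).

    `strokes` is a list of strokes; each stroke is a list of (x, y, t)
    points. This layout is an assumption made for illustration.
    """
    points = []
    for stroke in strokes:
        for i, (x, y, t) in enumerate(stroke):
            end = 1.0 if i == len(stroke) - 1 else 0.0
            points.append((float(x), float(y), float(t), end))
    if not points:
        return []

    # Normalize coordinates to the unit square (aspect ratio preserved)
    # and timestamps to [0, 1] over the whole drawing.
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    scale = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    t0 = points[0][2]
    duration = (points[-1][2] - t0) or 1.0

    sequence = []
    prev = None
    for x, y, t, end in points:
        cur = ((x - min_x) / scale, (y - min_y) / scale, (t - t0) / duration)
        if prev is None:
            prev = cur  # the first point has zero deltas
        sequence.append((cur[0] - prev[0], cur[1] - prev[1], cur[2] - prev[2], end))
        prev = cur
    return sequence


if __name__ == "__main__":
    demo = [[(0, 0, 0), (10, 0, 100)], [(10, 10, 200), (0, 10, 300)]]
    for step in strokes_to_sequence(demo):
        print(step)
```

A sequence model would consume these tuples one step at a time. The end‑of‑stroke flag lets it distinguish a pen lift from a continuous line.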
Our solution combines the strengths of both: we embed stroke order and timing into RGB images (e.g., the first stroke colored red, later strokes colored with gradients). This enriches the CNN input with temporal cues while keeping inference fast on mobile devices.
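The idea of folding stroke order into color can be sketched as follows. This is a minimal illustration, not the production encoder: the red‑to‑blue ramp, the 64‑pixel canvas, the white background, and the names stroke_color and render_strokes_rgb are all assumptions, and stroke coordinates are assumed to be already normalized to [0, 1].

```python
def stroke_color(index, total):
    """Map a stroke's position in drawing order to an RGB color.

    The first stroke is pure red; later strokes fade toward blue.
    The exact ramp is a hypothetical choice for illustration.
    """
    frac = index / (total - 1) if total > 1 else 0.0
    return (round(255 * (1 - frac)), 0, round(255 * frac))


def render_strokes_rgb(strokes, size=64):
    """Rasterize strokes onto a size x size grid of RGB tuples.

    `strokes` is a list of strokes in drawing order; each stroke is a
    list of (x, y) points with coordinates in [0, 1].
    """
    white = (255, 255, 255)
    canvas = [[white] * size for _ in range(size)]

    def to_pixel(value):
        # Clamp so slightly out-of-range input cannot index off the canvas.
        return min(size - 1, max(0, round(value * (size - 1))))

    for index, stroke in enumerate(strokes):
        color = stroke_color(index, len(strokes))
        pixels = [(to_pixel(x), to_pixel(y)) for x, y in stroke]
        if len(pixels) == 1:
            px, py = pixels[0]
            canvas[py][px] = color
        # Draw each segment by stepping along its longer axis.
        for (x0, y0), (x1, y1) in zip(pixels, pixels[1:]):
            steps = max(abs(x1 - x0), abs(y1 - y0), 1)
            for s in range(steps + 1):
                px = round(x0 + (x1 - x0) * s / steps)
                py = round(y0 + (y1 - y0) * s / steps)
                canvas[py][px] = color
    return canvas


if __name__ == "__main__":
    image = render_strokes_rgb([[(0.0, 0.0), (1.0, 0.0)], [(0.0, 1.0), (1.0, 1.0)]], size=8)
    print(image[0][0], image[7][0])
```

The resulting three‑channel grid can be fed to an ordinary image CNN, so the temporal cue costs nothing extra at inference time beyond the rendering step.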
Experiments show that the RNN model achieves higher overall accuracy, while the CNN model runs faster on‑device. By merging temporal information into RGB inputs, our hybrid model attains >90% top‑3 accuracy on the test set, a ~5% gain over a pure CNN.
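For readers unfamiliar with the metric, top‑3 accuracy counts a prediction as correct when the true category is among the three highest‑scoring classes. A minimal sketch of the computation, with made‑up scores that are not results from the article:

```python
def top_k_accuracy(scores, labels, k=3):
    """Fraction of samples whose true label is among the k best scores.

    `scores` holds one list of per-class scores per sample, and
    `labels` holds the true class index of each sample.
    """
    if not labels:
        return 0.0
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


if __name__ == "__main__":
    # Illustrative scores for three samples over four classes.
    demo_scores = [
        [0.10, 0.50, 0.25, 0.15],
        [0.70, 0.05, 0.15, 0.10],
        [0.05, 0.10, 0.15, 0.70],
    ]
    demo_labels = [2, 1, 3]
    print(top_k_accuracy(demo_scores, demo_labels, k=3))
    print(top_k_accuracy(demo_scores, demo_labels, k=1))
```

In a guessing game the metric is a natural fit, because a sketch of a cat that also resembles a dog or a fox still counts as recognized when the intended class ranks near the top.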
4. Datasets
Public sketch datasets are limited. The most notable are Google’s Quick, Draw! (345 categories, ~50 million sketches) and the TU‑Berlin dataset (250 categories, 80 sketches per class). Quick, Draw! offers abstract, fast‑drawn sketches, whereas TU‑Berlin provides more realistic, time‑intensive drawings.
For the Taobao Live scenario we selected the Quick, Draw! dataset for training, randomly sampling a portion for testing and using the remainder to train our time‑aware CNN model.
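A random hold‑out split of this kind might look like the sketch below. The article does not state the size of the test portion, so the 10% test_fraction, the fixed seed, and the function name split_train_test are assumptions for illustration.

```python
import random


def split_train_test(samples, test_fraction=0.1, seed=0):
    """Shuffle samples reproducibly and hold out a fraction for testing.

    Returns (train, test). The default fraction is a hypothetical value.
    """
    rng = random.Random(seed)  # local generator, global state untouched
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    sketch_ids = list(range(100))
    train, test = split_train_test(sketch_ids, test_fraction=0.1, seed=42)
    print(len(train), len(test))
```

Fixing the seed keeps the test set identical across training runs, so accuracy figures from different model variants stay comparable.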
5. Future Development
The “Guess‑Draw Treasure” activity attracted over 100 million interactions, demonstrating strong user interest in fun, AI‑driven mobile experiences. Going forward we will continue to improve the AI sketch recognizer to support more object categories and integrate it into our self‑developed PixelAI SDK, which provides on‑device capabilities such as face, gesture, segmentation, SLAM, and more.
PixelAI has already powered other interactive games like “Smile Red Packet” during Double 11 and the AR “World‑Goal” feature during the World Cup, and will serve as the foundation for many future intelligent mobile experiences.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
