
Mobile Image Search: Algorithm Framework and Implementation at Paizhi Tao

Mobile image search has become a critical user demand, and since its 2014 launch, Alibaba’s Paizhi Tao has evolved through multiple iterations to a robust AI-driven pipeline comprising category prediction, object detection, deep and local image feature extraction, scalable retrieval indexing, and relevance-based ranking.

Qunar Tech Salon

Abstract: Mobile image search has been a long‑standing goal for both academia and industry. Since its official project start in 2014, Paizhi Tao has refined its algorithms, engineering, and product experience, achieving industry‑leading performance while driving business growth.

1. Why Build It and Why Now

1.1 Why

The image‑search feature, especially on mobile, is becoming an increasingly important traffic entry and user need. Studies indicate that within five years, more than 50% of user intent will be expressed via voice or images. A significant portion of monthly feedback from Mobile Taobao users explicitly requests image‑search capabilities.

1.2 Why Now

1. Ubiquitous mobile devices equipped with cameras.
2. The deep‑learning era: since 2013, deep learning has achieved great success in image, speech, and NLP tasks in industry.
3. Widespread large‑scale computing platforms such as ODPS and Amazon Cloud.
4. Mobile e‑commerce growth makes “photo‑to‑buy” a natural user demand, and the abundant user‑generated data continuously improves relevance.

2. Algorithm Framework

Paizhi Tao first launched on Mobile Taobao in 2014 with a small entry point and limited functionality. After several versions of iteration and exploration, a stable algorithm framework was formed, as shown below:

The framework consists of five modules: category prediction, object detection, image feature extraction, retrieval indexing, and ranking. Relevance is mainly affected by category prediction, object detection, feature extraction, and ranking, while the retrieval index focuses on scalability.
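As a rough illustration of how the five modules could fit together, the sketch below chains them as plain functions. The function name, the parameter names, and the exact order of the calls are assumptions made for illustration; the article does not describe Paizhi Tao's actual interfaces.

```python
from typing import Any, Callable, List


def search(
    image: Any,
    predict_category: Callable[[Any], str],
    detect_object: Callable[[Any], Any],
    extract_features: Callable[[Any], List[float]],
    retrieve: Callable[[str, List[float]], List[Any]],
    rank: Callable[[List[float], List[Any]], List[Any]],
) -> List[Any]:
    """Run a query image through five hypothetical pipeline stages."""
    category = predict_category(image)         # 1. narrow the search space
    region = detect_object(image)              # 2. isolate the main object
    features = extract_features(region)        # 3. describe the object
    candidates = retrieve(category, features)  # 4. look up the index
    return rank(features, candidates)          # 5. order by relevance
```

Passing the stages in as arguments keeps the sketch independent of any particular model or search engine, so each stage can be swapped or tested on its own.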

2.1 Category Prediction

Because raw visual features have limited discriminative power across categories and searching the entire catalog is inefficient, we first predict the product category to narrow the search space. Paizhi Tao currently handles over ten major categories covering tens of thousands of leaf categories.
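A minimal sketch of the narrowing step, assuming a classifier has already produced one score per category. The category names, scores, and catalog records are invented for illustration and do not come from the article.

```python
from typing import Dict, List


def predict_category(scores: Dict[str, float]) -> str:
    """Return the category with the highest classifier score."""
    return max(scores, key=scores.get)


def narrow_catalog(catalog: List[dict], category: str) -> List[dict]:
    """Keep only the products that belong to the predicted category."""
    return [item for item in catalog if item["category"] == category]


# Hypothetical classifier output and a three-item catalog.
scores = {"shoes": 0.81, "bags": 0.12, "dresses": 0.07}
catalog = [
    {"id": "p1", "category": "shoes"},
    {"id": "p2", "category": "bags"},
    {"id": "p3", "category": "shoes"},
]
candidates = narrow_catalog(catalog, predict_category(scores))
```

Only the products in the predicted category go on to the more expensive feature comparison, which is the efficiency gain described above.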

2.2 Object Detection

Product images often have complex backgrounds and small foreground objects. To reduce background interference and handle multiple objects, we extract the main object from the query image. The following two pictures illustrate the search result differences with and without object detection.
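Once a detector has located the main object, isolating it amounts to cropping the query image to the detected bounding box. The sketch below assumes the image is a list of pixel rows and the box is given as (left, top, right, bottom). Both conventions are assumptions, and the detector itself is outside the scope of the example.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def crop_main_object(image: Sequence[Sequence[int]], box: Box) -> List[List[int]]:
    """Crop the image to the box, clamping the box to the image bounds."""
    height = len(image)
    width = len(image[0]) if height else 0
    left, top, right, bottom = box
    left, top = max(0, left), max(0, top)
    right, bottom = min(width, right), min(height, bottom)
    return [list(row[left:right]) for row in image[top:bottom]]


# A 4x4 toy image whose pixel values are 0..15, row by row.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patch = crop_main_object(image, (1, 1, 3, 3))
```

Features are then extracted from the cropped patch rather than the full frame, so background pixels no longer contribute to the query.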

2.3 Image Features

Image features include deep features (CNN‑based) and local features. CNN extracts high‑level semantic representations, bridging the semantic gap, while local features capture fine‑grained details and complement the deep features.
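The article does not say how the two feature types are combined. One common approach, shown here purely as an assumption, is to L2-normalize each vector and concatenate them, with a weight that controls how much the local features contribute.

```python
import math
from typing import List, Sequence


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def fuse_features(
    deep: Sequence[float], local: Sequence[float], local_weight: float = 0.5
) -> List[float]:
    """Concatenate normalized deep features with down-weighted local features."""
    weighted_local = [local_weight * x for x in l2_normalize(local)]
    return l2_normalize(deep) + weighted_local


# Toy vectors standing in for a CNN embedding and a local descriptor.
fused = fuse_features([3.0, 4.0], [0.0, 2.0], local_weight=0.5)
```

Normalizing first keeps either feature type from dominating the distance computation only because its raw values happen to be larger.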

2.4 Retrieval Index

The search process consists of offline and online stages. Offline, we extract image features for all products and build the index. Online, we extract features from the query and perform fast lookup on a distributed engine.
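The offline and online split can be sketched with a small in-memory index. The class below is a single-process, exhaustive stand-in for the distributed engine mentioned above. The class name and methods are hypothetical, and a production system would typically use approximate nearest-neighbor search instead.

```python
import heapq
from typing import List, Sequence, Tuple


class FlatIndex:
    """Exhaustive nearest-neighbor index over product feature vectors."""

    def __init__(self) -> None:
        self._entries: List[Tuple[str, Tuple[float, ...]]] = []

    def add(self, product_id: str, vector: Sequence[float]) -> None:
        """Offline stage: store one product's feature vector."""
        self._entries.append((product_id, tuple(vector)))

    def search(self, query: Sequence[float], k: int) -> List[Tuple[str, float]]:
        """Online stage: return the k closest products with squared distances."""
        def distance(entry: Tuple[str, Tuple[float, ...]]) -> float:
            return sum((a - b) ** 2 for a, b in zip(entry[1], query))

        nearest = heapq.nsmallest(k, self._entries, key=distance)
        return [(entry[0], distance(entry)) for entry in nearest]


# Offline: index three toy products. Online: query with a new vector.
index = FlatIndex()
index.add("p1", [0.0, 0.0])
index.add("p2", [1.0, 1.0])
index.add("p3", [5.0, 5.0])
hits = index.search([0.9, 0.9], k=2)
```

Because the index is built ahead of time, the online path only has to extract features from the query and compare them against stored vectors.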

2.5 Ranking

Based on multiple image and non‑image features, we apply various optimization functions to re‑rank the retrieved results, improving relevance and user satisfaction.
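The article does not specify the ranking functions. As a minimal sketch, re-ranking can be modeled as a weighted sum over per-candidate features. The feature names ("visual" and "popularity") and the weights are invented for illustration.

```python
from typing import Dict, List


def rerank(candidates: List[dict], weights: Dict[str, float]) -> List[dict]:
    """Sort candidates by a weighted sum of their features, best first."""
    def score(candidate: dict) -> float:
        features = candidate["features"]
        return sum(weight * features[name] for name, weight in weights.items())

    return sorted(candidates, key=score, reverse=True)


# Hypothetical candidates with one image feature and one non-image feature.
candidates = [
    {"id": "a", "features": {"visual": 0.9, "popularity": 0.1}},
    {"id": "b", "features": {"visual": 0.8, "popularity": 0.9}},
]
ranked = rerank(candidates, {"visual": 0.7, "popularity": 0.3})
```

With these weights, candidate "b" overtakes "a" despite a slightly lower visual similarity, which shows how non-image signals can change the final order.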

Source: Cloudwise Team Blog
Original article: https://yq.aliyun.com/articles/3225#

Tags: deep learning, Object Detection, mobile AI, Image Search, retrieval