PP-ShiTuV2: A General Image Recognition Pipeline in PaddleX

PP‑ShiTuV2 is a PaddleX pipeline that integrates subject detection, deep feature encoding, and vector retrieval. It reaches 91 % recall@1 on AliProducts, improves on the earlier model by more than 20 percentage points, runs on both GPU and CPU, and comes with simple installation, quick‑start code, and fine‑tuning support.

Baidu Geek Talk

Nov 25, 2024

Image recognition is a fundamental task in computer vision, widely used in applications such as face verification and retail product identification. Deploying it in practice is hard for four reasons: classes change frequently, categories are fine‑grained, data collection is costly, and open‑domain detection suffers from semantic gaps.

PP‑ShiTuV2, integrated in PaddleX, addresses these issues by combining three modules: a subject detection module that extracts all foreground objects, an image‑feature module that encodes detected subjects into deep feature vectors, and a vector‑retrieval module that matches vectors against a feature database.
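The vector‑retrieval step can be pictured as a nearest‑neighbour search over normalized feature vectors. The following is a minimal sketch in plain Python, not PaddleX's implementation: the function names, the toy 3‑dimensional vectors, the labels, and the 0.5 score threshold are all assumptions chosen for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def retrieve(query, gallery, threshold=0.5):
    """Return (label, score) for the most similar gallery entry.

    The label is None when the best cosine score is below the threshold,
    i.e. the query is treated as not present in the gallery.
    """
    q = l2_normalize(query)
    best_label, best_score = None, -1.0
    for label, feat in gallery:
        g = l2_normalize(feat)
        score = sum(a * b for a, b in zip(q, g))
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


# Hypothetical gallery: (label, feature vector) pairs standing in for the feature database.
gallery = [
    ("cola", [0.9, 0.1, 0.0]),
    ("green_tea", [0.1, 0.9, 0.1]),
    ("orange_juice", [0.0, 0.2, 0.9]),
]

# Hypothetical query feature, standing in for the encoding of one detected subject.
label, score = retrieve([0.8, 0.2, 0.1], gallery)
print(label, round(score, 3))
```

Because new classes only require adding vectors to the gallery, a design like this can absorb class updates without retraining the feature model.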

The system achieves a recall@1 of 91.03 % on the AliProducts dataset and improves over the previous PP‑ShiTuV2_rec model by more than 20 percentage points on an internal open‑domain benchmark. Inference times were measured on an NVIDIA Tesla T4 GPU (FP32) and an Intel Xeon Gold 5117 CPU (8 threads, FP32).
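For readers unfamiliar with the metric, recall@1 in retrieval benchmarks is usually the fraction of queries whose top‑ranked result carries the correct label. The sketch below assumes that definition; the function name and the toy labels are hypothetical and are not taken from the AliProducts evaluation.

```python
def recall_at_1(top1_labels, true_labels):
    """Fraction of queries whose top-1 retrieved label matches the ground truth."""
    if len(top1_labels) != len(true_labels):
        raise ValueError("prediction and ground-truth lists must have equal length")
    if not true_labels:
        raise ValueError("at least one query is required")
    hits = sum(1 for pred, true in zip(top1_labels, true_labels) if pred == true)
    return hits / len(true_labels)


# Toy data: the third query retrieves the wrong label, so 3 of 4 are correct.
predicted = ["cola", "green_tea", "cola", "orange_juice"]
actual = ["cola", "green_tea", "orange_juice", "orange_juice"]

result = recall_at_1(predicted, actual)
print(f"recall@1 = {result:.2%}")  # prints "recall@1 = 75.00%"
```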

Compared with the Grounding DINO model, PP‑ShiTuV2 shows superior performance on fine‑grained product and beverage brand recognition, as illustrated by the side‑by‑side visual results.

Installation

# Install PaddlePaddle: pick the one line that matches your hardware
# CPU
python -m pip install paddlepaddle==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/
# GPU (CUDA 11.8)
python -m pip install paddlepaddle-gpu==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/
# GPU (CUDA 12.3)
python -m pip install paddlepaddle-gpu==3.0.0b2 -i https://www.paddlepaddle.org.cn/packages/stable/cu123/

# Install PaddleX from its wheel
pip install https://paddle-model-ecology.bj.bcebos.com/paddlex/whl/paddlex-3.0.0b2-py3-none-any.whl

Quick start

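from paddlex import create_pipeline

# Load the PP-ShiTuV2 pipeline (detection, feature extraction, retrieval)
pipeline = create_pipeline(pipeline="PP-ShiTuV2")

# Build the feature index from the gallery images and their label file
index_data = pipeline.build_index("drink_dataset_v2.0/", "drink_dataset_v2.0/gallery.txt")

# Run recognition on the test images against that index
output = pipeline.predict("./drink_dataset_v2.0/test_images/", index=index_data)
for res in output:
    res.print()                   # print the recognition result
    res.save_to_img("./output/")  # save the visualized result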

The demo uses the public drink_dataset_v2.0 (download link provided) to build an index and run predictions.

Fine‑tuning / secondary development

python main.py -c paddlex/configs/general_recognition/PP-ShiTuV2_rec.yaml \
    -o Global.mode=train \
    -o Global.dataset_dir=./dataset/Inshop_examples

Additional command‑line options specify the GPU devices (e.g., -o Global.device=gpu:0,1) and the number of training epochs (e.g., -o Train.epochs_iters=10). All hyper‑parameters can also be edited directly in the YAML configuration file.
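When launching several training runs from a script, the -o overrides can be assembled programmatically. The sketch below only builds and prints the argument list. The helper name build_train_command is hypothetical (it is not part of PaddleX), and it uses only the config path and override keys shown above.

```python
import shlex


def build_train_command(config, overrides):
    """Build the main.py argument list, adding one '-o key=value' pair per override."""
    cmd = ["python", "main.py", "-c", config]
    for key, value in overrides.items():
        cmd += ["-o", f"{key}={value}"]
    return cmd


cmd = build_train_command(
    "paddlex/configs/general_recognition/PP-ShiTuV2_rec.yaml",
    {
        "Global.mode": "train",
        "Global.dataset_dir": "./dataset/Inshop_examples",
        "Global.device": "gpu:0,1",
        "Train.epochs_iters": 10,
    },
)

# Show the command as a single shell-safe string.
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the directory that contains main.py.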

Overall, PP‑ShiTuV2 provides a ready‑to‑use, high‑performance pipeline for generic image recognition, suitable for both rapid prototyping and production deployment.
