How Suning Uses Naive Bayes for High‑Accuracy Product Classification

This article explains Suning's implementation of a Naive Bayes‑based product classification system, detailing its basic theory, formal definition, step‑by‑step training process, three implementation phases, evaluation results, and error analysis to improve classification accuracy.

Suning Technology

Common product classification methods include Support Vector Machines, K‑Nearest Neighbors, and Naive Bayes. Because Naive Bayes is easy to implement and fast, the Suning Search R&D Center applied it to classify both product information from external sites and user queries.

Basic Idea of Naive Bayes

For a given item to be classified, compute the probability of each class conditioned on the item's observed features, then assign the item to the class with the highest probability.

Formal Definition
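The standard textbook formulation of Naive Bayes is given here for reference. The symbol names (x, a_j, y_i) are chosen for illustration and are not taken from Suning's own notation.

$$
\begin{aligned}
&\text{Item: } x = \{a_1, a_2, \dots, a_m\}, \qquad \text{classes: } C = \{y_1, y_2, \dots, y_n\} \\
&P(y_i \mid x) = \frac{P(x \mid y_i)\,P(y_i)}{P(x)} \\
&P(x \mid y_i) = \prod_{j=1}^{m} P(a_j \mid y_i) \quad \text{(feature independence assumption)} \\
&\hat{y} = \arg\max_{y_i \in C} \; P(y_i) \prod_{j=1}^{m} P(a_j \mid y_i)
\end{aligned}
$$

Since P(x) is the same for every class, it can be dropped when comparing classes, which is why the last line uses only the prior and the per-feature conditional probabilities.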

Three Key Steps

1) Obtain a training set whose items already carry known class labels.
2) Estimate the conditional probability of each feature under each class.
3) Assuming the features are independent, use Bayes' theorem to derive the posterior probability of each class.
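The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not Suning's implementation: the function names, the toy samples, and the use of Laplace (add-alpha) smoothing and log probabilities are all assumptions added here to keep the example numerically safe.

```python
import math
from collections import Counter, defaultdict

def train(samples):
    """Step 1 and 2: count classes and per-class features from labelled samples."""
    class_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for features, label in samples:
        class_counts[label] += 1
        for f in features:
            feature_counts[label][f] += 1
            vocabulary.add(f)
    return class_counts, feature_counts, vocabulary

def log_posteriors(features, model, alpha=1.0):
    """Step 3: unnormalized log posterior of each class under independence."""
    class_counts, feature_counts, vocabulary = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(feature_counts[label].values()) + alpha * len(vocabulary)
        for f in features:
            # Smoothed estimate of P(feature | class); alpha is an assumption.
            score += math.log((feature_counts[label][f] + alpha) / denom)
        scores[label] = score
    return scores

def classify(features, model):
    scores = log_posteriors(features, model)
    return max(scores, key=scores.get)

# Toy labelled data, invented for this example.
samples = [
    (["refrigerator", "double", "door"], "appliances"),
    (["washing", "machine", "drum"], "appliances"),
    (["men", "cotton", "shirt"], "apparel"),
    (["women", "summer", "dress"], "apparel"),
]
model = train(samples)
print(classify(["cotton", "dress"], model))
```

Working in log space avoids underflow when many small probabilities are multiplied, and smoothing keeps an unseen word from driving a class probability to zero.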

Classification Process

Phase 1: Preparation

Build the training set from Suning’s product data covering major categories such as large appliances, apparel, cosmetics, food, sports, and automotive.

Phase 2: Classifier Training

Generate the classifier by segmenting product titles into words and analyzing how each word is distributed across categories, which links words to the categories they indicate.
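As a rough sketch of what a word‑category distribution might look like, the snippet below counts how often each word appears under each category and normalizes the counts. The `segment` function is a whitespace stand-in (real product titles would need a proper word segmenter), and the titles and function names are hypothetical.

```python
from collections import Counter, defaultdict

def segment(title):
    # Stand-in for a real word segmenter: lowercase and split on whitespace.
    return title.lower().split()

def word_category_distribution(labelled_titles):
    """Map each word to the share of its occurrences in each category."""
    counts = defaultdict(Counter)
    for title, category in labelled_titles:
        for word in segment(title):
            counts[word][category] += 1
    distribution = {}
    for word, by_category in counts.items():
        total = sum(by_category.values())
        distribution[word] = {c: n / total for c, n in by_category.items()}
    return distribution

# Invented example titles.
titles = [
    ("Double door refrigerator", "large appliances"),
    ("Drum washing machine", "large appliances"),
    ("Men cotton sports shirt", "apparel"),
    ("Sports running shoes", "sports"),
]
dist = word_category_distribution(titles)
print(dist["refrigerator"])
print(dist["sports"])
```

A word such as "refrigerator" that appears in only one category is a strong signal, while a word such as "sports" that is split across categories carries less weight on its own.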

Phase 3: Application

Classify external‑site products into Suning’s multi‑level taxonomy. First attempt breadcrumb matching; if it fails, use Naive Bayes prediction. Tokenize the product name, form bi‑grams or tri‑grams, compute the probability of each category, and select the highest.
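The application flow described above (breadcrumb match first, Naive Bayes as fallback) might be organized as follows. Everything here is an illustrative assumption: the `breadcrumb_map` lookup, the `score_categories` callback that stands in for the trained classifier, the whitespace tokenization, and the choice to use both bi‑grams and tri‑grams as features.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token sequences, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def features_from_name(name, sizes=(2, 3)):
    tokens = name.lower().split()  # stand-in for a real word segmenter
    features = []
    for n in sizes:
        features.extend(ngrams(tokens, n))
    return features

def classify_external(name, breadcrumb, breadcrumb_map, score_categories, k=3):
    """Try the breadcrumb first; otherwise rank categories by classifier score."""
    if breadcrumb in breadcrumb_map:
        return [breadcrumb_map[breadcrumb]]
    scores = score_categories(features_from_name(name))
    return sorted(scores, key=scores.get, reverse=True)[:k]

def toy_scorer(features):
    # Hypothetical scorer; a trained Naive Bayes model would go here instead.
    hints = {"washing machine": "large appliances", "running shoes": "sports"}
    scores = {"large appliances": 0.0, "sports": 0.0, "apparel": 0.0}
    for f in features:
        if f in hints:
            scores[hints[f]] += 1.0
    return scores

# Hypothetical mapping from an external site's breadcrumb to a category path.
breadcrumb_map = {"Appliances > Washing Machines": "large appliances/washing machines"}

matched = classify_external("Drum washing machine 10kg",
                            "Appliances > Washing Machines",
                            breadcrumb_map, toy_scorer)
predicted = classify_external("Drum washing machine 10kg",
                              "Home > Misc",
                              breadcrumb_map, toy_scorer)
print(matched, predicted[0])
```

Returning a ranked list rather than a single label makes it straightforward to report both top‑1 and top‑3 results from the same call.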

Evaluation: On a test set of 5,000 external products, top‑1 accuracy reached 92.5% and top‑3 accuracy reached 98.9%. Error analysis shows that most misclassifications occur at the third category level or involve gender‑specific items.
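Top‑k accuracy counts a prediction as correct when the true category appears among the k highest-ranked candidates. A minimal sketch of the metric, using invented labels rather than the actual test set, could look like this (the function name is an assumption):

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of items whose true label is within the first k ranked predictions."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)

# Invented rankings for four items, best guess first.
ranked = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["c", "a", "b"],
    ["a", "c", "b"],
]
truth = ["a", "a", "b", "b"]

print(top_k_accuracy(ranked, truth, 1))
print(top_k_accuracy(ranked, truth, 3))
```

Applied to the reported figures, 92.5% and 98.9% of 5,000 products correspond to roughly 4,625 and 4,945 correct items respectively.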

The automatic classification system is now widely used for category setting, error detection, and external data analysis, significantly improving product data quality.
