Machine Learning Applications for Product Data Quality and Knowledge Graph Construction at JD.com

At the 2nd China Big Data International Summit 2017, JD’s chief architect presented how machine‑learning techniques are applied across e‑commerce to improve product data quality, ensure compliance, resolve image‑text mismatches, automate category identification, restructure titles, and build a multi‑dimensional product knowledge graph.

JD Retail Technology

During the 2nd China Big Data International Summit 2017 in Shanghai, JD’s chief architect He Xiaofeng shared the company’s extensive use of machine‑learning methods to extract commercial value from massive product data.

The talk covered several key scenarios. Machine learning is used to clean and verify product information and to detect prohibited content, with models for pornographic image detection, price OCR, semantic understanding of forbidden words, and adaptive QR‑code detection. Fully convolutional networks are applied for end‑to‑end image layout analysis, separating text from background.

To address inconsistencies between product images and textual attributes, JD built a text‑image mismatch verification system that leverages a curated attribute dictionary and supervised learning to align titles, sales attributes, and extended attributes, while also extracting visual features from product images for high‑confidence labeling.
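The dictionary-driven part of such a check can be sketched as follows. This is a minimal illustration, not JD's system: the attribute dictionary, the function names `extract_attributes` and `find_mismatches`, and the sample data are all hypothetical, and the image-side attributes are assumed to come from a separate vision model that is not shown. The supervised alignment step described above is also left out.

```python
# Hypothetical attribute dictionary: attribute name -> known values.
ATTRIBUTE_DICT = {
    "color": {"red", "blue", "black", "white"},
    "material": {"cotton", "leather", "wool"},
}


def extract_attributes(text):
    """Return {attribute: set of dictionary values found in the text}."""
    tokens = set(text.lower().replace(",", " ").split())
    found = {}
    for name, values in ATTRIBUTE_DICT.items():
        hits = tokens & values
        if hits:
            found[name] = hits
    return found


def find_mismatches(title, image_attributes):
    """Compare title attributes with attributes predicted from the image.

    image_attributes is assumed to be the output of an image model,
    e.g. {"color": "blue"}. Each conflict is reported as
    (attribute, title_values, image_value).
    """
    title_attrs = extract_attributes(title)
    mismatches = []
    for name, image_value in image_attributes.items():
        title_values = title_attrs.get(name)
        # Only flag attributes the title actually states.
        if title_values and image_value not in title_values:
            mismatches.append((name, sorted(title_values), image_value))
    return mismatches


if __name__ == "__main__":
    print(find_mismatches("Red cotton T-shirt, summer",
                          {"color": "blue", "material": "cotton"}))
```

Attributes that the title does not mention are skipped rather than flagged, which keeps the check conservative. A production system would likely need fuzzier matching than exact token lookup.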

For product titles, JD developed a title‑attribute understanding and re‑composition pipeline that reduces keyword stuffing, improves display completeness, and offers compliance services during upload.
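As a rough illustration of the re-composition step, the sketch below removes repeated keywords and trims the title to a display limit on a word boundary. It assumes whitespace-separated titles and a hypothetical `max_chars` limit. The function name `recompose_title` is made up for this example, and the attribute-understanding part of the pipeline is not modeled.

```python
def recompose_title(title, max_chars=60):
    """Drop repeated words (case-insensitive) and trim to a display limit."""
    seen = set()
    kept = []
    for word in title.split():
        key = word.lower()
        if key in seen:
            continue  # repeated keyword: treated here as stuffing
        seen.add(key)
        kept.append(word)

    # Append whole words only, so the displayed title is never cut mid-word.
    result = ""
    for word in kept:
        candidate = word if not result else result + " " + word
        if len(candidate) > max_chars:
            break
        result = candidate
    return result


if __name__ == "__main__":
    print(recompose_title("Phone case phone Case iPhone case shockproof case"))
```

Keeping the first occurrence of each word preserves the merchant's original ordering, which is a design choice of this sketch rather than something the talk specifies.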

Automatic category recognition is built on a CBOW‑based model (BTC) enhanced with dropout and additional training tricks. The system recommends the correct category from merchant‑provided titles and reached 99% classification accuracy during large‑scale promotional events.
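The general idea of a CBOW-style title classifier with dropout can be sketched in plain Python. This is not the BTC model, whose details the talk summary does not give. It is a toy under stated assumptions: averaged word embeddings feed a linear softmax layer, inverted dropout is applied to the averaged vector during training, and the class name `CbowClassifier`, the hyperparameters, and the four-title dataset are all hypothetical.

```python
import math
import random


class CbowClassifier:
    """Averaged word embeddings -> dropout -> linear layer -> softmax."""

    def __init__(self, vocab, n_classes, dim=8, dropout=0.5, seed=0):
        self.rng = random.Random(seed)
        self.index = {w: i for i, w in enumerate(vocab)}
        self.dim, self.dropout = dim, dropout
        init = lambda: self.rng.uniform(-0.5, 0.5)
        self.emb = [[init() for _ in range(dim)] for _ in vocab]
        self.w = [[init() for _ in range(dim)] for _ in range(n_classes)]
        self.b = [0.0] * n_classes

    def _forward(self, title, train):
        ids = [self.index[t] for t in title.lower().split() if t in self.index]
        # CBOW-style input: the mean of the word vectors (word order is ignored).
        h = [sum(self.emb[i][k] for i in ids) / max(len(ids), 1)
             for k in range(self.dim)]
        # Inverted dropout: active only while training, identity at inference.
        keep = 1.0 - self.dropout
        mask = [(self.rng.random() < keep) / keep if train else 1.0 for _ in h]
        hd = [x * m for x, m in zip(h, mask)]
        logits = [sum(wk * x for wk, x in zip(row, hd)) + b
                  for row, b in zip(self.w, self.b)]
        top = max(logits)
        exps = [math.exp(z - top) for z in logits]
        total = sum(exps)
        return ids, mask, hd, [e / total for e in exps]

    def predict_proba(self, title):
        return self._forward(title, train=False)[3]

    def train_step(self, title, label, lr=0.1):
        ids, mask, hd, probs = self._forward(title, train=True)
        # Cross-entropy gradient w.r.t. the logits: softmax output minus one-hot.
        dlogits = [p - (c == label) for c, p in enumerate(probs)]
        # Backpropagate through the linear layer and the dropout mask.
        dh = [sum(dlogits[c] * self.w[c][k] for c in range(len(self.w))) * mask[k]
              for k in range(self.dim)]
        for c, g in enumerate(dlogits):
            self.b[c] -= lr * g
            for k in range(self.dim):
                self.w[c][k] -= lr * g * hd[k]
        for i in ids:
            for k in range(self.dim):
                self.emb[i][k] -= lr * dh[k] / len(ids)


# Toy, made-up training data: (title, category id).
DATA = [
    ("wireless bluetooth headphones", 0),
    ("noise cancelling headphones", 0),
    ("cotton summer dress", 1),
    ("floral cotton dress", 1),
]
VOCAB = sorted({w for title, _ in DATA for w in title.split()})


def mean_loss(model):
    """Average cross-entropy on DATA with dropout switched off."""
    return sum(-math.log(model.predict_proba(t)[y]) for t, y in DATA) / len(DATA)


def train(model, epochs=300):
    for _ in range(epochs):
        for title, label in DATA:
            model.train_step(title, label)
    return model


if __name__ == "__main__":
    model = CbowClassifier(VOCAB, n_classes=2)
    print("loss before:", mean_loss(model))
    train(model)
    print("loss after:", mean_loss(model))
    print(model.predict_proba("cotton dress"))
```

Averaging the embeddings makes the model cheap enough to score every uploaded title, and dropout on the averaged vector discourages reliance on any single embedding dimension. A real system would use a deep-learning framework and a far larger vocabulary.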

JD also constructed a multi‑dimensional product knowledge graph by extracting information from product detail pages, user reviews, and customer service chats, using OCR to capture text from images, and applying supervised and unsupervised techniques to mine key phrases, sentiment, and functional attributes from reviews.
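One simple unsupervised way to surface key phrases from reviews is to count which two-word phrases recur across many reviews. The sketch below assumes English, whitespace-tokenized reviews and a small hypothetical stopword list. The function name `mine_key_phrases` and the sample reviews are invented for illustration, and the talk summary does not say which techniques JD actually used.

```python
from collections import Counter

# Small illustrative stopword list; a real one would be much larger.
STOPWORDS = {"the", "is", "a", "and", "very", "it", "this", "but", "i", "was"}


def mine_key_phrases(reviews, top_k=3):
    """Rank two-word phrases by the number of reviews they appear in."""
    doc_freq = Counter()
    for review in reviews:
        words = [w.strip(".,!?").lower() for w in review.split()]
        words = [w for w in words if w]
        bigrams = {
            (a, b) for a, b in zip(words, words[1:])
            if a not in STOPWORDS and b not in STOPWORDS
        }
        # A set is used so each review counts at most once per phrase.
        doc_freq.update(bigrams)
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(" ".join(pair), count) for pair, count in ranked[:top_k]]


if __name__ == "__main__":
    sample = [
        "Battery life is great.",
        "Great battery life, fast shipping!",
        "Fast shipping but the screen is dim.",
    ]
    print(mine_key_phrases(sample, top_k=2))
```

Counting reviews rather than raw occurrences keeps one long, repetitive review from dominating the ranking.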

Together, these machine‑learning applications improve data quality, search accuracy, and user experience, and they enable a rich product knowledge ecosystem that drives commercial value for JD.com.

