How Alibaba Built an E‑commerce Knowledge Graph to Power Smarter Search

This article explains Alibaba’s end‑to‑end approach to constructing an e‑commerce knowledge graph, covering the background, challenges, data‑structuring methods, schema design, modular architecture, and deployment pipeline that enable deep understanding of user intent across complex shopping scenarios.

Alibaba Cloud Developer

Background

Since June 2017, Alibaba’s e‑commerce cognition graph has matured through practice into a systematic data‑cognition system that forms a relatively complete framework of e‑commerce knowledge.

As the group’s business scope has expanded, the demand for data interconnection has intensified, because connected data underpins cross‑domain search, recommendation, and interactive experiences.

Problems

Complex shopping scenarios generate data that goes beyond traditional text, spanning multiple languages, integrated online and offline channels, and new‑retail contexts.

Large volumes of unstructured internet data are scattered and noisy, making it difficult to capture true user needs.

Multi‑modal and multi‑source data (text, video, images) require effective fusion.

Data is fragmented across departments, leading to duplicated category systems and hindering unified query services.

Lack of deep data cognition prevents understanding of nuanced user intents (e.g., linking “folic acid” queries to pregnancy preparation).

Requirement Analysis

A unified schema and query framework are needed to support global concept representation, data cleaning, phrase mining, information extraction, and hierarchical structuring.

E‑commerce ConceptNet

The goal is to build a comprehensive knowledge system that deeply understands user demand and links people, goods, and scenes.

Module Division

The graph consists of four major parts (user, scene, virtual category, and item) that together form a heterogeneous graph of concepts associating users, scenes, and products.
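A heterogeneous graph of this kind can be sketched in a few lines of Python. This is a minimal illustration, not Alibaba's implementation: the four node types come from the text, while the class names, relation labels ("interested_in", "needs", "contains"), and the item identifier are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

# The four node types come from the text; relations and ids below are hypothetical.
NODE_TYPES = {"user", "scene", "virtual_category", "item"}


@dataclass(frozen=True)
class Node:
    node_type: str
    name: str


class HeteroGraph:
    """Directed graph whose nodes and edges both carry a type label."""

    def __init__(self):
        # Maps a source node to a list of (relation, target node) pairs.
        self.edges = defaultdict(list)

    def add_edge(self, src, relation, dst):
        for node in (src, dst):
            if node.node_type not in NODE_TYPES:
                raise ValueError(f"unknown node type: {node.node_type}")
        self.edges[src].append((relation, dst))

    def neighbors(self, src, node_type=None):
        """Return the targets of src, optionally filtered by node type."""
        return [
            dst
            for _, dst in self.edges.get(src, [])
            if node_type is None or dst.node_type == node_type
        ]


graph = HeteroGraph()
user_group = Node("user", "children")
scene = Node("scene", "vacation outfit")
category = Node("virtual_category", "wind-proof scarf")
item = Node("item", "item-001")  # hypothetical item id

graph.add_edge(user_group, "interested_in", scene)
graph.add_edge(scene, "needs", category)
graph.add_edge(category, "contains", item)

# Walk from a scene to the items that could serve it.
items_for_scene = [
    it
    for cat in graph.neighbors(scene, "virtual_category")
    for it in graph.neighbors(cat, "item")
]
print([it.name for it in items_for_scene])
```

Typing both nodes and edges is what lets a single structure answer questions that cross the user, scene, category, and item layers.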

User Graph

Beyond basic demographics, the user graph captures group attributes (e.g., “elderly”, “children”) and category preference data.

Scene Graph

Scenes abstract user needs from queries and titles into generalized concepts such as “outdoor barbecue” or “vacation outfit”.

Category Refinement

Category aggregation and splitting address inconsistencies across business lines, creating virtual categories (e.g., “wind‑proof scarf”) to capture fine‑grained user intents.
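As a rough sketch of aggregation and splitting, the example below first maps categories from different business lines onto one shared category, then refines it with a keyword rule. The business-line names, native category names, and rule tables are all hypothetical; the article does not describe how the real rules are defined or learned.

```python
# Hypothetical aggregation table: (business line, native category) -> shared category.
AGGREGATION = {
    ("line_a", "scarves"): "scarf",
    ("line_b", "scarves & wraps"): "scarf",
}

# Hypothetical split rules: a keyword in the title refines the shared category.
SPLIT_RULES = {
    "scarf": [("wind-proof", "wind-proof scarf")],
}


def virtual_category(business_line, native_category, title):
    """Return the virtual category for an item, or None if no mapping exists."""
    shared = AGGREGATION.get((business_line, native_category))
    if shared is None:
        return None
    lowered = title.lower()
    for keyword, refined in SPLIT_RULES.get(shared, []):
        if keyword in lowered:
            return refined  # splitting: a finer-grained virtual category
    return shared  # aggregation only


print(virtual_category("line_b", "scarves & wraps", "Wind-proof wool scarf"))
print(virtual_category("line_a", "scarves", "Silk scarf"))
```

Keeping aggregation and splitting as separate tables means either one can be revised without touching the other.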

Product Graph

Phrase mining extracts product attributes, feeding a bootstrap CPV mining loop that continuously expands query and product understanding.

Product labeling links the mined knowledge to items, completing a semantic loop from query to product.
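The mining and labeling steps above can be illustrated with a toy bootstrapping loop. It assumes that CPV refers to category-property-value triples, and it uses a deliberately simple pattern (the tokens immediately to the left and right of a known value); the article does not specify the actual patterns, scoring, or models. The titles, seed value, and function names are hypothetical.

```python
def bootstrap_values(titles, seeds, max_rounds=5):
    """Grow a set of property values from seeds using shared (left, right) contexts."""
    known = set(seeds)
    tokenized = [title.lower().split() for title in titles]
    for _ in range(max_rounds):
        # Step 1: learn the contexts in which known values occur.
        contexts = set()
        for tokens in tokenized:
            for i in range(1, len(tokens) - 1):
                if tokens[i] in known:
                    contexts.add((tokens[i - 1], tokens[i + 1]))
        # Step 2: any token found in a learned context becomes a candidate value.
        found = set()
        for tokens in tokenized:
            for i in range(1, len(tokens) - 1):
                if (tokens[i - 1], tokens[i + 1]) in contexts:
                    found.add(tokens[i])
        if found <= known:
            break  # nothing new was mined, so the loop has converged
        known |= found
    return known


def label_items(titles, values):
    """Attach the mined values to each title that mentions them."""
    return {t: sorted(v for v in values if v in t.lower().split()) for t in titles}


titles = [
    "women red wool scarf",
    "women blue wool scarf",
    "men blue cotton shirt",
    "men green cotton shirt",
]
colors = bootstrap_values(titles, seeds={"red"})
labels = label_items(titles, colors)
print(sorted(colors))
```

Starting from the single seed "red", the loop picks up "blue" from a shared context and then "green" from a context that "blue" introduces. A production loop would also need to score patterns and candidates, since unfiltered bootstrapping tends to drift toward noisy values.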

Knowledge System

A global schema representation was designed by studying WordNet and ConceptNet, resulting in the E‑commerce ConceptNet ontology that maps user demands to concepts, defines concept properties, and establishes hierarchical and relational links.
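To make the idea of hierarchical links and concept properties concrete, here is a small sketch of an is-a hierarchy with property lookup. The concepts above "wind-proof scarf", the property names, and the rule that a concept inherits properties from its ancestors are assumptions for illustration; the article does not state the ontology's inheritance semantics.

```python
# Hypothetical is-a links: child concept -> parent concept.
IS_A = {
    "wind-proof scarf": "scarf",
    "scarf": "accessory",
}

# Hypothetical properties defined directly on each concept.
PROPERTIES = {
    "accessory": {"wearable": True},
    "scarf": {"body_part": "neck"},
    "wind-proof scarf": {"function": "wind-proof"},
}


def ancestors(concept):
    """Return the chain of parents from nearest to most general."""
    chain, seen = [], {concept}
    while concept in IS_A:
        concept = IS_A[concept]
        if concept in seen:
            raise ValueError(f"cycle in hierarchy at {concept!r}")
        seen.add(concept)
        chain.append(concept)
    return chain


def resolved_properties(concept):
    """Merge properties from general to specific, so specific concepts win."""
    merged = {}
    for name in reversed([concept] + ancestors(concept)):
        merged.update(PROPERTIES.get(name, {}))
    return merged


print(ancestors("wind-proof scarf"))
print(resolved_properties("wind-proof scarf"))
```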

Technical Framework

Platform Modules

The data service middle‑platform supports the graph engine, while the Qianmo platform handles data annotation, review, and visualization. Turing provides business‑level services, and the Graph Engine offers multi‑hop retrieval via Gremlin queries.
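Multi-hop retrieval itself is easy to picture with a breadth-first search over an in-memory adjacency list. The sketch below is not the Graph Engine's API and does not use Gremlin; in Gremlin, a comparable traversal would typically be built from steps such as repeat() and times(). Apart from the "folic acid" example from earlier in the article, the nodes and edges are hypothetical.

```python
from collections import deque

# Hypothetical adjacency list; only the "folic acid" link echoes the article.
ADJACENCY = {
    "folic acid": ["pregnancy preparation"],
    "pregnancy preparation": ["prenatal vitamins", "maternity wear"],
    "prenatal vitamins": [],
    "maternity wear": [],
}


def multi_hop(adjacency, start, max_hops):
    """Return {node: hop distance} for nodes reachable within max_hops of start."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    del distances[start]  # the start node is not part of the result
    return distances


print(multi_hop(ADJACENCY, "folic acid", 2))
```

Bounding the number of hops keeps retrieval cost predictable, which matters when the graph backs an online search or recommendation request.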

Data Storage

MySQL is used for flexible labeling, a graph database for full‑graph queries, and ODPS for versioned data management. Data is split into node and edge tables before ingestion into iGraph and BigGraph.
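The split into node and edge tables can be sketched as follows. The column names ("id", "type", "name", "src", "dst", "relation") and the sample triples are assumptions; the article does not describe the actual table layouts expected by iGraph or BigGraph.

```python
def split_tables(triples):
    """Split (src, relation, dst) triples into deduplicated node rows and edge rows.

    Each endpoint is a (node_type, name) pair.
    """
    node_ids = {}
    node_rows, edge_rows = [], []
    for src, relation, dst in triples:
        for endpoint in (src, dst):
            if endpoint not in node_ids:
                # Assign ids in first-seen order so the output is deterministic.
                node_ids[endpoint] = len(node_ids)
                node_type, name = endpoint
                node_rows.append(
                    {"id": node_ids[endpoint], "type": node_type, "name": name}
                )
        edge_rows.append(
            {"src": node_ids[src], "dst": node_ids[dst], "relation": relation}
        )
    return node_rows, edge_rows


# Hypothetical triples for illustration.
triples = [
    (("scene", "outdoor barbecue"), "needs", ("virtual_category", "grill")),
    (("virtual_category", "grill"), "contains", ("item", "item-001")),
]
node_rows, edge_rows = split_tables(triples)
print(node_rows)
print(edge_rows)
```

Separating the two tables lets nodes be stored once and referenced by id from any number of edges.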

Technical Deployment

The cloud‑theme knowledge cards have been launched in nearly 10,000 scenarios, improving click‑through and exploration rates. Ongoing work focuses on data diversification and integration with search and recommendation pipelines.

Future Plans

Relation mining and ontology construction.

Linking the graph with external data via text augmentation.

Mining commonsense reasoning rules.

Symbolic logical representation for graph inference.
