
AliCoCo: Alibaba’s E‑commerce Cognitive Concept Net – Architecture, Construction, and Applications

The article presents AliCoCo, Alibaba’s large‑scale e‑commerce knowledge graph that models user demand as concepts, describes its four‑layer architecture, the algorithms for concept extraction, taxonomy building, and item association, and demonstrates its impact on search and recommendation systems.

DataFunTalk

AliCoCo (Alibaba E‑commerce Cognitive Concept Net) is a knowledge graph designed to represent user shopping demands as explicit concepts, enabling more intelligent e‑commerce experiences.

The system consists of four layers: E‑commerce Concepts (short user‑need phrases), Primitive Concepts (fine‑grained word tokens), a Taxonomy (hierarchical classification of primitive concepts), and the Item layer (billions of products linked to concepts).
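The four layers can be pictured as a small data model. The sketch below is only an illustration under assumptions: the class names, fields, the "Bags" subclass, and the sample item are hypothetical and are not taken from AliCoCo's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TaxonomyClass:
    """Taxonomy layer: a node in the class hierarchy."""
    name: str
    parent: Optional["TaxonomyClass"] = None

    def path(self) -> List[str]:
        """Return class names from the top-level class down to this node."""
        names = []
        node: Optional["TaxonomyClass"] = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names[::-1]


@dataclass
class PrimitiveConcept:
    """Primitive Concept layer: a fine-grained term typed by a taxonomy class."""
    text: str
    cls: TaxonomyClass


@dataclass
class EcommerceConcept:
    """E-commerce Concept layer: a short user-need phrase built from primitives."""
    phrase: str
    primitives: List[PrimitiveConcept] = field(default_factory=list)


@dataclass
class Item:
    """Item layer: a product linked to primitive and e-commerce concepts."""
    item_id: str
    title: str
    primitives: List[PrimitiveConcept] = field(default_factory=list)
    concepts: List[EcommerceConcept] = field(default_factory=list)


# Illustrative data only; none of these values come from the real graph.
category = TaxonomyClass("Category")
bags = TaxonomyClass("Bags", parent=category)  # hypothetical subclass
function = TaxonomyClass("Function")

backpack = PrimitiveConcept("backpack", bags)
waterproof = PrimitiveConcept("waterproof", function)
need = EcommerceConcept("waterproof backpack", [waterproof, backpack])
item = Item("item-001", "Nylon hiking backpack, rain resistant",
            primitives=[backpack, waterproof], concepts=[need])

print(backpack.cls.path())
print([c.phrase for c in item.concepts])
```

Keeping primitive concepts typed by a taxonomy class is what lets an e-commerce concept be interpreted as a combination of typed parts, for example a Function followed by a Category.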

To build the taxonomy, experts defined about 20 top‑level classes (e.g., Category, Function, Material) and refined them down to millions of leaf nodes. The result is an ontology similar in spirit to Freebase or ConceptNet, but densely populated with e‑commerce concept instances.

Primitive concepts are extracted using a BiLSTM‑CRF model trained on large e‑commerce corpora, followed by crowdsourced verification. Hypernym‑hyponym (isA) relations between primitive concepts are discovered with two complementary approaches: unsupervised pattern‑based methods and supervised projection‑learning models. Active learning is used to reduce the labeling cost.

E‑commerce concepts are built in two stages. First, candidates are generated in bulk, both by mining phrases from text (with AutoPhrase) and by combining primitive concepts. Second, the candidates are filtered by a knowledge‑enhanced Wide&Deep classifier that incorporates BiLSTM features, POS/NER tags, Wikipedia gloss embeddings, and statistical features such as BERT‑based perplexity scores.
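The combinatorial route can be sketched as filling class-level templates with primitive concepts. Everything in this example is assumed for illustration: the templates, the vocabulary, and the function name `generate_candidates` are hypothetical, and the filtering classifier described above is not implemented.

```python
from itertools import product
from typing import Dict, List, Sequence


def generate_candidates(
    primitives_by_class: Dict[str, List[str]],
    templates: Sequence[Sequence[str]],
) -> List[str]:
    """Fill each class template with every combination of primitive concepts."""
    candidates = []
    for template in templates:
        # A class with no primitives yields an empty pool, so no combinations.
        pools = [primitives_by_class.get(cls, []) for cls in template]
        for combo in product(*pools):
            candidates.append(" ".join(combo))
    return candidates


# Hypothetical vocabulary, keyed by taxonomy class.
primitives_by_class = {
    "Function": ["waterproof", "warm"],
    "Material": ["nylon"],
    "Category": ["backpack", "hat"],
}
# Hypothetical templates: sequences of taxonomy classes.
templates = [("Function", "Category"), ("Material", "Category")]

candidates = generate_candidates(primitives_by_class, templates)
print(candidates)
```

Exhaustive combination produces implausible phrases such as "warm backpack" alongside sensible ones, which is why a filtering step after generation matters.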

Linking concepts to items is treated as a semantic matching problem; the model enriches item representations with associated primitive concepts and external Wikipedia knowledge to improve relevance, especially for short concept queries.
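The benefit of enriching an item with its primitive concepts can be shown with a toy matcher. This sketch replaces the neural matching model with plain bag-of-words cosine similarity, so it only illustrates the enrichment idea; the function names and sample data are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable


def bow(text: str) -> Counter:
    """Lowercased whitespace-token counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_score(concept: str, title: str, primitives: Iterable[str] = ()) -> float:
    """Score a concept against an item title, optionally enriched with primitives."""
    enriched = " ".join([title, *primitives])
    return cosine(bow(concept), bow(enriched))


concept = "outdoor barbecue"
title = "portable charcoal grill"
plain = match_score(concept, title)
enriched = match_score(concept, title, primitives=["outdoor", "barbecue"])
print(plain, enriched)
```

The title alone shares no tokens with the short concept, so the plain score is zero, while the linked primitive concepts give the matcher something to align on. A learned model would generalize beyond exact token overlap, which this stand-in cannot do.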

AliCoCo has been deployed in Alibaba’s core e‑commerce services. Over 98% of Taobao/Tmall items are covered, each linked to an average of 14 primitive concepts and 135 e‑commerce concepts, raising query coverage from 35% to 75% and powering knowledge cards in search as well as theme‑based recommendations.

Future work includes expanding commonsense relations, probabilistic modeling of concept‑item links, and multilingual extensions to support Alibaba’s global strategy.

Tags: Alibaba, e‑commerce, recommendation, NLP, knowledge graph, concept extraction