Building and Scaling Ant Group's Merchant Knowledge Graph: Architecture, Construction, Fusion, and Open Platform

This article describes Ant Group's merchant knowledge graph, covering its background, overall architecture, data sources, schema design, processing pipeline, cross‑graph fusion techniques, cognitive applications such as vector‑based recall, and the upcoming open‑source and SDK initiatives aimed at sharing the knowledge and tools with the community.

DataFunTalk

Ant Group has accumulated massive merchant data across offline stores, online mini‑programs, and brand interactions. Three challenges follow from this: basic merchant information is sparse, the data comes from heterogeneous sources, and the business needs a merchant knowledge infrastructure that is rich, precise, and efficient.

Merchant Knowledge Graph Overview: The graph connects merchants and users. It integrates offline POS data, online mini‑programs, and various third‑party channels into a unified view of merchant types, locations, and services, and of how these relate to user behaviors.

Construction & Fusion:

Data sources include structured business data (ODPS, MySQL), semi‑structured info‑boxes, and unstructured text/images.

Knowledge modeling defines a schema with concept, entity, and event domains, capturing abstract concepts, concrete merchant entities, and user behavior events.
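The three-domain schema can be illustrated with a minimal sketch. The article does not publish the actual schema, so every name here (`Domain`, `Node`, `Edge`, `ALLOWED_RELATIONS`, and the relation labels) is a hypothetical stand-in chosen only to show how concept, entity, and event nodes might be typed and how relations between domains could be validated.

```python
from dataclasses import dataclass
from enum import Enum


class Domain(Enum):
    CONCEPT = "concept"  # abstract notions such as a category or brand
    ENTITY = "entity"    # concrete merchants, stores, mini-programs
    EVENT = "event"      # user behaviors tied to entities


@dataclass(frozen=True)
class Node:
    node_id: str
    domain: Domain
    node_type: str  # hypothetical type label, e.g. "Merchant"


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str
    target: str


# Hypothetical rules: which (source domain, relation, target domain)
# triples the schema accepts. The real rule set is not given in the article.
ALLOWED_RELATIONS = {
    (Domain.ENTITY, "belongs_to", Domain.CONCEPT),
    (Domain.EVENT, "happened_at", Domain.ENTITY),
    (Domain.CONCEPT, "subclass_of", Domain.CONCEPT),
}


def validate_edge(edge: Edge, nodes: dict) -> bool:
    """Return True if both endpoints exist and the relation fits the schema."""
    src = nodes.get(edge.source)
    tgt = nodes.get(edge.target)
    if src is None or tgt is None:
        return False
    return (src.domain, edge.relation, tgt.domain) in ALLOWED_RELATIONS


nodes = {
    "c1": Node("c1", Domain.CONCEPT, "Category"),
    "m1": Node("m1", Domain.ENTITY, "Merchant"),
    "e1": Node("e1", Domain.EVENT, "Payment"),
}
ok = validate_edge(Edge("m1", "belongs_to", "c1"), nodes)
bad = validate_edge(Edge("c1", "belongs_to", "m1"), nodes)
```

Keeping the allowed triples in one table makes the schema checkable at write time, which is one plausible way to keep the three domains from blurring together.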

The processing pipeline (DAG) performs data ingestion, cleaning, entity/relationship extraction, schema mapping, entity linking and normalization, quality inspection, and storage.
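As a rough illustration of running such stages in dependency order, the sketch below uses Python's standard `graphlib.TopologicalSorter`. The stage functions, the record format, and the case-insensitive name matching are all simplifying assumptions for this example, not Ant Group's implementation.

```python
from graphlib import TopologicalSorter


def ingest(_):
    # Toy input standing in for ODPS/MySQL rows.
    return [{"name": " Cafe Luna "}, {"name": "cafe luna"}, {"name": ""}]


def clean(records):
    # Trim whitespace and drop empty names.
    return [{"name": r["name"].strip()} for r in records if r["name"].strip()]


def extract(records):
    return [{"entity": r["name"], "type": "Merchant"} for r in records]


def map_schema(records):
    return [{"domain": "entity", **r} for r in records]


def link_and_normalize(records):
    # Naive linking: names equal after case folding are one entity.
    merged = {}
    for r in records:
        merged.setdefault(r["entity"].casefold(), r)
    return list(merged.values())


def quality_check(records):
    if any(not r["entity"] for r in records):
        raise ValueError("empty entity name reached quality check")
    return records


def store(records):
    return {r["entity"].casefold(): r for r in records}


STAGES = {
    "ingest": ingest, "clean": clean, "extract": extract,
    "map_schema": map_schema, "link_and_normalize": link_and_normalize,
    "quality_check": quality_check, "store": store,
}
# Each stage maps to the set of stages it depends on.
DEPENDS_ON = {
    "ingest": set(), "clean": {"ingest"}, "extract": {"clean"},
    "map_schema": {"extract"}, "link_and_normalize": {"map_schema"},
    "quality_check": {"link_and_normalize"}, "store": {"quality_check"},
}


def run_pipeline():
    outputs = {}
    for name in TopologicalSorter(DEPENDS_ON).static_order():
        parents = DEPENDS_ON[name]
        # This sketch assumes at most one upstream stage per node.
        upstream = outputs[next(iter(parents))] if parents else None
        outputs[name] = STAGES[name](upstream)
    return outputs


result = run_pipeline()
```

Here the two spellings of the same merchant collapse into a single stored record, and the empty row is dropped during cleaning.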

Storage is layered to support real‑time queries, platform operations, and large‑scale graph computation.

Knowledge operations support querying, editing, and visualization, as well as large‑scale graph embedding and reasoning.

Cross‑graph fusion combines merchant graphs with other domain graphs (e.g., credit, security) by aligning identical entities and linking different entity types, achieving knowledge reuse, reducing data duplication, and accelerating business value.
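The two fusion operations, aligning identical entities and linking entities of different types, can be sketched as follows. The article does not say how alignment is decided, so this example assumes a shared business key (a hypothetical `license_no` field); the `fuse` function, the relation names `same_as` and `related_to`, and the sample data are likewise illustrative only.

```python
def fuse(merchant_nodes, other_nodes, key="license_no"):
    """Fuse another domain graph's nodes into the merchant graph.

    Nodes sharing `key` and `type` are treated as the same entity and merged;
    nodes sharing `key` but differing in `type` are kept and linked.
    """
    # Index the other graph by the alignment key.
    index = {}
    for oid, attrs in other_nodes.items():
        if attrs.get(key) is not None:
            index.setdefault(attrs[key], []).append(oid)

    fused = {nid: dict(attrs) for nid, attrs in merchant_nodes.items()}
    edges = []
    for mid, attrs in merchant_nodes.items():
        for oid in index.get(attrs.get(key), []):
            other = other_nodes[oid]
            if other["type"] == attrs["type"]:
                edges.append((mid, "same_as", oid))
                # Merchant-graph values win when both graphs set a property.
                fused[mid] = {**other, **fused[mid]}
            else:
                edges.append((mid, "related_to", oid))
                fused.setdefault(oid, dict(other))
    return fused, edges


merchant = {
    "m1": {"type": "Merchant", "license_no": "L-001", "name": "Cafe Luna"},
    "m2": {"type": "Merchant", "name": "No License Shop"},
}
credit = {
    "c1": {"type": "Merchant", "license_no": "L-001", "credit_tier": "A"},
    "c2": {"type": "CreditAccount", "license_no": "L-001", "limit": 5000},
}
fused, edges = fuse(merchant, credit)
```

After fusion, `m1` carries the credit graph's `credit_tier` without the merchant record being duplicated, which is the kind of knowledge reuse the paragraph above describes.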

Cognitive Applications: A dual‑tower vector recall model incorporates dynamic graph embeddings, produced by a proprietary delta‑graph encoder, to improve coupon recommendation. The graph signal mitigates popularity bias and improves coverage of long‑tail items.
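The recall step of a dual-tower model can be sketched in a few lines. This is a toy illustration under strong assumptions: the towers here are a plain concatenate-and-normalize function rather than learned networks, the graph embeddings are hand-written vectors standing in for the output of the proprietary delta-graph encoder, and `tower`, `recall`, and the coupon names are hypothetical.

```python
import heapq


def tower(features, graph_embedding):
    """Stand-in for a learned tower: concatenate inputs and L2-normalize."""
    vec = list(features) + list(graph_embedding)
    norm = sum(x * x for x in vec) ** 0.5
    return [x / norm for x in vec] if norm else vec


def recall(user_vec, item_vecs, k=2):
    """Return the ids of the k items with the highest dot-product score."""
    scored = (
        (sum(u * i for u, i in zip(user_vec, vec)), item_id)
        for item_id, vec in item_vecs.items()
    )
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


# User tower: behavior features plus a graph embedding of the user's neighborhood.
user = tower([1.0, 0.0], [0.0, 1.0])

# Item tower: coupon features plus graph embeddings of the issuing merchant.
items = {
    "coupon_a": tower([1.0, 0.0], [0.0, 1.0]),  # matches on features and graph
    "coupon_b": tower([0.0, 1.0], [1.0, 0.0]),  # matches on neither
    "coupon_c": tower([1.0, 0.0], [0.0, 0.0]),  # matches on features only
}
top = recall(user, items, k=2)
```

Because the two towers are independent, item vectors can be precomputed and indexed, and only the user vector needs to be computed at request time. A long-tail coupon with few interactions can still score well if its graph embedding sits close to the user's.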

Open Platform:

Open knowledge releases concept‑level data such as intents, brands, categories, and tags.

Three public tasks are released: multi‑intent classification for mini‑programs, hierarchical concept prediction, and brand entity recognition.

SDKs (Java and Python) provide data integration, graph storage access, schema management, and algorithm deployment, enabling developers to build, query, and reason over graphs easily.

In summary, Ant Group's merchant knowledge graph now contains over 30 billion entities and 400 billion relationships, and the team plans to progressively open the data and tooling to the community to foster further research and application.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
