How Knowledge Graphs and GNNs Boost HS Code Classification Accuracy

This article explores how integrating unstructured business data into structured knowledge graphs and applying graph neural networks can overcome deep‑learning bottlenecks in NLP, dramatically improving HS‑code product classification accuracy from around 60% to over 75% through richer reasoning and multimodal knowledge.

Alibaba Cloud Developer

1. Background

Natural language is a symbolic description of the world. NLP operates on highly abstract, discrete symbols, which leaves deep-learning models facing bottlenecks in reasoning and cognition. To get past this limitation, the article proposes turning unstructured business data into structured knowledge, with knowledge graphs as the infrastructure and graph neural networks (GNNs) for reasoning.

2. Knowledge Graph

Knowledge is distilled from meaningful data; a knowledge graph stores, represents, extracts, fuses, and reasons over knowledge. Building a KG requires six components: schema modeling, acquisition, fusion, storage, model mining, and application.

3. Graph Neural Network (GNN)

Graph data is irregular: nodes have varying numbers of neighbors and no fixed ordering, so conventional grid-based convolution does not apply directly. GNNs, especially Graph Convolutional Networks (GCN), use a message-passing framework composed of AGGREGATE, COMBINE, and READOUT steps.

3.1 GCN Basic Principle

GCN consists of multiple graph-convolution layers, each of which aggregates first-order neighbor information. The computation breaks down into three steps, AGGREGATE, COMBINE, and READOUT, covered in the subsections below.
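In the notation commonly used for message-passing GNNs (the symbols here are assumptions, not taken from the article), the three steps for a node v at layer k can be written roughly as:

```latex
\begin{aligned}
a_v^{(k)} &= \mathrm{AGGREGATE}^{(k)}\big(\{\, h_u^{(k-1)} : u \in \mathcal{N}(v) \,\}\big) \\
h_v^{(k)} &= \mathrm{COMBINE}^{(k)}\big(h_v^{(k-1)},\; a_v^{(k)}\big) \\
h_G &= \mathrm{READOUT}\big(\{\, h_v^{(K)} : v \in G \,\}\big)
\end{aligned}
```

Here h_v^{(k)} is the representation of node v after k layers, N(v) is its first-order neighborhood, K is the number of layers, and h_G is the graph-level representation.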

3.2 AGGREGATE

AGGREGATE computes the layer-wise feature aggregation: the adjacency matrix A is multiplied by the node feature matrix X, the result is multiplied by a weight matrix W, and an activation σ is applied, giving σ(AXW).
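A minimal sketch of this step in plain Python, assuming ReLU as the activation σ and a toy three-node path graph. The function names, the graph, and the weights are illustrative assumptions, not values from the article.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def relu(m):
    """Element-wise ReLU, used here as the activation sigma (an assumption)."""
    return [[max(0.0, v) for v in row] for row in m]


def gcn_aggregate(adj, feats, weight):
    """One aggregation step: sigma(A X W)."""
    return relu(matmul(matmul(adj, feats), weight))


# Toy path graph 0 - 1 - 2 (illustrative only)
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
X = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
W = [[1.0, -1.0],
     [0.5, 1.0]]

H = gcn_aggregate(A, X, W)
print(H)
```

With the raw adjacency matrix, a node's own features are not part of its aggregate. The widely used GCN formulation adds self-loops and degree normalization to A, which this sketch leaves out for brevity.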

3.3 COMBINE

COMBINE concatenates the aggregated vector with the previous layer’s representation and passes the result through a dense (fully‑connected) layer.
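As a rough illustration, COMBINE for a single node could look like the sketch below. The ReLU after the dense layer, the weight layout (one row per output unit), and all numbers are assumptions made for the example.

```python
def combine(h_prev, h_agg, weight, bias):
    """Concatenate [h_prev ; h_agg] and apply a dense layer followed by ReLU.

    weight has one row per output unit; each row spans the concatenated vector.
    """
    z = h_prev + h_agg  # list concatenation, length len(h_prev) + len(h_agg)
    return [
        max(0.0, sum(w * v for w, v in zip(row, z)) + b)
        for row, b in zip(weight, bias)
    ]


# Illustrative values only
h_prev = [1.0, 2.0]    # node representation from the previous layer
h_agg = [0.5, -1.0]    # aggregated neighbor vector
W = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 1.0, 0.0, 1.0]]
b = [0.0, -2.0]

h_new = combine(h_prev, h_agg, W, b)
print(h_new)
```

Concatenation keeps the node's own representation separate from its neighborhood summary, so the dense layer can weight the two sources differently.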

3.4 READOUT

READOUT generates a graph-level representation from the node representations. Simple statistical methods (sum, max, average) are easy to implement but discard information about graph structure; learned pooling methods such as DIFFPOOL can preserve hierarchical structure.
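The simple statistical readouts can be sketched as follows. The example graphs are made up to show one kind of information loss: average pooling cannot tell a one-node graph from a three-node graph with identical node features, while sum pooling can.

```python
def readout(node_feats, mode="sum"):
    """Pool per-node feature vectors into one graph-level vector."""
    columns = list(zip(*node_feats))  # one tuple per feature dimension
    if mode == "sum":
        return [sum(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    raise ValueError(f"unknown readout mode: {mode}")


# Two illustrative graphs whose nodes all carry the same feature vector
small_graph = [[1.0, 2.0]]
large_graph = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]

mean_small = readout(small_graph, "mean")
mean_large = readout(large_graph, "mean")
sum_small = readout(small_graph, "sum")
sum_large = readout(large_graph, "sum")
print(mean_small, mean_large, sum_small, sum_large)
```

None of these readouts looks at the edges, which is the gap that learned hierarchical pooling such as DIFFPOOL aims to close.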

4. Application to HS‑Code Product Classification

HS-code classification is an exacting NLP task that requires precise reasoning. Traditional deep-learning NLP models reach only 59.3% accuracy because of noisy inputs, missing domain knowledge, and an inability to perform logical calculations. With a domain-specific knowledge graph and a GCN, accuracy rises to 76%, a gain of 16.7 percentage points (roughly 28% in relative terms).

The KG schema models product names, declaration attributes, and their relationships; the GCN ingests word2vec‑based node embeddings and heterogeneous edge types, then performs message‑passing and READOUT to predict the correct HS‑code.
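One possible in-memory layout for such a schema is a list of (head, relation, tail) triples grouped by edge type, so that each relation can be handled separately during message passing. The sketch below is hypothetical: the product, the attribute names, and the relation labels are invented for illustration and do not come from the article's dataset.

```python
from collections import defaultdict

# Hypothetical triples (head, relation, tail); all names are illustrative.
TRIPLES = [
    ("product:cotton t-shirt", "has_attribute", "attr:material"),
    ("product:cotton t-shirt", "has_attribute", "attr:knit_or_woven"),
    ("attr:material", "has_value", "value:cotton"),
    ("attr:knit_or_woven", "has_value", "value:knitted"),
]


def build_typed_adjacency(triples):
    """Group undirected neighbors by relation: {relation: {node: set(neighbors)}}."""
    adj = defaultdict(lambda: defaultdict(set))
    for head, rel, tail in triples:
        adj[rel][head].add(tail)
        adj[rel][tail].add(head)
    return adj


graph = build_typed_adjacency(TRIPLES)
for rel, nodes in graph.items():
    for node, nbrs in nodes.items():
        print(rel, node, sorted(nbrs))
```

Keeping one adjacency structure per relation is a common way to let a GNN apply different weights to different edge types. The article does not say whether its model does so.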

5. Experiments

Two experiments were conducted:

Comparing sum vs. average READOUT strategies for graph‑level pooling.

Comparing a simple star‑shaped KG (only product‑attribute edges) with a complex heterogeneous KG (additional attribute‑attribute and value‑attribute edges).

Results show that richer graph structures and more expressive READOUT improve classification accuracy.
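To make the structural difference between the two graph variants concrete, the sketch below builds both from the same hypothetical declaration and compares what an attribute node can see after one round of message passing. Node names and edges are assumptions for illustration only.

```python
def neighbors(edges):
    """Build an undirected neighbor map from a list of (u, v) edges."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return nbrs


# Star-shaped KG: only product-attribute edges (hypothetical nodes)
STAR_EDGES = [
    ("product", "attr:material"),
    ("product", "attr:usage"),
]

# Heterogeneous KG: adds attribute-attribute and value-attribute edges
EXTRA_EDGES = [
    ("attr:material", "attr:usage"),
    ("attr:material", "value:cotton"),
]

star = neighbors(STAR_EDGES)
hetero = neighbors(STAR_EDGES + EXTRA_EDGES)

print(sorted(star["attr:material"]))
print(sorted(hetero["attr:material"]))
```

In the star graph an attribute node receives messages only from the product node, whereas the heterogeneous graph also routes information from related attributes and values. This is one plausible reading of why the richer structure helps.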

6. Future Directions

Extracting and fusing large rule bases into knowledge graphs.

Combining rule‑based teacher networks with GNN student networks to guide learning.

Building multimodal knowledge graphs that incorporate images, audio, and other data types.

References

Inductive Representation Learning on Large Graphs, https://arxiv.org/abs/1706.02216

Hierarchical Graph Representation Learning with Differentiable Pooling, https://arxiv.org/abs/1806.08804

https://www.cnblogs.com/SivilTaram/p/graph_neural_network_3.html

https://zhuanlan.zhihu.com/p/68064309

https://zhuanlan.zhihu.com/p/37057052

