
Entity Linking System for Travel Knowledge Graph at Ctrip AI R&D

This article presents the end-to-end entity linking solution that Ctrip's travel AI team built on a large-scale tourism knowledge graph. It covers the background, the technical architecture, the three core modules (mention detection, candidate generation, and disambiguation, built on BERT and prefix-tree techniques), and real-world applications such as search, intelligent customer service, and POI data maintenance.

Ctrip Technology

The Ctrip travel AI R&D team develops AI products for the travel division, focusing on a knowledge‑graph‑driven entity linking service that improves search, Q&A, and information extraction for tourism.

Background : The rapid growth of web data, combined with linguistic ambiguity (polysemy, synonymy), makes it hard to retrieve the correct information. In tourism, points of interest (POIs), each described by a name, address, coordinates, and category, are the central entities, so linking textual mentions to the right POI entity is crucial.

Problem Analysis : Entity linking consists of mention detection, candidate generation, and candidate disambiguation. Traditional methods include dictionary‑based and statistical models (HMM, CRF), while modern approaches use neural networks (CNN, RNN, Transformer) and pretrained models such as BERT.

Travel Knowledge Graph : POI data are stored in Neo4j and Nebula graph databases, with a schema covering 18 entity types and 12 relation types, totaling about 10 million entities and 37 million triples. Automatic update and monitoring pipelines keep the graph fresh.

Technical Solution : The system follows a three‑stage pipeline. Mention detection combines a neural model (BERT‑based pointer network) with a prefix‑tree of all alias strings to achieve high‑recall coverage. Candidate generation retrieves POI candidates via alias‑to‑entity relationships, applying path‑based filtering. Disambiguation uses a BERT‑based interactive semantic‑matching model that concatenates the query and candidate description, extracts CLS, head, and tail token embeddings, and predicts a linking probability via a sigmoid‑activated linear layer.
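The prefix-tree half of mention detection can be illustrated with a minimal sketch. It assumes a plain character-level trie with longest-match scanning; the class and function names (`AliasTrie`, `find_mentions`, `merge_mentions`) are hypothetical, and the neural detector is represented only by a list of spans it would return. This is not the team's implementation, only one way the idea could look.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end, surface text)


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_alias = False  # True when the path from the root spells a full alias


class AliasTrie:
    """Character-level prefix tree over all known alias strings."""

    def __init__(self, aliases: List[str]) -> None:
        self.root = TrieNode()
        for alias in aliases:
            self.insert(alias)

    def insert(self, alias: str) -> None:
        node = self.root
        for ch in alias:
            node = node.children.setdefault(ch, TrieNode())
        node.is_alias = True

    def find_mentions(self, text: str) -> List[Span]:
        """Scan left to right, keeping the longest alias that starts at each position."""
        mentions: List[Span] = []
        i = 0
        while i < len(text):
            node = self.root
            longest_end = -1
            j = i
            while j < len(text) and text[j] in node.children:
                node = node.children[text[j]]
                j += 1
                if node.is_alias:
                    longest_end = j
            if longest_end != -1:
                mentions.append((i, longest_end, text[i:longest_end]))
                i = longest_end  # continue after the match (non-overlapping spans)
            else:
                i += 1
        return mentions


def merge_mentions(model_spans: List[Span], trie_spans: List[Span]) -> List[Span]:
    """Union of both detectors' spans, de-duplicated and sorted by offset (an assumed merge rule)."""
    return sorted(set(model_spans) | set(trie_spans))


trie = AliasTrie(["东湖", "武汉东湖", "West Lake"])
print(trie.find_mentions("武汉东湖怎么走"))
```

Taking the union favors recall: the trie catches every alias already in the graph, while the neural model can propose spans that no alias covers yet. Precision is then left to the later disambiguation stage.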

Engineering optimizations include Redis caches for alias‑to‑entity‑id and entity‑attribute mappings, reducing latency from graph database queries.
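As a rough illustration of the caching idea, the sketch below uses a cache-aside pattern for the two mappings mentioned above. A Python dict stands in for Redis so the example stays self-contained, and the graph lookups are toy dictionaries with invented records; `CacheAside`, `candidates_for`, and the data are all hypothetical.

```python
from typing import Any, Callable, Dict, Hashable, List


class CacheAside:
    """Check the cache first; on a miss, call the loader (the graph query) and store the result."""

    def __init__(self, loader: Callable[[Hashable], Any]) -> None:
        self._loader = loader
        self._store: Dict[Hashable, Any] = {}  # stand-in for a Redis instance (assumption)
        self.misses = 0  # number of times the backing graph query was needed

    def get(self, key: Hashable) -> Any:
        if key in self._store:
            return self._store[key]
        self.misses += 1
        value = self._loader(key)
        self._store[key] = value
        return value


# Toy stand-ins for graph database queries (invented data, for illustration only).
GRAPH_ALIAS_EDGES: Dict[str, List[str]] = {"East Lake": ["poi:1", "poi:2"]}
GRAPH_ATTRIBUTES: Dict[str, Dict[str, str]] = {
    "poi:1": {"name": "East Lake Scenic Area", "city": "Wuhan"},
    "poi:2": {"name": "East Lake", "city": "Shaoxing"},
}

alias_cache = CacheAside(lambda alias: GRAPH_ALIAS_EDGES.get(alias, []))
attr_cache = CacheAside(lambda entity_id: GRAPH_ATTRIBUTES.get(entity_id, {}))


def candidates_for(alias: str) -> List[Dict[str, str]]:
    """Resolve an alias to candidate entities with attributes, touching the graph only on misses."""
    return [{"id": eid, **attr_cache.get(eid)} for eid in alias_cache.get(alias)]
```

Because the graph is updated automatically, a production cache would also need an expiry or invalidation policy so that stale aliases do not linger; that part is omitted here.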

Functional Modules :

5.1 Mention Detection – neural network + prefix‑tree.

5.2 Candidate Generation – alias‑driven graph lookup with rule‑based path filtering.

5.3 Candidate Disambiguation – BERT‑based interaction model trained with binary cross‑entropy loss.
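The scoring head of the disambiguation model can be sketched in a few lines. The example assumes the BERT encoder has already produced three vectors for a query and candidate pair (the CLS embedding plus the head and tail token embeddings, taken here to be the first and last tokens of the mention span, which is an assumption). It then applies a linear layer with a sigmoid and computes binary cross-entropy. Plain Python lists replace tensors, and all names are hypothetical.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def link_probability(
    cls_vec: Sequence[float],
    head_vec: Sequence[float],
    tail_vec: Sequence[float],
    weights: Sequence[float],
    bias: float,
) -> float:
    """Concatenate the three embeddings, apply one linear layer, squash with a sigmoid."""
    features = list(cls_vec) + list(head_vec) + list(tail_vec)
    if len(features) != len(weights):
        raise ValueError("weights must match the concatenated feature length")
    logit = sum(w * f for w, f in zip(weights, features)) + bias
    return sigmoid(logit)


def binary_cross_entropy(prob: float, label: int, eps: float = 1e-12) -> float:
    """Loss for one (mention, candidate) pair; label is 1 for the correct entity, else 0."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


# Toy 2-dimensional embeddings and weights, chosen only to exercise the functions.
p = link_probability([0.5, -0.2], [0.1, 0.3], [0.0, 0.4], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0], 0.0)
print(round(p, 3), round(binary_cross_entropy(p, 1), 3))
```

At inference time, each candidate for a mention would be scored this way and the highest-probability candidate selected, possibly subject to a threshold below which the mention is left unlinked.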

Practical Scenarios :

Travel search – resolves ambiguous POI queries (e.g., "武汉东湖", East Lake in Wuhan) to the correct entity.

Intelligent customer service – improves slot‑filling F1 by more than 12%.

POI key‑information updates – raises extraction accuracy by ~6%.

Duplicate POI and hierarchical POI detection – achieves >90% correctness.

Conclusion & Outlook : The system demonstrates the value of integrating entity linking with a tourism knowledge graph, and future work will explore tighter coupling with retrieval and ranking, lightweight models, and broader scenario coverage.

Tags: graph database, NLP, knowledge graph, BERT, semantic matching, entity linking, travel AI