Optimizing Chinese Train Transfers with Graph Databases: A Step‑by‑Step Guide
This tutorial shows how to model nationwide train schedules as a graph, prepare CSV data, import it into Huawei Cloud GES, and write Cypher queries that discover faster, more flexible transfer routes than the official 12306 service, complete with Python code examples and visual results.
Background
During the National Day holiday, direct train tickets on many routes sell out, and 12306 offers only a limited set of transfer options, often with long layovers. Stations and schedules naturally form a graph, so a graph database can be used to search for better transfer routes.
Data Preparation
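Train schedule data (e.g., 列车时刻表.txt, "train timetable") is downloaded from public sources. The raw files are parsed in Python to extract stations, stops, and trains, then written to CSV files for vertices (station.csv, train.csv, stop.csv) and edges (next.csv, arrive_at.csv). The output directories are created first; the generator that follows reads a raw file line by line, optionally skipping a header line and a leading comment block.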
mkdir -p graph_data/edge
mkdir -p graph_data/vertex
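def read_files(path, skip_comment=False, skip_header=False):
    """Yield the lines of a raw timetable file.

    skip_header drops the first line; skip_comment drops everything up to
    and including the first blank line.
    """
    space_read = not skip_comment   # True once the leading comment block has been passed
    header_read = not skip_header   # True once the header line has been consumed
    # Assumption: the raw files are UTF-8 encoded; adjust if they are not.
    with open(path, 'r', encoding='utf-8') as f:
        for line in f:
            if not header_read:
                header_read = True
                continue
            if not space_read:
                if line == "\n":
                    space_read = True
                continue
            yield line
Schema Definition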
The graph model defines three node types: Station, Stop (a train's arrival at a station), and Train. NEXT edges connect consecutive stops of the same train, and ARRIVE_AT edges link each stop to its station.
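The mapping from one train's timetable to this schema can be sketched in Python. This is only an illustration under stated assumptions: the `trainNo_sequence` stop ID scheme, the column order of the rows, the train code `T001`, and the sample times are all hypothetical, and the exact CSV layout GES expects should be checked against its documentation. Station names are used as Station IDs, which matches how the query later in this guide looks up 南京南 by ID.

```python
import csv
import io


def build_graph_rows(train_no, stops):
    """Turn one train's ordered stop list into Stop vertices and edges.

    stops: list of (station, arrives, departs) tuples in travel order.
    Returns (stop_rows, next_edges, arrive_edges).
    """
    stop_rows, next_edges, arrive_edges = [], [], []
    for seq, (station, arrives, departs) in enumerate(stops):
        stop_id = f"{train_no}_{seq}"  # hypothetical ID scheme
        stop_rows.append((stop_id, "Stop", train_no, station, arrives, departs))
        # Every stop points at the station where it happens.
        arrive_edges.append((stop_id, station, "ARRIVE_AT"))
        # Consecutive stops of the same train are chained with NEXT.
        if seq > 0:
            next_edges.append((f"{train_no}_{seq - 1}", stop_id, "NEXT"))
    return stop_rows, next_edges, arrive_edges


def rows_to_csv(rows):
    """Serialize rows to CSV text (in memory, for illustration)."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Made-up sample timetable for a hypothetical train T001.
sample_stops = [
    ("南京南", "2021-10-01 08:00:00", "2021-10-01 08:05:00"),
    ("徐州东", "2021-10-01 09:20:00", "2021-10-01 09:23:00"),
    ("德州东", "2021-10-01 11:00:00", "2021-10-01 11:02:00"),
]
stop_rows, next_edges, arrive_edges = build_graph_rows("T001", sample_stops)
print(rows_to_csv(next_edges))
```

A train with n stops yields n Stop vertices, n ARRIVE_AT edges, and n - 1 NEXT edges, which is a quick way to sanity-check the generated files.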
Importing into GES
The CSV files are uploaded to OBS (Huawei Cloud Object Storage Service), a GES instance is created (choose at least the million-edge specification), and the data is imported via the GES console or API.
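Before uploading, it may be worth checking that every edge endpoint also exists as a vertex, since dangling edges are a common cause of import problems in graph databases generally. The check below is a generic sanity test, not part of any GES or OBS API; the function name and the sample IDs are hypothetical.

```python
def find_dangling_edges(vertex_ids, edges):
    """Return the (source, target) pairs that reference an unknown vertex ID."""
    known = set(vertex_ids)
    return [(src, dst) for src, dst in edges if src not in known or dst not in known]


# Made-up sample: the last edge points at a station missing from the vertex list.
vertex_ids = ["T001_0", "T001_1", "南京南"]
edges = [("T001_0", "T001_1"), ("T001_0", "南京南"), ("T001_1", "徐州东")]
dangling = find_dangling_edges(vertex_ids, edges)
print(dangling)
```

An empty result suggests the vertex and edge files are consistent with each other; any pair returned should be fixed in the CSV files before import.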
Querying Transfer Routes
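Cypher queries can retrieve direct trains, multi-stop routes, and transfer options. A transfer query, for example, finds trains from Nanjing South to Taiyuan with a single transfer, enforces a minimum 15-minute layover, and limits the number of results. The query below shows the direct-train case: it follows NEXT edges from stops at Nanjing South (南京南) to any stop whose station name contains Taiyuan (太原).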
match (n:Station)<-[:ARRIVE_AT]-(s:Stop) where id(n) in ['南京南']
match p=(s)-[:NEXT*1..30]->(s1) where s1.station contains '太原'
return s.trainNo as `车次`,
  subString(toString(s.arrives),11) as `出发`,
  subString(toString(s1.departs),11) as `到达`,
  subString(toString(datetime(timestamp(s1.departs)-timestamp(s.arrives))),11) as `耗时`,
  [x in nodes(p)|x.station] as `途径`
limit 10
The result columns are 车次 (train number), 出发 (departure), 到达 (arrival), 耗时 (duration), and 途径 (stations along the way). Additional queries filter by departure time (e.g., before 11 a.m.) or restrict transfers to specific stations such as Dezhou (德州).
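The transfer logic described in this section (one transfer, a minimum 15-minute layover, an optional departure cutoff, an optional fixed transfer station, and a result limit) can be sketched in plain Python over an in-memory list of direct legs. This is not the Cypher used against GES; it only illustrates the filtering rules. The `Leg` type, the function name, the train codes, and all times are made up for the example.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Leg(NamedTuple):
    """One direct ride on a single train (hypothetical structure)."""
    train_no: str
    origin: str
    dest: str
    departs: datetime
    arrives: datetime


def one_transfer_routes(legs, origin, dest, min_layover=timedelta(minutes=15),
                        depart_before=None, via=None, limit=10):
    """Return up to `limit` (total_time, first_leg, second_leg) tuples, fastest first."""
    routes = []
    for a in legs:
        if a.origin != origin or a.dest == dest:
            continue  # first leg must start at origin and not already be direct
        if depart_before is not None and a.departs >= depart_before:
            continue  # optional departure-time cutoff
        if via is not None and a.dest != via:
            continue  # optional fixed transfer station
        for b in legs:
            if b.origin != a.dest or b.dest != dest or b.train_no == a.train_no:
                continue
            if b.departs - a.arrives < min_layover:
                continue  # layover too short to change trains
            routes.append((b.arrives - a.departs, a, b))
    routes.sort(key=lambda r: r[0])
    return routes[:limit]


def t(hour, minute=0):
    """Helper: a time on a made-up travel day."""
    return datetime(2021, 10, 1, hour, minute)


# Made-up legs; train codes and times are illustrative only.
legs = [
    Leg("A1", "南京南", "德州", t(8), t(11)),
    Leg("B1", "德州", "太原", t(11, 10), t(15)),      # 10-minute layover: rejected
    Leg("B2", "德州", "太原", t(11, 30), t(15, 30)),  # 30-minute layover: accepted
    Leg("A2", "南京南", "郑州", t(12), t(15)),
    Leg("B3", "郑州", "太原", t(16), t(20)),
]
routes = one_transfer_routes(legs, "南京南", "太原")
for total, a, b in routes:
    print(a.train_no, "->", b.train_no, "via", a.dest, "total", total)
```

Passing `depart_before=t(11)` keeps only itineraries whose first train leaves before 11 a.m., and `via="德州"` restricts the transfer to Dezhou, mirroring the additional filters mentioned above.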
Conclusion
Compared with the 12306 website, graph‑database queries provide more transfer alternatives and shorter travel times, enabling travelers to select the best itinerary based on ticket availability and personal preferences.
