Using Neo4j to Complement MySQL for Complex Relationship Queries
The article demonstrates how MySQL handles simple relationship queries efficiently but struggles with deep relational traversals, and shows how the graph database Neo4j can replace MySQL in such scenarios, providing faster query execution and better scalability for multi‑level social connections.
MySQL can handle most online business workloads well, but it becomes inefficient when dealing with complex relationship queries such as multi‑level social connections; in these cases a NoSQL graph database like Neo4j can fill the gap.
To illustrate, a simple network of people (Xiao Song, Xiao Li, Xiao Yang, Xiao Ai, Xiao Xu, Xiao Na, Xiao Qiao) is used, and three requirements are defined:
Count the total number of users.
Find users who know Xiao Yang and are also known by Xiao Yang.
Find the fourth‑degree connections starting from Xiao Yang (i.e., the users reached after four "knows" hops).
In MySQL two tables are created: user1 (stores user names) and relation1 (stores pairs of users that know each other). Sample DDL and data insertion are shown below.
mysql:ytt>create table user1 (name varchar(64));
Query OK, 0 rows affected (0.09 sec)
mysql:ytt>insert user1 values ("小宋"),("小李"),("小杨"),("小爱"),("小徐"),("小娜"),("小乔");
Query OK, 7 rows affected (0.01 sec)
mysql:ytt>create table relation1 (user_name varchar(64),friend_name varchar(64));
Query OK, 0 rows affected (0.07 sec)
mysql:ytt>insert relation1 values ("小宋","小李"),("小李","小杨"),("小杨","小爱"),("小爱","小徐"),("小徐","小娜"),("小娜","小乔");
Query OK, 6 rows affected (0.01 sec)The three requirements are then implemented with MySQL queries:
-- Requirement 1: total user count
select count(*) as total_number from user1;
-- Requirement 2: users who know Xiao Yang and are known by Xiao Yang
select * from (
select (case when friend_name = "小杨" then user_name when user_name = "小杨" then friend_name end) as user_name
from relation1
) as sub_table where user_name is not null;
-- Requirement 3: fourth‑degree connections from Xiao Yang
select d.friend_name
from relation1 b
inner join relation1 a on b.user_name = "小杨" and b.friend_name = a.user_name
inner join relation1 c on a.friend_name = c.user_name
inner join relation1 d on c.friend_name = d.user_name;While the first two queries run quickly, the third one becomes increasingly slow as the join depth grows; on a dataset with 1,100 users and 100,000 relationships it took more than four minutes.
Using Neo4j, the same data is modeled as nodes with a KNOWS relationship. The nodes and relationships are created with the following Cypher statements:
neo4j@ytt> create (x1:user {name:"小宋"}),
(x2:user {name:"小李"}),
(x3:user {name:"小杨"}),
(x4:user {name:"小爱"}),
(x5:user {name:"小徐"}),
(x6:user {name:"小娜"}),
(x7:user {name:"小乔"})
with x1,x2,x3,x4,x5,x6,x7
create (x1)-[:认识]->(x2),
(x2)-[:认识]->(x3),
(x3)-[:认识]->(x4),
(x4)-[:认识]->(x5),
(x5)-[:认识]->(x6),
(x6)-[:认识]->(x7);The three requirements are answered with concise Cypher queries:
-- Requirement 1
match (x:user) return count(*) as total_number;
-- Requirement 2
match (y1:user)-[:认识]->(x:user {name:"小杨"})-[:认识]->(y2:user) return y1.name, y2.name;
-- Requirement 3 (four‑hop)
match (x:user {name:"小杨"})-[:认识*4]->(y) return y.name;All three queries execute in less than a second, demonstrating Neo4j’s advantage for deep relationship traversals.
To test scalability, 1,100 users and 100,000 relationships are loaded into MySQL, then imported into Neo4j using LOAD CSV commands and an index on :user(name) is created. The fourth‑degree query in Neo4j completes in under one second, whereas the same query in MySQL required over four minutes.
Conclusion: For scenarios that involve complex, multi‑level relationship processing, graph databases such as Neo4j outperform traditional relational databases like MySQL, offering simpler query syntax and dramatically better performance as data size grows.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.