Why Alibaba’s Java Development Manual Bans Joins Over Three Tables – Experiments, Analysis, and Oracle Comparison
The article investigates the Alibaba Java development guideline that forbids joining more than three tables by designing MySQL and Oracle experiments, generating massive test data, measuring query performance, analyzing index usage, and concluding why the rule exists and when it can be relaxed.
The author starts by quoting the Alibaba Java Development Manual’s rule that joins involving more than three tables are prohibited, and asks why this restriction exists.
Experiment environment : VMware 10, CentOS 7.4, MySQL 5.7.22 (2 GB buffer, SSD) and later Oracle on a similar machine.
Test schema : Four tables – student , teacher , course , and sc (student‑course relationship). The goal query retrieves the name and score of the highest‑scoring student for a specific teacher (tname553).
SQL used :
select Student.Sname,course.cname,score
from Student,SC,Course,Teacher
where Student.s_id=SC.s_id
and SC.c_id=Course.c_id
and sc.t_id=teacher.t_id
and Teacher.Tname='tname553'
and SC.score=(select max(score) from SC where sc.t_id=teacher.t_Id);The author decomposes the query into three simpler statements to illustrate the logic.
Data generation scripts create massive datasets: up to 50 million rows for student , 1 000 rows for course , and 50 million rows for sc . Functions are written in MySQL stored procedures and Oracle PL/SQL to insert data in batches, with indexes added on join keys ( idx_s_id , idx_t_id , idx_c_id ).
Results on MySQL :
When join keys lack indexes, queries become extremely slow, confirming the need for proper indexing.
With simple queries on 1 billion‑row tables, MySQL still returns results but performance degrades noticeably.
Four‑table joins on 150 million rows exceed MySQL’s practical limits on the author’s hardware.
The author observes a “performance threshold” around 15 million rows for smooth four‑table joins.
Oracle comparison :
Running the same four‑table join on Oracle with similar data volume finishes in about 26 seconds even without indexes, demonstrating Oracle’s stronger join processing capabilities.
Conclusion :
The Alibaba rule originates from MySQL’s difficulty handling large multi‑table joins; with smaller tables the rule is unnecessary, but for high‑concurrency, large‑scale systems it is safer to keep joins under three tables and push complex logic to the application layer.
The article also provides complete DDL and data‑generation scripts for MySQL and Oracle, allowing readers to reproduce the experiments.
IT Xianyu
We share common IT technologies (Java, Web, SQL, etc.) and practical applications of emerging software development techniques. New articles are posted daily. Follow IT Xianyu to stay ahead in tech. The IT Xianyu series is being regularly updated.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.