Performance Comparison of JOIN vs IN in MySQL Queries
This article experimentally compares the performance of JOIN and IN approaches for retrieving order and user data in MySQL, showing that JOIN is faster on small datasets while IN may become costly with large data volumes, and discusses code implementations and practical conclusions.
1. Table Structure
Two tables are created: user (id, name, gender, mobile, create_time) and order (id, price, user_id, product_id, status).
CREATE TABLE `user` (
`id` int NOT NULL AUTO_INCREMENT,
`name` varchar(64) NOT NULL COMMENT '姓名',
`gender` smallint DEFAULT NULL COMMENT '性别',
`mobile` varchar(11) NOT NULL COMMENT '手机号',
`create_time` datetime NOT NULL COMMENT '创建时间',
PRIMARY KEY (`id`),
UNIQUE KEY `mobile` (`mobile`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=1005 DEFAULT CHARSET=utf8mb4; CREATE TABLE `order` (
`id` int unsigned NOT NULL AUTO_INCREMENT,
`price` decimal(18,2) NOT NULL,
`user_id` int NOT NULL,
`product_id` int NOT NULL,
`status` smallint NOT NULL DEFAULT '0' COMMENT '订单状态',
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`),
KEY `product_id` (`product_id`)
) ENGINE=InnoDB AUTO_INCREMENT=202 DEFAULT CHARSET=utf8mb4;2. Small‑Data Experiment
Insert 1,000 random users and 100 random orders, then compare three methods for fetching all orders with their user names:
JOIN query
Separate queries using IN
PHP script that first selects orders, builds an IN list of user IDs, then queries users.
SQL examples:
explain format=json select order.id, price, user.`name` from `order` join user on order.user_id = user.id; select `id`,price,user_id from `order`;
select `name` from user where id in ( ...list of ids... );Benchmark using ab -n 100 -c 10 shows the JOIN version completes faster than the IN version for this data size.
3. Medium‑Data Experiment
Increase both tables to 10,000 rows each and repeat the three methods. Screenshots indicate that JOIN still outperforms the IN approach, though the gap narrows.
4. Large‑Data Experiment
Scale up to 100,000 users and 1,000,000 orders. The JOIN query remains faster, but the IN query hits SQL length limits, causing execution failures and requiring query splitting.
5. Conclusions
When the dataset is small, JOIN offers both lower cost and higher speed. For large datasets, IN can become expensive due to long query strings and may exceed SQL length limits, while JOIN maintains better performance despite higher per‑row cost. The author suggests looking for alternative strategies beyond simple JOIN or IN.
6. Promotional Note
The author invites readers to join a knowledge‑sharing community, offering ChatGPT accounts, training materials, and other benefits, and includes several links to related resources and paid offerings.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.