Databases 10 min read

JOIN vs IN: Choosing the Faster MySQL Query Method

Through a series of local experiments on MySQL tables with varying data volumes, this article analyzes the performance trade‑offs between JOIN queries and IN‑list queries, detailing execution plans, PHP benchmarking, and the impact of large IN lists on query speed and reliability.

Architect
Architect
Architect
JOIN vs IN: Choosing the Faster MySQL Query Method

Table Structure

Two tables are used: user – columns: id (PK, auto‑increment), name (varchar(64)), gender (smallint), mobile (varchar(11), unique), create_time (datetime). order – columns: id (PK, unsigned, auto‑increment), price (decimal(18,2)), user_id (int), product_id (int), status (smallint, default 0). Indexes on user_id and product_id.

CREATE TABLE `user` (
  `id` int NOT NULL AUTO_INCREMENT,
  `name` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL COMMENT '姓名',
  `gender` smallint DEFAULT NULL COMMENT '性别',
  `mobile` varchar(11) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL COMMENT '手机号',
  `create_time` datetime NOT NULL COMMENT '创建时间',
  PRIMARY KEY (`id`),
  UNIQUE KEY `mobile` (`mobile`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=1005 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;

CREATE TABLE `order` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `price` decimal(18,2) NOT NULL,
  `user_id` int NOT NULL,
  `product_id` int NOT NULL,
  `status` smallint NOT NULL DEFAULT '0' COMMENT '订单状态',
  PRIMARY KEY (`id`),
  KEY `user_id` (`user_id`),
  KEY `product_id` (`product_id`)
) ENGINE=InnoDB AUTO_INCREMENT=202 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;

Small‑Data Scenario (1,000 users, 100 orders)

Goal: retrieve every order together with the corresponding user name. Three approaches are compared:

JOIN : a single SQL statement joining the two tables.

Separate queries : first fetch all rows from order, then fetch the needed users with an IN list built from the returned user_id s.

PHP benchmark : scripts using mysqli and ab -n 100 -c 10 to measure response time.

JOIN

EXPLAIN FORMAT=JSON SELECT `order`.id, price, `user`.name FROM `order` JOIN `user` ON `order`.user_id = `user`.id;

The explain output shows a single driver table; the join cost grows linearly with the number of rows returned.

Separate Queries

SELECT `id`, price, user_id FROM `order`;
SELECT `id`, `name` FROM `user` WHERE id IN (8,11,20,...,995);  -- IN list contains all user_ids from the previous result

Because the IN list is short, the second query’s cost is comparable to the join.

PHP Benchmark

Both methods are implemented in PHP. The benchmark runs 100 requests with 10 concurrent connections (ApacheBench). The separate‑query version builds the IN list dynamically, executes two queries, then merges the results in PHP.

$mysqli = new mysqli('127.0.0.1','root','root','test');
if ($mysqli->connect_error) { die('Connect Error ('.$mysqli->connect_errno.') '.$mysqli->connect_error); }
// Separate queries version
$result = $mysqli->query('SELECT `id`,price,user_id FROM `order`');
$orders = $result->fetch_all(MYSQLI_ASSOC);
$userIds = implode(',', array_column($orders, 'user_id'));
$result = $mysqli->query("SELECT `id`,`name` FROM `user` WHERE id IN ({$userIds})");
$users = $result->fetch_all(MYSQLI_ASSOC);
// Build associative map of user id → name
$userMap = [];
foreach ($users as $u) { $userMap[$u['id']] = $u['name']; }
// Merge
$res = [];
foreach ($orders as $o) {
    $res[] = [
        'id'    => $o['id'],
        'price' => $o['price'],
        'name'  => $userMap[$o['user_id']] ?? ''
    ];
}
var_dump($res);
$mysqli->close();

The join version replaces the two‑step process with a single query:

$mysqli = new mysqli('127.0.0.1','root','root','test');
$result = $mysqli->query('SELECT `order`.id, price, `user`.name FROM `order` JOIN `user` ON `order`.user_id = `user`.id');
$orders = $result->fetch_all(MYSQLI_ASSOC);
var_dump($orders);
$mysqli->close();

Benchmark results show the join method completes noticeably faster than the separate‑query approach.

Medium‑Data Scenario (10,000 users, 10,000 orders)

The same three approaches are repeated with ten‑thousand rows in each table. Query plans (EXPLAIN) remain similar: the join still uses a single driver table, while the separate‑query path incurs the same DB cost plus PHP‑level merging overhead. Timing measurements confirm the join is consistently faster.

Large‑Data Scenario (100,000 users, 1,000,000 orders)

When the IN list grows to thousands of values, the separate‑query approach starts to fail. MySQL rejects overly long IN clauses or forces the client to split the list into multiple statements, which dramatically increases total execution time. The join query continues to run without hitting the length limit and remains the faster option.

Conclusions

On the author’s local machine, 100,000 rows already constitute a heavy load; larger data sets may cause the system to freeze.

With small data sets, a JOIN is both cheaper (fewer round‑trips) and faster.

With larger data sets, the cost of a JOIN rises, but it still tends to be faster than fetching rows separately and stitching them in application code.

When the IN list becomes very large, MySQL may refuse to execute the statement (“SQL too long” error), forcing the developer to split the list, which adds overhead and breaks pagination.

Source: https://juejin.cn/post/7169567387527282701

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

performanceSQLmysqlDatabase OptimizationPHPJOININ
Architect
Written by

Architect

Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.