
Why Hash Join Beats Nested Loop Join and When It Fails

This article explains why hash joins usually outperform nested‑loop joins, how to force a hash join in SQL, which data‑type combinations prevent hash joins, and how the TD (Teradata) and Oracle compatibility modes differ in this respect.

MaGe Linux Operations

1. Hash join usually outperforms nested‑loop join

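A nested‑loop join compares every row of the outer table with every row of the inner table, so its cost grows as O(N × M) for inputs of N and M rows (O(N²) when the two tables are of similar size). A hash join builds a hash table on one input and probes it with the other, so its cost grows as roughly O(N + M). For large inputs, hash joins are therefore generally preferred.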

During SQL tuning you can force a hash join in two ways:

Disable nested‑loop joins at the session level: set enable_nestloop to off;

Use a hint in the query: /*+ hashjoin(a b) */ makes tables a and b use a hash join.

CREATE DATABASE test_td WITH DBCOMPATIBILITY='td';
create table dim_day(day_code char(8));
create table dwr_rpo as select current_date - 1 as day_code; -- returns date type

explain select *
from dwr_rpo a
left join dim_day c on c.day_code = a.day_code;

-- Sample execution plan (simplified)
1 | Streaming (type: GATHER)              | 1310148 rows
2 |   Nested Loop Left Join (3, 4)        | 1310148 rows, 1 MB memory
3 |     Seq Scan on dwr_rpo a             | 1310148 rows, 1 MB memory
4 |     Materialize                       | 109575 rows, 16 MB memory
5 |       Streaming (type: BROADCAST)     | 109575 rows, 2 MB memory
6 |         Seq Scan on dim_day c         | 36525 rows, 1 MB memory

Even with these settings, the planner may still choose a nested‑loop join, as in the plan above, because a hash join is only possible when the equality comparison between the two join column types supports hashing.
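To check this on the example above, both settings can be applied and the plan inspected again. This is a minimal sketch that reuses the tables and aliases from the example. Placing the hint directly after the SELECT keyword and restoring the setting with RESET are assumptions about the target database, so verify them against its documentation.

```sql
-- Session-level switch: tell the planner to avoid nested-loop joins.
set enable_nestloop to off;

-- Hint form from the text, applied to aliases a and c of the example query.
-- Assumption: the hint comment is placed right after the SELECT keyword.
explain select /*+ hashjoin(a c) */ *
from dwr_rpo a
left join dim_day c on c.day_code = a.day_code;

-- Restore the default so later queries in this session are not affected.
reset enable_nestloop;
```

If the plan still shows Nested Loop Left Join, the join condition itself is the likely blocker rather than the planner settings.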

Why hash join sometimes cannot be used

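A hash join relies on equal values producing equal hash values. Each data type computes its hash differently, so when the two join columns have different types, values that compare as equal may hash differently. Such cross‑type comparisons cannot be hashed together.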

Performance gap illustration

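Nested‑loop join cost: using the row counts from the plan above, about 1.31 million outer rows × about 110 thousand inner rows ≈ 144 billion row comparisons.

Hash join cost: roughly 1.31 million probes, plus one pass over the inner rows to build the hash table.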

The difference explains why a hash join can finish in seconds while a nested‑loop join may take hours.

Why type conversion may still prevent hash join

Even when two types look similar, differences in precision, format, or time‑zone handling make them incompatible for hash comparison, so an implicit conversion between them does not necessarily make the join hashable.

Data types that do not support hash joins

select oprname, oprkind, oprcanhash,
       (select typname from pg_type where oid=oprleft)  as oprleft,
       (select typname from pg_type where oid=oprright) as oprright
from pg_operator
where oprname='=' and oprcanhash='f';

-- Sample result (partial)
oprname | oprkind | oprcanhash | oprleft     | oprright
--------+---------+------------+-------------+------------
=       | b       | f          | xid         | int8
=       | b       | f          | xid32       | int4
=       | b       | f          | date        | timestamp
=       | b       | f          | date        | timestamptz
=       | b       | f          | timestamp   | date
=       | b       | f          | timestamptz | date
=       | b       | f          | timestamp   | timestamptz
=       | b       | f          | timestamptz | timestamp
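To check a single pair of types instead of scanning the whole list, the same catalog can be filtered by operand type. This is a sketch that assumes the regtype cast is available, as it is in PostgreSQL-derived systems. The date and timestamp pair is only an example.

```sql
-- Look up the equality operator for one specific (left, right) type pair.
-- Assumption: 'typename'::regtype resolves a type name to its OID here.
select oprname,
       oprcanhash,
       oprleft::regtype  as left_type,
       oprright::regtype as right_type
from pg_operator
where oprname  = '='
  and oprleft  = 'date'::regtype
  and oprright = 'timestamp'::regtype;
```

A value of f in oprcanhash for the pair used in a join condition means that comparison cannot drive a hash join.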

In practice, the cases that matter are joins that mix date, timestamp, and timestamptz with one another: these cross‑type comparisons cannot use a hash join. The other non‑hashable combinations are rarely encountered.

Development tip: Keep the data types on both sides of a join as consistent and compatible as possible.
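One way to apply this tip to the earlier example is to convert one side explicitly so that both sides of the join condition have the same type. The sketch below assumes that dim_day.day_code stores dates as 'YYYYMMDD' strings, which the source does not state. Whether the planner then chooses a hash join should be confirmed with explain.

```sql
-- Convert the date column to the char(8) representation used by dim_day,
-- so the comparison is char(8) = char(8) rather than a cross-type one.
-- Assumption: dim_day.day_code holds values formatted as 'YYYYMMDD'.
explain select *
from dwr_rpo a
left join dim_day c
  on c.day_code = cast(to_char(a.day_code, 'YYYYMMDD') as char(8));
```

Where the schema can be changed, storing both columns with the same type from the start avoids a per-row conversion in every query.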

Why Oracle compatibility mode works but TD compatibility does not

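In TD compatibility mode, current_date is of type date, whereas in Oracle compatibility mode it is of type timestamp. The dwr_rpo.day_code column created from current_date - 1 therefore has a different type in each mode, and only the TD‑mode type leads to the non‑hashable comparison described above.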
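The difference can be observed directly by asking the database for the types involved. This is a sketch that assumes the pg_typeof function is available, as it is in PostgreSQL-derived systems. Run it once in a database created with each compatibility mode and compare the output.

```sql
-- Type returned by current_date in the current database's compatibility mode.
select pg_typeof(current_date);

-- Type of the column that CREATE TABLE AS derived from current_date - 1.
select pg_typeof(day_code) from dwr_rpo limit 1;
```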
