Why CTEs Can Kill SSIS Performance and How to Fix It
A developer hit a severe slowdown when an SSIS query joined local data to a linked-server query wrapped in a CTE. This article analyzes the CTE and Nested Loops behavior behind the slowdown and shows how materializing the remote data into a temporary table produced a dramatic performance boost.
Background
During SSIS development a query used a Common Table Expression (CTE) that referenced a remote linked server. The CTE was left‑joined to local data, causing severe performance degradation.
Problem
The execution plan showed the remote part of the query consuming about 99 % of the total cost, and the query took more than half an hour for ~70 000 rows.
CTE behavior in SQL Server
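A CTE is a named, statement-scoped query, much like an ad hoc view; it must be followed immediately by a SELECT, INSERT, UPDATE, DELETE, or MERGE and is visible only to that one statement, although that statement may reference it more than once.
The WITH clause defines a logical result set. SQL Server does not materialize a non-recursive CTE; the optimizer expands its definition into the outer query, so every reference can be evaluated again.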
CTE result sets have no indexes, constraints, or dedicated statistics unless the underlying tables provide them.
Because statistics are absent, the optimizer may produce inaccurate cardinality estimates.
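The scoping rule can be sketched with a small local example. The #Orders table and its columns are hypothetical sample data, chosen only so the snippet runs on any SQL Server instance without a linked server.

```sql
-- Hypothetical sample data; table and column names are illustrative only.
CREATE TABLE #Orders (
    OrderId    int            NOT NULL PRIMARY KEY,
    CustomerId int            NOT NULL,
    Amount     decimal(10, 2) NOT NULL
);
INSERT INTO #Orders (OrderId, CustomerId, Amount)
VALUES (1, 10, 25.00), (2, 10, 40.00), (3, 20, 15.50);

-- The CTE is only a named query definition for the statement that follows it.
WITH CustomerTotals AS (
    SELECT CustomerId, SUM(Amount) AS TotalAmount
    FROM #Orders
    GROUP BY CustomerId
)
SELECT CustomerId, TotalAmount
FROM CustomerTotals
WHERE TotalAmount > 20;

-- A later statement can no longer see the CTE. Uncommenting the next line
-- fails with error 208, "Invalid object name 'CustomerTotals'".
-- SELECT * FROM CustomerTotals;

DROP TABLE #Orders;
```

Any index or statistics the optimizer uses here belong to #Orders itself; the CTE contributes none of its own.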
Impact of Nested‑Loop join
The LEFT JOIN between the local table and the remote CTE was implemented with a Nested Loops operator. For each outer row the engine issued a separate request to the linked server, so the number of remote round trips grew linearly with the number of outer rows, and each round trip paid network and remote-execution latency.
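The source does not show the original slow query, so the following is only a sketch of the shape it likely had. Every object name here (LinkedServer, RemoteDB, RemoteTable, LocalTable, and the [Key] join column) is a placeholder, and the query needs a configured linked server to run.

```sql
-- Sketch of the slow pattern; all object names are hypothetical.
WITH RemoteData AS (
    -- Four-part name: these rows live on the linked server.
    SELECT [Key], Col1, Col2
    FROM [LinkedServer].RemoteDB.dbo.RemoteTable
)
SELECT l.*, r.Col1, r.Col2
FROM dbo.LocalTable AS l
LEFT JOIN RemoteData AS r
    ON l.[Key] = r.[Key];
-- If the optimizer chooses Nested Loops with RemoteData on the inner side,
-- the remote operator executes once per row of dbo.LocalTable.
```

In the actual execution plan, the "Number of Executions" property on the remote operator is one way to confirm whether this per-row behavior is occurring.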
Solution: materialize remote data
Create a temporary table (or table variable) and insert the remote result set into it.
Join the local data to the temporary table instead of the remote CTE.
-- Step 1: materialize remote data, including the join column
SELECT [Key], Col1, Col2
INTO #RemoteData
FROM OPENQUERY([LinkedServer], 'SELECT [Key], Col1, Col2 FROM RemoteDB.dbo.RemoteTable');
-- Step 2: join with local table (KEY is a reserved word, so it is bracketed)
SELECT l.*, r.Col1, r.Col2
FROM dbo.LocalTable AS l
LEFT JOIN #RemoteData AS r
    ON l.[Key] = r.[Key];
After the change, the execution plan shows a single scan of the temporary table feeding a hash or merge join, eliminating the costly per-row remote calls.
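One way to check the effect on a given system is to turn on the timing and I/O statistics around the join. This is a minimal sketch that assumes #RemoteData has already been populated in the same session and contains the join column; the table and column names are the same placeholders used above.

```sql
-- Report elapsed time and logical reads for each statement in the Messages output.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

SELECT l.*, r.Col1, r.Col2
FROM dbo.LocalTable AS l
LEFT JOIN #RemoteData AS r
    ON l.[Key] = r.[Key];

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Running the same wrapper around the original CTE version gives a like-for-like comparison of elapsed time.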
Performance improvement
Runtime dropped from more than 30 minutes to under 20 seconds, a speedup of at least 90×.
CTE vs. temporary table characteristics
Temporary tables are physical objects stored in tempdb and can be indexed.
They can have constraints (PRIMARY KEY, UNIQUE, CHECK).
Local temporary tables exist until the session that created them ends, or until the stored procedure that created them returns, and are then dropped automatically.
SQL Server automatically gathers statistics on temporary tables, improving cardinality estimates.
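These properties can be combined when the temporary table is declared up front. The sketch below is an assumption-laden variant of the solution: the column types are guesses, [Key] is assumed to be unique on the remote side, the index name IX_RemoteData_Col1 is invented for illustration, and the linked server must exist for the INSERT to run.

```sql
-- Hypothetical schema; adjust names and types to the real remote columns.
CREATE TABLE #RemoteData (
    [Key] int         NOT NULL PRIMARY KEY,  -- clustered index on the join column
    Col1  varchar(50) NULL,
    Col2  varchar(50) NULL
);

INSERT INTO #RemoteData ([Key], Col1, Col2)
SELECT [Key], Col1, Col2
FROM OPENQUERY([LinkedServer], 'SELECT [Key], Col1, Col2 FROM RemoteDB.dbo.RemoteTable');

-- Optional secondary index, useful only if later queries filter on Col1.
CREATE NONCLUSTERED INDEX IX_RemoteData_Col1 ON #RemoteData (Col1);

-- List the statistics objects SQL Server keeps for the temporary table.
SELECT s.name, s.auto_created
FROM tempdb.sys.stats AS s
WHERE s.object_id = OBJECT_ID('tempdb..#RemoteData');

DROP TABLE #RemoteData;
```

Declaring the primary key on the join column gives the optimizer both an index and accurate row counts for the join, neither of which a CTE can supply.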
When not to use CTEs
Large result sets that need to be reused multiple times.
Cross‑server queries or other long‑running operations.
Joins on large tables without supporting indexes.
Scenarios requiring repeated access to the same intermediate result.
Complex sub‑queries that benefit from explicit indexing or statistics.
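The reuse cases in this list can be illustrated with a small local sketch. The #Sales data and the #RegionTotals name are hypothetical; the point is only that the aggregate is computed once and then read by two separate statements.

```sql
-- Hypothetical sample data so the sketch runs without a linked server.
CREATE TABLE #Sales (
    Region varchar(20)    NOT NULL,
    Amount decimal(10, 2) NOT NULL
);
INSERT INTO #Sales (Region, Amount)
VALUES ('North', 100.00), ('North', 50.00), ('South', 80.00);

-- Compute the intermediate result once and store it.
SELECT Region, SUM(Amount) AS Total
INTO #RegionTotals
FROM #Sales
GROUP BY Region;

-- Statement 1 reads the stored rows.
SELECT Region, Total
FROM #RegionTotals
WHERE Total > 90;

-- Statement 2 reads them again; a CTE would have to be redefined here
-- and its query evaluated a second time.
SELECT MAX(Total) AS LargestRegionTotal
FROM #RegionTotals;

DROP TABLE #RegionTotals;
DROP TABLE #Sales;
```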
Common misconceptions
SQL Server does not support a MATERIALIZE hint; that hint belongs to other database engines.
CTEs are not universally slower; performance depends on the query shape and data volume.
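Non-recursive CTE results are not cached anywhere, neither in tempdb nor in memory; the definition is evaluated again wherever it is referenced, which is why repeated or remote access becomes expensive.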
Conclusion
CTEs in SQL Server act as inline, statement-scoped views with no indexes or statistics of their own. When combined with a Nested Loops join to a linked server, they can cause extreme performance problems. Materializing the remote data in a temporary table resolves the issue and provides a clear guideline for when to prefer temp tables over CTEs.