
How Parallel Query Engines Accelerate Cloud Databases with Cost‑Based Optimization

This article explains how cloud databases use cost‑based parallel query engines to overcome I/O bottlenecks. It covers how the optimizer selects tables for parallel scans, how joins and aggregations are parallelized, and how these techniques achieve linear speedups on the TPC‑H benchmark, highlighting the key techniques and architectural components.

Alibaba Cloud Developer

Background

As data volumes grow, SQL execution times grow with them, which strains traditional databases. Adding resources in the cloud does not necessarily help: if a query cannot use them, they stay under‑utilized and performance does not improve. Parallel query engines in cloud databases aim to fully exploit system resources to accelerate SQL execution.

How to Parallelize Queries

For OLAP‑style queries that scan data sets larger than memory, I/O becomes the bottleneck. Parallel I/O can be achieved by having multiple worker threads each read one partition or slice of the data, using either the table's existing partitions or round‑robin slicing. Parallel reading must be coupled with parallel processing; otherwise rows are read faster than they can be consumed and the read buffers fill up.
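The slicing idea can be sketched in a few lines of Python. This is only an illustration of the structure, not how any particular engine implements it: the function names, the toy table, and the use of a thread pool are all assumptions made for the example, and Python threads will not show a real speedup for CPU‑bound work. The point is that each worker both reads and processes its own slice, so only a small partial result travels back to the leader.

```python
from concurrent.futures import ThreadPoolExecutor


def round_robin_slices(rows, n_workers):
    """Slice i receives rows i, i + n, i + 2n, ... (round-robin slicing)."""
    return [rows[i::n_workers] for i in range(n_workers)]


def scan_slice(rows, lo, hi):
    """Worker: scan one slice, filtering and pre-aggregating as it reads."""
    total = 0
    count = 0
    for order_date, price in rows:
        if lo <= order_date <= hi:  # ISO dates compare correctly as strings
            total += price
            count += 1
    return total, count


def parallel_scan(rows, n_workers, lo, hi):
    slices = round_robin_slices(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda s: scan_slice(s, lo, hi), slices))
    # Leader: combine the small per-worker partial results.
    return sum(t for t, _ in partials), sum(c for _, c in partials)


# Toy "orders" table with hypothetical rows: (o_orderdate, o_totalprice)
orders = [
    ("1995-12-31", 10),
    ("1996-01-01", 20),
    ("1996-06-15", 30),
    ("1996-12-31", 40),
    ("1997-01-01", 50),
]

result = parallel_scan(orders, n_workers=2, lo="1996-01-01", hi="1996-12-31")
print(result)  # (90, 3)
```

The result does not depend on the number of workers, which is the property a parallel scan has to preserve.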

Choosing Tables for Parallel Scan

Cost‑based estimation selects candidate tables for parallel scan, often the largest fact tables. The optimizer must evaluate whether parallelism yields net benefit, avoiding cases where overhead exceeds gains.
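A deliberately simplified cost comparison shows the kind of decision involved. The cost constants, the formula, and the names `TableStats` and `choose_parallel_table` are hypothetical and are not taken from any real optimizer; real cost models account for far more (I/O, selectivity, data exchange between workers).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TableStats:
    name: str
    rows: int


# Hypothetical cost constants, chosen only to make the trade-off visible.
COST_PER_ROW = 1.0            # cost of scanning one row
WORKER_STARTUP_COST = 5_000.0  # fixed overhead of starting one worker


def serial_cost(table: TableStats) -> float:
    return table.rows * COST_PER_ROW


def parallel_cost(table: TableStats, dop: int) -> float:
    # The scan work is divided across workers, but each worker adds overhead.
    return table.rows * COST_PER_ROW / dop + dop * WORKER_STARTUP_COST


def choose_parallel_table(tables: List[TableStats], dop: int) -> Optional[TableStats]:
    """Return the table with the largest net gain, or None if parallelism never pays off."""
    best, best_gain = None, 0.0
    for table in tables:
        gain = serial_cost(table) - parallel_cost(table, dop)
        if gain > best_gain:
            best, best_gain = table, gain
    return best


tables = [TableStats("customer", 150_000), TableStats("orders", 1_500_000)]
choice = choose_parallel_table(tables, dop=8)
print(choice.name)  # orders

# A tiny table is not worth the worker startup overhead.
print(choose_parallel_table([TableStats("nation", 25)], dop=8))  # None
```

Under these assumed constants the large table wins because its scan cost dwarfs the startup overhead, while for the 25‑row table the overhead exceeds any possible gain.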

Parallel Scan of Multiple Tables and Parallel JOIN

When multiple large fact tables are involved, each can be scanned in parallel by a pool of workers. Parallel HASH JOIN can be realized either by partitioning both tables on the hash key (so matching rows reside in the same worker) or by building a shared hash table and having each worker probe its slice. The optimizer chooses the method based on cost.
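Both join strategies can be sketched side by side. In this illustration the workers are simulated with ordinary loops rather than real threads, and the toy tables, column positions, and function names are assumptions made for the example. What matters is where the hash table lives: one private table per worker in the partitioned variant, one shared table in the other.

```python
def build_hash_table(rows, key_index):
    table = {}
    for row in rows:
        table.setdefault(row[key_index], []).append(row)
    return table


def partition_by_key(rows, key_index, n_workers):
    """Rows with the same join key always land in the same partition."""
    parts = [[] for _ in range(n_workers)]
    for row in rows:
        parts[hash(row[key_index]) % n_workers].append(row)
    return parts


def partitioned_hash_join(build_rows, probe_rows, build_key, probe_key, n_workers):
    """Strategy 1: partition both inputs on the join key; each worker joins its own pair."""
    build_parts = partition_by_key(build_rows, build_key, n_workers)
    probe_parts = partition_by_key(probe_rows, probe_key, n_workers)
    out = []
    for b_part, p_part in zip(build_parts, probe_parts):  # one iteration = one worker
        table = build_hash_table(b_part, build_key)
        for p in p_part:
            out.extend(b + p for b in table.get(p[probe_key], []))
    return out


def shared_table_hash_join(build_rows, probe_rows, build_key, probe_key, n_workers):
    """Strategy 2: build one shared hash table; each worker probes it with its slice."""
    table = build_hash_table(build_rows, build_key)
    out = []
    for w in range(n_workers):  # one iteration = one worker
        for p in probe_rows[w::n_workers]:
            out.extend(b + p for b in table.get(p[probe_key], []))
    return out


# Hypothetical toy data: customer(c_custkey, c_name), orders(o_orderkey, o_custkey, o_totalprice)
customers = [(1, "Alice"), (2, "Bob"), (3, "Carol")]
orders = [(100, 1, 50), (101, 2, 70), (102, 1, 30), (103, 4, 10)]

joined_a = partitioned_hash_join(customers, orders, 0, 1, n_workers=2)
joined_b = shared_table_hash_join(customers, orders, 0, 1, n_workers=2)
print(sorted(joined_a) == sorted(joined_b))  # True
```

The two strategies return the same rows; they differ in cost. Partitioning pays for redistributing both inputs, while the shared table avoids redistribution but requires every worker to have access to the full build side.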

Parallelizing Complex Operators

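After joins, GROUP BY, ORDER BY and LIMIT account for most of the remaining cost. Repartitioning the join results on the GROUP BY key places all rows of a group on the same worker, so each worker can complete its GROUP BY locally and a collector only has to merge the per‑worker results. LIMIT can also be pushed down to the workers, so that each worker returns at most the requested number of rows, which reduces data transfer.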
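The following sketch illustrates repartitioning with a pushed‑down LIMIT for a query of the form `GROUP BY key ORDER BY s DESC LIMIT n`. It is an assumption‑laden toy: workers are simulated sequentially, the function names are hypothetical, and the input stands in for already‑joined rows. Because each group is complete on exactly one worker, the global top n must be contained in the union of the local top n lists, so the collector only needs to merge at most `n_workers * n` rows.

```python
import heapq
from collections import defaultdict


def repartition(rows, n_workers):
    """Send every row of a given group key to the same worker."""
    parts = [[] for _ in range(n_workers)]
    for key, value in rows:
        parts[hash(key) % n_workers].append((key, value))
    return parts


def local_group_by_top_n(rows, limit):
    """Worker: aggregate complete groups, then apply ORDER BY ... DESC LIMIT locally."""
    sums = defaultdict(int)
    for key, value in rows:
        sums[key] += value
    return heapq.nlargest(limit, sums.items(), key=lambda kv: kv[1])


def parallel_group_by_top_n(rows, n_workers, limit):
    candidates = []
    for part in repartition(rows, n_workers):  # one iteration = one worker
        candidates.extend(local_group_by_top_n(part, limit))
    # Collector: merge the small candidate lists and apply the final LIMIT.
    return heapq.nlargest(limit, candidates, key=lambda kv: kv[1])


# Hypothetical join output: (c_name, o_totalprice)
joined = [
    ("Alice", 50), ("Bob", 70), ("Alice", 30),
    ("Carol", 20), ("Bob", 5), ("Dave", 60),
]

top2 = parallel_group_by_top_n(joined, n_workers=3, limit=2)
print(top2)  # [('Alice', 80), ('Bob', 75)]
```

Note that the pushdown is only safe here because of the repartitioning step: if the rows of one group were spread over several workers, a local top n computed on partial sums could discard a group that belongs in the global result.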

Linear Acceleration on TPC‑H

Benchmarks on the TPC‑H suite show that 100% of queries benefit from parallel execution: 70% achieve more than an 8× speedup, the overall acceleration is close to 13×, and some queries (Q6, Q12) exceed 32×.

Conclusion

Optimizers are the core of database performance; adding cost‑based parallelism to existing optimizers, as done in PolarDB, yields substantial speedups while maintaining stability.

-- Total order value per customer for orders placed in 1996; top 10 customers.
SELECT c.c_name, SUM(o.o_totalprice) AS s
FROM customer c, orders o
WHERE c.c_custkey = o.o_custkey
  AND o.o_orderdate >= '1996-01-01'
  AND o.o_orderdate <= '1996-12-31'
GROUP BY c.c_name
ORDER BY s DESC
LIMIT 10;

-- Total line-item price per customer key; the 10 smallest totals.
SELECT o.o_custkey, SUM(l.l_extendedprice) AS s
FROM orders o, lineitem l
WHERE o.o_orderkey = l.l_orderkey
GROUP BY o.o_custkey
ORDER BY s
LIMIT 10;