Mastering Oracle Parallel Query: How It Works and When to Use It

This article explains Oracle's parallel query feature, covering its benefits, resource costs, required conditions, various data‑distribution methods such as broadcast, replicate and hash, how to read parallel execution plans, and practical monitoring techniques to avoid performance pitfalls.

dbaplus Community

What Is Parallel Query?

Oracle Enterprise Edition provides a powerful parallel query capability that allows a single SQL statement to use multiple server processes (PX slaves) to produce the result set, coordinated by the session's own process, the query coordinator (QC). This lets one statement make fuller use of CPU, I/O and memory resources.

Costs and Considerations

Allocating PX slaves takes a small amount of time. If no idle PX slaves are available, Oracle must spawn new processes, and the session waits on the os thread startup event.

The QC process must assign work to each PX slave, which also consumes time.

If the query returns a large result set, the single QC process can become a bottleneck.

With a DOP of 4, a plan that needs both a producer set and a consumer set employs 8 PX slaves plus the QC, so nine server processes are used.

On Exadata, even serial queries are already processed in parallel at the I/O layer, so the additional benefit of parallel query is smaller.
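The process count mentioned above follows from simple arithmetic. The sketch below assumes a plan with two slave sets (producers and consumers), which is the case described in this article; the function name is illustrative and is not an Oracle API.

```python
def px_process_count(dop: int, slave_sets: int = 2) -> int:
    """Server processes used by one parallel statement: all PX slaves plus the QC.

    slave_sets is 1 when only one set of slaves is needed and 2 when the plan
    uses both a producer set and a consumer set (assumption for illustration).
    """
    if dop < 1 or slave_sets not in (1, 2):
        raise ValueError("dop must be >= 1 and slave_sets must be 1 or 2")
    return dop * slave_sets + 1  # +1 for the query coordinator


print(px_process_count(4))     # 9: 8 PX slaves plus the QC
print(px_process_count(4, 1))  # 5: 4 PX slaves plus the QC
```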

Requirements for Effective Parallelism

The execution plan must be well optimized; a poor plan limits parallel gains.

Sufficient system resources (CPU, I/O, memory) must be available.

Workload should be evenly distributed among PX slaves to avoid skew.

Single‑Table Parallel Example

A simple aggregate query on table test with DOP=2 shows a PX BLOCK ITERATOR operation, which splits the table into ranges by ROWID or by partition. Each PX slave scans its ranges and aggregates locally, then sends its partial result to the QC for final aggregation.

SQL Monitoring visualizes the PX slaves (blue) doing the heavy work and the QC (row source ID 3) performing the final aggregation.
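The scan-and-aggregate flow of the single-table case can be imitated outside the database. This is a toy model only: the table is a Python list, threads stand in for PX slaves, and each slave gets exactly one contiguous range, whereas Oracle works with many smaller ranges. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_ranges(n_rows, dop):
    """Split positions [0, n_rows) into dop contiguous ranges, one per slave."""
    size, extra = divmod(n_rows, dop)
    ranges, start = [], 0
    for i in range(dop):
        end = start + size + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def slave_partial_sum(table, row_range):
    """What one PX slave does: scan its own range and aggregate locally."""
    start, end = row_range
    return sum(table[start:end])


def parallel_sum(table, dop):
    """The QC hands out ranges, collects partial results and aggregates them."""
    ranges = split_into_ranges(len(table), dop)
    with ThreadPoolExecutor(max_workers=dop) as pool:
        partials = list(pool.map(lambda r: slave_partial_sum(table, r), ranges))
    return sum(partials), partials


total, partials = parallel_sum(list(range(1, 11)), dop=2)
print(total, partials)  # 55 [15, 40]
```

The QC only adds up two numbers here, which is why the final aggregation step is cheap compared with the scans.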

Table Queue and Producer‑Consumer Model

Parallel execution follows a producer‑consumer model: one set of PX slaves (producers) scans data and writes it to a table queue; another set (consumers) reads from the queue, processes the data, and may send results back to the QC.

Broadcast Distribution

In broadcast, each producer PX slave scans the left‑hand table, writes the rows to a queue, and the data is broadcast to every consumer PX slave. Consumers therefore hold a full copy of the left table, eliminating the need for further distribution of the right‑hand table.
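A small simulation may make the broadcast pattern easier to picture. It is a sketch under simplifying assumptions: rows are (key, payload) tuples, the table queue is modeled as one list per consumer, and slices of a list stand in for the block ranges each slave scans. None of the names come from Oracle.

```python
def broadcast_join(left_rows, right_rows, dop):
    """Simulate a hash join with broadcast distribution of the left input."""
    # Producers: each scans a slice of the left table and sends every row
    # to every consumer through the (simulated) table queue.
    consumer_left = [[] for _ in range(dop)]
    for producer in range(dop):
        for row in left_rows[producer::dop]:
            for consumer in range(dop):
                consumer_left[consumer].append(row)

    # Consumers: build a hash table from the full left copy, then probe it
    # with their own slice of the right table (no redistribution needed).
    result = []
    for consumer in range(dop):
        hash_table = {}
        for key, payload in consumer_left[consumer]:
            hash_table.setdefault(key, []).append(payload)
        for key, payload in right_rows[consumer::dop]:
            for left_payload in hash_table.get(key, []):
                result.append((key, left_payload, payload))
    return result, consumer_left


left = [(1, "a"), (2, "b")]
right = [(1, "x"), (2, "y"), (1, "z"), (3, "w")]
rows, copies = broadcast_join(left, right, dop=2)
print(sorted(rows))              # [(1, 'a', 'x'), (1, 'a', 'z'), (2, 'b', 'y')]
print([len(c) for c in copies])  # [2, 2]: every consumer holds the whole left table
```

The second line of output shows the cost of broadcast: memory for the left table is multiplied by the DOP.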

Replicate Distribution

Replicate works like broadcast but without a table queue: every PX slave scans the entire left‑hand table itself and builds a complete hash table. The right‑hand table then needs no distribution either, because each slave probes its hash table with only the portion of the right‑hand table that it scans.

Hash Distribution

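When both tables are large, Oracle uses hash distribution. The producer PX slaves apply a hash function to the join key of each left‑table row and write the row to the queue of the matching consumer, so each consumer PX slave receives only the rows belonging to its hash bucket and builds a partial hash table. The right‑hand table is distributed with the same hash function, which guarantees that matching rows meet on the same consumer. This reduces memory pressure compared with broadcast.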
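Hash distribution can be sketched in a few lines. The example assumes integer join keys and uses Python's built-in hash modulo the DOP as a stand-in for Oracle's internal hash function; the function names are hypothetical.

```python
def hash_distribute(rows, dop):
    """Route each (key, payload) row to the consumer that owns its hash bucket."""
    buckets = [[] for _ in range(dop)]
    for key, payload in rows:
        buckets[hash(key) % dop].append((key, payload))
    return buckets


def hash_distributed_join(left_rows, right_rows, dop):
    """Simulate a hash join in which both inputs are hash-distributed on the key."""
    left_buckets = hash_distribute(left_rows, dop)
    right_buckets = hash_distribute(right_rows, dop)
    result = []
    for consumer in range(dop):
        # Each consumer builds a hash table from its part of the left table only.
        partial = {}
        for key, payload in left_buckets[consumer]:
            partial.setdefault(key, []).append(payload)
        # Matching right rows were routed to the same consumer by the same hash.
        for key, payload in right_buckets[consumer]:
            for left_payload in partial.get(key, []):
                result.append((key, left_payload, payload))
    return result, left_buckets


left = [(1, "a"), (2, "b")]
right = [(1, "x"), (2, "y"), (1, "z"), (3, "w")]
rows, left_buckets = hash_distributed_join(left, right, dop=2)
print(sorted(rows))                    # [(1, 'a', 'x'), (1, 'a', 'z'), (2, 'b', 'y')]
print([len(b) for b in left_buckets])  # [1, 1]: each left row is held exactly once
```

If most rows share one key value, a single bucket receives nearly everything, which is the skew that evenly distributed work is meant to avoid.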

Hybrid‑Hash Adaptive Distribution

Oracle 12c can choose between broadcast and hash at run time, based on the rows counted by a statistics collector placed in front of the distribution step. If the number of rows actually produced by the left‑hand input is less than 2 × DOP, broadcast is chosen; otherwise hash distribution is used.
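The threshold rule can be written as a one-line decision. This sketch assumes the value being compared is the row count seen by the statistics collector; the function name and the returned labels are illustrative only.

```python
def choose_distribution(rows_seen: int, dop: int) -> str:
    """Pick the distribution method using the 2 x DOP threshold."""
    return "BROADCAST" if rows_seen < 2 * dop else "HASH"


print(choose_distribution(rows_seen=7, dop=4))  # BROADCAST (7 < 8)
print(choose_distribution(rows_seen=8, dop=4))  # HASH (8 is not below 8)
```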

Reading Parallel Execution Plans

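To follow a parallel plan, first set aside the PX‑related operations (such as PX SEND, PX RECEIVE and PX COORDINATOR), then read the remaining operations in the order in which the table queues are created, starting with the lowest queue ID (for example :TQ10000 before :TQ10001). This reveals the true execution sequence of producers and consumers.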

Monitoring with V$PQ_TQSTAT

The view V$PQ_TQSTAT is populated only after a parallel query finishes, and it must be queried from the same session that ran the statement. It shows how many rows each PX slave wrote to or read from each table queue, which helps detect skew and verify that work was evenly distributed.
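Once the per-process row counts have been fetched, skew is easy to quantify. The sketch below works on tuples shaped like a subset of the view's columns (TQ_ID, SERVER_TYPE, PROCESS, NUM_ROWS); the sample figures are invented for illustration, and the max-to-average ratio is just one possible skew measure, not something the view reports.

```python
from collections import defaultdict


def skew_by_queue(tqstat_rows):
    """Return {(tq_id, server_type): max_rows / avg_rows} for each group.

    A ratio near 1.0 means the work was spread evenly; a ratio near the
    number of processes means one process handled almost everything.
    """
    groups = defaultdict(list)
    for tq_id, server_type, _process, num_rows in tqstat_rows:
        groups[(tq_id, server_type)].append(num_rows)
    ratios = {}
    for group, counts in groups.items():
        avg = sum(counts) / len(counts)
        ratios[group] = max(counts) / avg if avg else 1.0
    return ratios


# Hypothetical sample: producers are balanced, consumers are skewed.
sample = [
    (0, "Producer", "P002", 500),
    (0, "Producer", "P003", 500),
    (0, "Consumer", "P000", 900),
    (0, "Consumer", "P001", 100),
]
for group, ratio in sorted(skew_by_queue(sample).items()):
    print(group, ratio)
# (0, 'Consumer') 1.8
# (0, 'Producer') 1.0
```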

Parallel Degree Degradation and Monitoring

Even when a query requests a high DOP, the actual degree may be reduced because too few parallel processes are free or because Resource Manager limits apply. Real‑Time SQL Monitoring has been available since Oracle 11g; from Oracle 12.1 its report displays requested vs. actual DOP, the number of allocated PX slaves, and the percentage of degradation.
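The degradation percentage is the share of the requested degree that was not granted. A minimal sketch, assuming the percentage is computed from requested and actual DOP in the obvious way; the function name is hypothetical.

```python
def dop_downgrade_pct(requested_dop: int, actual_dop: int) -> float:
    """Percentage by which the actual DOP falls short of the requested DOP."""
    if requested_dop < 1 or not 0 <= actual_dop <= requested_dop:
        raise ValueError("expected 0 <= actual_dop <= requested_dop, requested_dop >= 1")
    return 100.0 * (requested_dop - actual_dop) / requested_dop


print(dop_downgrade_pct(8, 2))  # 75.0
print(dop_downgrade_pct(8, 8))  # 0.0
```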

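To avoid degradation, enable automatic degree of parallelism with parallel statement queuing (PARALLEL_DEGREE_POLICY = AUTO) and, if needed, use a hint to request a specific degree. A statement that cannot obtain the PX slaves it needs is then queued until enough become available, instead of running at a reduced degree.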

Key Takeaways

Parallel query can dramatically reduce execution time but only when system resources are sufficient and the plan is well‑tuned.

Choose the appropriate distribution method (broadcast, replicate, hash, or hybrid‑hash) based on table sizes and expected result cardinality.

Use SQL Monitoring and V$PQ_TQSTAT to verify that work is evenly split and to diagnose bottlenecks such as QC contention or queue overflow.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
