
Analysis of Spark Out‑Of‑Memory (OOM) Issues and Tuning Strategies

This article explains Spark's memory model, identifies common driver and executor OOM scenarios, and provides detailed configuration and code‑level recommendations—including memory‑related parameters and shuffle‑tuning options—to prevent and resolve out‑of‑memory failures in Spark applications.


Spark divides each executor's memory into three regions: execution (used for joins, aggregations, and shuffle buffering), storage (holds broadcast variables, cached, and persisted data), and other (reserved for internal objects and user code). OOM typically originates in the execution region; when the storage region fills up, Spark simply evicts old cached blocks rather than throwing an OOM.
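As a rough illustration, under the legacy (pre-2.0) memory model these regions map onto configuration fractions; the property names below assume that legacy model (newer Spark versions replace them with spark.memory.fraction and spark.memory.storageFraction), and the values are only illustrative:

```properties
# spark-defaults.conf sketch (legacy memory model; values are illustrative)
spark.shuffle.memoryFraction   0.2   # execution: shuffle/aggregation buffers
spark.storage.memoryFraction   0.6   # storage: cache, broadcast, persisted data
# the remainder (~0.2 here) is left for "other" internal and user objects
```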

Common OOM scenarios in Spark are:

Map‑stage memory overflow

Shuffle‑stage memory overflow

Driver memory overflow

Both map and shuffle overflows happen on executors, whereas driver overflow occurs on the driver process.

Driver Heap OOM

1. Creating large objects on the driver (e.g., huge collections). Solution: Build the data on the executors instead — read it with sc.textFile, sc.hadoopFile, etc., rather than constructing it locally and calling sc.parallelize — or increase driver-memory after estimating the object's size.
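For example, a minimal sketch of moving data construction off the driver (assumes an existing SparkContext `sc`; the paths are illustrative):

```scala
// Risky: builds the whole dataset in the driver heap, then ships it to executors.
// val lines = scala.io.Source.fromFile("/data/big.txt").getLines().toList
// val rdd   = sc.parallelize(lines)

// Preferred: executors read their own partitions directly from distributed storage,
// so the dataset never has to fit in the driver JVM.
val rdd = sc.textFile("hdfs:///data/big.txt")
```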

2. Collecting large results from executors to the driver (e.g., via collect). The total data returned must stay below spark.driver.maxResultSize (default 1 GB), and the deserialized footprint on the driver can be 2‑5× the serialized size. Solution: Avoid large collect operations; transform or aggregate the data on executors instead, or increase driver memory if a large collect is unavoidable.
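A hedged sketch of avoiding a large collect (again assuming a SparkContext `sc`; paths and names are illustrative):

```scala
val logs = sc.textFile("hdfs:///logs/input")

// Risky: pulls every matching row into the driver,
// capped by spark.driver.maxResultSize but still heavy after deserialization.
// val errors = logs.filter(_.contains("ERROR")).collect()

// Safer: reduce on the executors and bring back only a small summary...
val errorCount = logs.filter(_.contains("ERROR")).count()

// ...or write the full result out from the executors directly.
logs.filter(_.contains("ERROR")).saveAsTextFile("hdfs:///logs/errors")
```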

3. Spark UI and internal bookkeeping consuming memory. Solution: Reduce the number of partitions for large stages, adjust spark.ui.retainedStages and spark.ui.retainedJobs, or increase memory if the above are insufficient.
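For instance, a spark-defaults.conf fragment trimming UI bookkeeping (the values are illustrative, not recommendations):

```properties
spark.ui.retainedJobs     100   # default 1000
spark.ui.retainedStages   100   # default 1000
```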

Executor Heap OOM

Typical causes include:

Excessive object creation during map tasks

Data skew leading to uneven memory pressure

Improper use of coalesce (collapsing to very few partitions forces each remaining task to process far more data)

Shuffle‑stage memory overflow

Specific cases:

Reduce‑stage OOM: Reduce tasks pull data from map outputs and aggregate it in a dedicated aggregation buffer (by default about 20 % of executor memory). Solution: Increase the aggregation memory fraction, enlarge executor memory (e.g., --executor-memory 5G), or lower the amount of data pulled per request by reducing spark.reducer.maxSizeInFlight (e.g., to 24m).
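The remedies above can be combined on the command line, e.g. (a sketch; the jar name and values are placeholders):

```shell
spark-submit \
  --executor-memory 5G \
  --conf spark.shuffle.memoryFraction=0.3 \
  --conf spark.reducer.maxSizeInFlight=24m \
  app.jar
```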

Shuffle file not found / executor lost : Occurs when shuffle files disappear or executors crash due to insufficient memory. Solution: Increase executor memory, raise off‑heap memory via spark.executor.memoryOverhead, and ensure stable network conditions.

Off‑heap memory issues: Insufficient off‑heap allocation can cause BlockManager failures and lost connections. Solution: Increase spark.executor.memoryOverhead (default: the larger of 10 % of executor memory and 384 MB) to at least 1 GB, or more depending on workload.
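A minimal sketch of raising the off‑heap overhead (note that in Spark releases before 2.3 the property was spark.yarn.executor.memoryOverhead; the jar name and value are placeholders):

```shell
spark-submit \
  --executor-memory 4G \
  --conf spark.executor.memoryOverhead=2048 \
  app.jar
# memoryOverhead is in MiB, so 2048 = 2 GB of off-heap headroom
```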

Adjustable Shuffle Parameters

spark.shuffle.file.buffer (default 32k): Buffer size for each shuffle write stream. Increase to 64k when memory permits to reduce disk spills.

spark.reducer.maxSizeInFlight (default 48m): Buffer size for shuffle read tasks, controlling how much data each fetch request pulls. Raising it to 96m can reduce network round trips and improve performance.

spark.shuffle.io.maxRetries (default 3): Maximum retry attempts for failed shuffle reads. For large, long‑running jobs, increasing it to around 60 improves stability.

spark.shuffle.io.retryWait (default 5s): Wait time between retry attempts. Extending it to 60s can further stabilize shuffle operations.

spark.shuffle.memoryFraction (default 0.2): Portion of executor memory allocated to shuffle read aggregation. Increase it when memory is abundant to reduce disk I/O.

spark.shuffle.manager (default sort): Choose between hash, sort, and tungsten‑sort. Use hash together with spark.shuffle.consolidateFiles=true for jobs with many small shuffle tasks to cut I/O.

spark.shuffle.sort.bypassMergeThreshold (default 200): When the number of shuffle read tasks is below this threshold, the sort shuffle skips sorting. Raising the value can avoid unnecessary sorting overhead.

spark.shuffle.consolidateFiles (default false): When true with the hash manager, merges the many small shuffle output files, reducing disk I/O for stages with many shuffle reads.
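Putting several of these together, a hedged spark-defaults.conf sketch for a shuffle‑heavy job (values are illustrative; note that spark.shuffle.memoryFraction and spark.shuffle.consolidateFiles apply only to older Spark releases):

```properties
spark.shuffle.file.buffer               64k
spark.reducer.maxSizeInFlight           96m
spark.shuffle.io.maxRetries             60
spark.shuffle.io.retryWait              60s
spark.shuffle.memoryFraction            0.3
spark.shuffle.sort.bypassMergeThreshold 400
```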

Tags: memory management, Performance Tuning, OOM, Spark
Written by

Big Data Technology Architecture

Exploring Open Source Big Data and AI Technologies
