Understanding Spark Executor Memory Management and the Unified Memory Model
This article explains Spark's executor memory layout under the UnifiedMemoryManager: the on-heap and off-heap divisions, the four on-heap memory regions, the default fraction settings, and how storage and execution memory share space. It also gives heuristics and tuning tips for avoiding OOM errors and improving performance.
Spark leverages memory efficiently for distributed computation, and its memory management subsystem is crucial for performance tuning and troubleshooting. An application runs with a Driver JVM (control node) and one or more Executor JVMs (worker nodes). The article focuses on the memory layout of the Executor, which is the primary target for optimization.
Memory manager evolution: Before Spark 1.6, executor memory was managed by StaticMemoryManager. Since Spark 1.6 the default is UnifiedMemoryManager; setting spark.memory.useLegacyMode=true falls back to the static manager (the option was removed in Spark 3.0).
Executor memory layout (default, no off‑heap):
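On YARN, the total memory of an executor container (called MonitorMemory here) cannot exceed yarn.scheduler.maximum-allocation-mb. It is divided into two parts:
Overhead memory (outside the JVM heap) – size defined by spark.yarn.executor.memoryOverhead, renamed spark.executor.memoryOverhead in Spark 2.3 (default executorMemory × 0.10, minimum 384 MB). Used for JVM overhead, NIO buffers, etc. This is separate from the Spark-managed off-heap pool enabled by spark.memory.offHeap.enabled.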
On‑heap memory (ExecutorMemory) – configured with --executor-memory or spark.executor.memory (JVM -Xmx).
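The container sizing rule above can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, sizes are in MiB, and the default factor and minimum are the values quoted above.

```python
def container_memory_mib(executor_memory_mib, overhead_factor=0.10, min_overhead_mib=384):
    """Return (overhead, total) in MiB for one executor container.

    overhead = max(executorMemory * factor, minimum), as described in the text.
    """
    overhead = max(int(executor_memory_mib * overhead_factor), min_overhead_mib)
    return overhead, executor_memory_mib + overhead


def fits_in_yarn(executor_memory_mib, yarn_max_allocation_mib):
    """True if heap plus overhead stays within yarn.scheduler.maximum-allocation-mb."""
    _, total = container_memory_mib(executor_memory_mib)
    return total <= yarn_max_allocation_mib


if __name__ == "__main__":
    for heap in (2 * 1024, 18 * 1024):
        overhead, total = container_memory_mib(heap)
        print(f"heap={heap} MiB overhead={overhead} MiB container={total} MiB")
```

For small executors the 384 MB floor dominates, so the overhead is proportionally much larger than 10 %.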
When UnifiedMemoryManager is active, the on‑heap memory is logically split into four regions:
Execution Memory – temporary data for shuffle, join, sort, aggregation.
Storage Memory – cached RDDs, broadcast variables, unrolled data.
User Memory – user-defined data structures and Spark internal metadata such as RDD lineage.
Reserved Memory – fixed 300 MB reserved for Spark internal objects.
The usable memory is calculated as: usableMemory = executorMemory - reservedMemory and then distributed using the following fractions (default values in Spark 2+):
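spark.memory.fraction = 0.6
spark.memory.storageFraction = 0.5
Thus, StorageMemory = usableMemory × spark.memory.fraction × spark.memory.storageFraction (30 % of usable memory), ExecutionMemory receives the other half of the unified pool (also 30 % by default), and the remaining usableMemory × (1 - spark.memory.fraction) is User Memory. Storage and execution can dynamically borrow from each other when one runs out of space. Execution can evict cached blocks to reclaim space that storage has borrowed, but storage cannot evict execution memory.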
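As a minimal sketch of the split described above, the following Python computes the initial size of each on-heap region from the heap size and the two fractions. The function name and the 4 GiB heap are illustrative assumptions, and the result ignores runtime borrowing between storage and execution.

```python
RESERVED_BYTES = 300 * 1024 * 1024  # fixed Reserved Memory (300 MB)


def on_heap_regions(system_memory_bytes, memory_fraction=0.6, storage_fraction=0.5):
    """Return the initial size in bytes of the four on-heap regions."""
    usable = system_memory_bytes - RESERVED_BYTES
    unified = usable * memory_fraction      # storage + execution pool
    storage = unified * storage_fraction
    return {
        "reserved": RESERVED_BYTES,
        "user": usable - unified,           # usable * (1 - memory_fraction)
        "storage": storage,
        "execution": unified - storage,
    }


if __name__ == "__main__":
    # 4 GiB heap chosen purely for illustration.
    for name, size in on_heap_regions(4 * 1024**3).items():
        print(f"{name:>9}: {size / 1024**2:8.1f} MiB")
```

The four values always sum back to the heap size, which is a quick sanity check when experimenting with other fractions.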
Off‑heap memory support (Spark 1.6+): enable with spark.memory.offHeap.enabled=true and set size via spark.memory.offHeap.size. When enabled, total storage memory becomes the sum of on‑heap and off‑heap storage capacities.
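For reference, the two settings can be passed to spark-submit as --conf arguments. The helper below is a hypothetical convenience function, not part of Spark; it only assembles the argument list for the two configuration keys named above.

```python
def off_heap_conf_args(size):
    """Build spark-submit arguments enabling off-heap memory of the given size, e.g. "10g"."""
    settings = {
        "spark.memory.offHeap.enabled": "true",
        "spark.memory.offHeap.size": size,
    }
    args = []
    for key, value in settings.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(off_heap_conf_args("10g")))
```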
Heuristics and tuning tips:
Executor JVM Used Memory Heuristic: If the configured executor memory far exceeds the JVM's peak usage, reduce spark.executor.memory.
Unified Memory Heuristic: If the unified memory (storage + execution) is over-provisioned, lower spark.memory.fraction.
OOM handling: Give each task more memory by raising spark.executor.memory or lowering spark.executor.cores (fewer concurrent tasks per executor), or optimize application logic (e.g., replace groupByKey with reduceByKey).
Execution Memory Spill: When execution memory overflows, data is spilled to disk; address this by increasing memory or reducing the data size per task.
GC tuning: Use the JVM flags -verbose:gc, -XX:+PrintGCDetails, -XX:+PrintGCTimeStamps (on JDK 9 and later, unified logging via -Xlog:gc* replaces the Print flags) and refer to Spark's GC tuning guide.
YARN container kill: If total container memory (including child processes) exceeds the allocated limit, YARN kills the executor. Increase spark.yarn.executor.memoryOverhead (spark.executor.memoryOverhead since Spark 2.3) or reduce the number of cores/tasks.
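The first two heuristics amount to comparing peak usage with configured capacity. A rough sketch follows, assuming the peak figures are collected elsewhere (for example from executor metrics); the function name and the 0.5 threshold are illustrative assumptions, not Spark defaults.

```python
def memory_heuristics(configured_heap, peak_jvm_used,
                      unified_capacity, peak_unified_used,
                      low_usage_ratio=0.5):
    """Return tuning hints for an executor; all sizes must share the same unit.

    low_usage_ratio is an illustrative threshold chosen for this sketch.
    """
    hints = []
    if peak_jvm_used < configured_heap * low_usage_ratio:
        hints.append("reduce spark.executor.memory")
    if peak_unified_used < unified_capacity * low_usage_ratio:
        hints.append("lower spark.memory.fraction")
    return hints


if __name__ == "__main__":
    # Hypothetical executor: 18 GB heap, 6 GB peak; 10 GB unified pool, 2 GB peak.
    print(memory_heuristics(18, 6, 10, 2))
```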
Example calculations (executor‑memory = 18 GB, off‑heap disabled):
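systemMemory = 18 GB = 19327352832 bytes
reservedMemory = 300 MB = 314572800 bytes
usableMemory = systemMemory - reservedMemory = 19012780032 bytes
Unified memory (storage + execution, the figure the Spark UI reports as Storage Memory) = usableMemory × 0.6 = 11407668019.2 bytes ≈ 11.4 GB in decimal units (10.6 GiB)
The Spark UI, which uses decimal GB, usually shows less than this (≈ 10.1 GB in the source's example), because Spark takes systemMemory from Runtime.getRuntime.maxMemory, which is typically smaller than -Xmx. A reported value of 17179869184 bytes (16 GiB) would give (17179869184 - 314572800) × 0.6 = 10119177830.4 bytes ≈ 10.1 GB.
When off-heap is enabled (10 GB = 10737418240 bytes off-heap, same on-heap settings), the total becomes 10119177830.4 + 10737418240 bytes ≈ 20.9 GB.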
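The arithmetic can be rechecked with a short script that prints each result in both decimal GB and binary GiB. It assumes systemMemory equals the full 18 GB heap; on a real executor Spark reads the heap size from Runtime.getRuntime.maxMemory, which is usually lower, so the figure in the UI may be smaller. The function name is hypothetical.

```python
GIB = 1024**3
MIB = 1024**2


def unified_bytes(system_memory, memory_fraction=0.6, reserved=300 * MIB):
    """On-heap unified (storage + execution) memory in bytes."""
    return (system_memory - reserved) * memory_fraction


if __name__ == "__main__":
    on_heap = unified_bytes(18 * GIB)   # assumes the whole 18 GB is visible to Spark
    off_heap = 10 * GIB                 # spark.memory.offHeap.size in the example
    for label, value in [("on-heap unified", on_heap),
                         ("with 10 GB off-heap", on_heap + off_heap)]:
        print(f"{label}: {value / 1e9:.1f} GB (decimal), {value / GIB:.1f} GiB")
```

Printing both units makes it easier to tell a genuine discrepancy from a decimal-versus-binary rounding difference.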
The article also provides a list of recommended readings and references for deeper exploration of Spark memory management.
vivo Internet Technology
