Understanding Spark Executor Memory Management: On‑heap, Off‑heap, and Unified Approaches
This article explains Spark's executor memory architecture, covering on‑heap and off‑heap allocation, static versus unified memory managers, storage and execution memory handling, RDD persistence, eviction policies, and the role of Tungsten's page‑based management in optimizing performance.
1. On‑heap and Off‑heap Memory Planning
Each Spark executor runs as a JVM process, so its memory management builds on JVM memory. Spark explicitly partitions the JVM heap (on‑heap) and also allocates off‑heap memory directly from the system RAM to improve utilization and reduce garbage‑collection overhead.
1.1 On‑heap Memory
The size of on‑heap memory is set by the --executor-memory or spark.executor.memory configuration. Within the heap, Spark divides memory into storage (for cached RDDs and broadcast data), execution (for shuffle and aggregation), and the remaining space for JVM objects and user‑defined objects. Allocation varies with the chosen memory management mode.
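On‑heap management is only logical bookkeeping ("planning"), because the JVM performs the actual allocation and release. To acquire memory, Spark creates an object, the JVM allocates heap space for it, and Spark records the reference. To release memory, Spark drops the reference, and the JVM garbage collector frees the space at some later point, so Spark's accounting can lag behind what is actually free on the heap.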
1.2 Off‑heap Memory
Off‑heap memory is optional: it is enabled with spark.memory.offHeap.enabled and sized with spark.memory.offHeap.size. Because off‑heap regions hold serialized binary data outside the JVM heap, their size can be calculated precisely and they add no extra GC scanning. Off‑heap memory is divided into storage and execution regions in the same way as on‑heap memory.
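As a configuration sketch, the three settings named in sections 1.1 and 1.2 can be set together on a SparkConf. This assumes spark-core is on the classpath; the object name, the application name, and the sizes are illustrative assumptions, not recommendations.

```scala
import org.apache.spark.SparkConf

object ExecutorMemoryConf {
  // Example sizes only (assumptions); pick values that fit your cluster.
  def build(): SparkConf =
    new SparkConf()
      .setAppName("memory-config-sketch")           // hypothetical app name
      .set("spark.executor.memory", "4g")           // on-heap size per executor
      .set("spark.memory.offHeap.enabled", "true")  // turn the off-heap region on
      .set("spark.memory.offHeap.size", "2g")       // size of the off-heap region
}
```

The same keys can be passed on the command line instead, for example with --executor-memory for the heap size.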
1.3 Memory Management Interface
Spark provides a unified MemoryManager API for both storage and execution memory. Key methods include:
def acquireStorageMemory(blockId: BlockId, numBytes: Long, memoryMode: MemoryMode): Boolean
def acquireUnrollMemory(blockId: BlockId, numBytes: Long, memoryMode: MemoryMode): Boolean
def acquireExecutionMemory(numBytes: Long, taskAttemptId: Long, memoryMode: MemoryMode): Long
def releaseStorageMemory(numBytes: Long, memoryMode: MemoryMode): Unit
def releaseExecutionMemory(numBytes: Long, taskAttemptId: Long, memoryMode: MemoryMode): Unit
def releaseUnrollMemory(numBytes: Long, memoryMode: MemoryMode): Unit
The memoryMode argument determines whether the operation targets on‑heap or off‑heap memory.
2. Memory Space Allocation
2.1 Static Memory Management
In the original static model, storage, execution, and other memory regions are fixed for the lifetime of the application. Users configure the fractions via parameters such as spark.storage.memoryFraction and spark.shuffle.memoryFraction. The usable heap memory is calculated as:
Usable storage memory = systemMaxMemory * spark.storage.memoryFraction * spark.storage.safetyFraction
Usable execution memory = systemMaxMemory * spark.shuffle.memoryFraction * spark.shuffle.safetyFraction
The safety fraction reserves a logical buffer to reduce OOM risk, but the JVM ultimately manages this space like any other memory.
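The two formulas above can be worked through with a small helper. This is a sketch only: the object and method names are hypothetical, and the fractions used in the comments (0.6 and 0.9 for storage, 0.2 and 0.8 for shuffle) are assumed to be the legacy defaults and should be checked against the Spark version in use.

```scala
object StaticMemorySketch {
  // Usable memory = systemMaxMemory * memoryFraction * safetyFraction.
  // The unit of the result is whatever unit systemMaxMemory is given in.
  def usable(systemMaxMemory: Long, memoryFraction: Double, safetyFraction: Double): Double =
    systemMaxMemory * memoryFraction * safetyFraction

  // Storage region, using the assumed legacy defaults 0.6 and 0.9.
  def usableStorage(systemMaxMemory: Long): Double =
    usable(systemMaxMemory, 0.6, 0.9)

  // Execution (shuffle) region, using the assumed legacy defaults 0.2 and 0.8.
  def usableExecution(systemMaxMemory: Long): Double =
    usable(systemMaxMemory, 0.2, 0.8)
}
```

Under those assumed defaults, a 1000 MB heap yields roughly 540 MB of usable storage memory and 160 MB of usable execution memory; the rest is the safety buffer and the space left for other objects.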
2.2 Unified Memory Management
Introduced in Spark 1.6, the unified manager lets storage and execution memory share a common region, each dynamically borrowing idle space from the other. The same sharing applies to both the on‑heap and the off‑heap region.
The dynamic borrowing rules are:
Set the baseline split between storage and execution with spark.memory.storageFraction.
If neither side has enough free space, data is spilled to disk; if one side runs short while the other has free space, it can borrow that space.
When storage has borrowed execution space, execution can force storage to evict the borrowed portion to disk and give the space back.
When execution has borrowed storage space, storage cannot force it to be returned, because the constraints of shuffle make that complex to implement; storage has to wait for execution to release it.
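The borrowing rules above can be sketched as a toy pool. This is not Spark's UnifiedMemoryManager: the class, its fields, and the byte counts are hypothetical, eviction is modeled as simply shrinking storageUsed, and storage evicting its own older blocks is left out.

```scala
object UnifiedPoolSketch {
  // total: size of the shared region; storageRegion: baseline reserved for storage.
  final class Pool(val total: Long, val storageRegion: Long) {
    var storageUsed: Long = 0L
    var executionUsed: Long = 0L

    private def free: Long = total - storageUsed - executionUsed

    // Storage may take any free space (including idle execution space),
    // but it can never force execution to give memory back.
    def acquireStorage(n: Long): Boolean =
      if (n <= free) { storageUsed += n; true } else false

    // Execution takes free space first. If that is not enough, it evicts
    // storage blocks that sit above the storage baseline (borrowed space).
    // Returns the number of bytes actually granted.
    def acquireExecution(n: Long): Long = {
      val borrowedByStorage = math.max(0L, storageUsed - storageRegion)
      val toEvict = math.min(borrowedByStorage, math.max(0L, n - free))
      storageUsed -= toEvict // stands in for spilling or dropping blocks
      val granted = math.min(n, free)
      executionUsed += granted
      granted
    }
  }
}
```

With a pool of 100 bytes and a storage baseline of 50, storage can grow to 80 while execution is idle; a later execution request for 40 pushes storage back to 60, and storage cannot push execution back in return.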
3. Storage Memory Management
3.1 RDD Persistence
RDDs are immutable partitioned collections. Persistence (via persist or cache) stores RDD partitions in memory or on disk, reducing recomputation for subsequent actions. The default MEMORY_ONLY level caches data in the heap without serialization.
3.2 RDD Caching Process
Before caching, a partition is accessed as an Iterator. When cached, the partition becomes a Block stored either on‑heap or off‑heap. Blocks may be serialized (SerializedMemoryEntry) or deserialized (DeserializedMemoryEntry) depending on the storage level.
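Turning a partition from an Iterator into a Block held in contiguous storage is called unrolling. During unroll, the executor requests temporary unroll space from the MemoryManager. For serialized partitions the required space can be computed directly; for non‑serialized partitions it is estimated from the records sampled so far, so the estimate can fall short of or exceed what is actually needed, and more space is requested as the unroll proceeds.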
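The incremental reservation during unroll can be sketched as follows. The function, its parameters, the check period of 4 records, and the 1.5x growth factor are all assumptions made for illustration; sizeOf stands in for whatever size estimator is used, and budget stands in for the unroll memory the MemoryManager is willing to grant.

```scala
object UnrollSketch {
  // Materializes `records` while reserving memory in steps.
  // Right(values) if everything fit; Left(n) if the attempt was
  // abandoned after n records because the reservation could not grow.
  def unroll[T](records: Iterator[T], sizeOf: T => Long, budget: Long,
                checkPeriod: Int = 4): Either[Int, Vector[T]] = {
    val buf = Vector.newBuilder[T]
    var reserved = 0L // bytes reserved so far
    var used = 0L     // estimated bytes of the records unrolled so far
    var count = 0
    while (records.hasNext) {
      val r = records.next()
      buf += r
      used += sizeOf(r)
      count += 1
      // Only check every few records, so usage can briefly exceed the reservation.
      if (count % checkPeriod == 0 && used > reserved) {
        val wanted = (used * 1.5).toLong      // ask for headroom
        reserved = math.min(wanted, budget)   // grant is capped by the budget
        if (used > reserved) return Left(count)
      }
    }
    if (used <= budget) Right(buf.result()) else Left(count)
  }
}
```

Because the check runs only periodically, the sketch shows why a per-record estimate can lag behind real usage between checks.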
3.3 Eviction and Disk Spill
When storage memory is full, Spark evicts older blocks in least‑recently‑used (LRU) order. A block qualifies for eviction only if it uses the same MemoryMode as the incoming block, belongs to a different RDD than the incoming block (which avoids cyclic eviction), and its RDD is not currently being read. If the evicted block's storage level includes disk persistence (_useDisk = true), it is spilled to disk; otherwise it is simply discarded.
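The selection rules above can be sketched with an access-ordered java.util.LinkedHashMap, which iterates from least to most recently used. The Block case class, the string block ids, and the choice to evict nothing when not enough space can be freed are assumptions of this sketch.

```scala
import java.util.{LinkedHashMap => JLinkedHashMap}
import scala.collection.mutable.ArrayBuffer

object EvictionSketch {
  // Hypothetical, simplified view of a cached block.
  final case class Block(rddId: Int, size: Long, onHeap: Boolean, beingRead: Boolean = false)

  // Walks the map in LRU order and picks blocks until `needed` bytes are covered.
  // Returns Nil when the eligible blocks cannot free enough space.
  def selectVictims(entries: JLinkedHashMap[String, Block], needed: Long,
                    incomingRddId: Int, onHeap: Boolean): List[String] = {
    val victims = ArrayBuffer.empty[String]
    var freed = 0L
    val it = entries.entrySet().iterator()
    while (freed < needed && it.hasNext) {
      val e = it.next()
      val b = e.getValue
      // Same memory mode, different RDD, and not currently being read.
      if (b.onHeap == onHeap && b.rddId != incomingRddId && !b.beingRead) {
        victims += e.getKey
        freed += b.size
      }
    }
    if (freed >= needed) victims.toList else Nil
  }
}
```

The map must be created with accessOrder set to true, for example new java.util.LinkedHashMap[String, Block](32, 0.75f, true), so that each get moves the block to the most recently used end.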
4. Execution Memory Management
4.1 Task‑level Memory Allocation
Tasks running in the same executor share its execution memory. With N concurrent tasks, each task can hold between 1/(2N) and 1/N of the executor's execution memory. If a task cannot obtain at least its 1/(2N) minimum, it blocks until other tasks release enough memory.
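The per-task bounds can be sketched as a small decision function. The names and the Option-based return value are assumptions of this sketch: Some(bytes) means the request is granted (possibly partially), and None means the task would have to wait.

```scala
object TaskShareSketch {
  // (minimum share, maximum share) for one task when N tasks are active.
  def bounds(poolSize: Long, activeTasks: Int): (Long, Long) = {
    require(activeTasks > 0, "need at least one active task")
    (poolSize / (2L * activeTasks), poolSize / activeTasks)
  }

  // held: bytes the task already owns; free: bytes currently unused in the pool.
  def tryAcquire(requested: Long, held: Long, free: Long,
                 poolSize: Long, activeTasks: Int): Option[Long] = {
    val (minShare, maxShare) = bounds(poolSize, activeTasks)
    // Never let one task exceed 1/N of the pool.
    val maxToGrant = math.min(requested, math.max(0L, maxShare - held))
    val toGrant = math.min(maxToGrant, free)
    // Wait if the request is not fully met and the task is still below 1/(2N).
    if (toGrant < requested && held + toGrant < minShare) None else Some(toGrant)
  }
}
```

For a 1000-byte pool and 4 active tasks the bounds are 125 and 250 bytes, so a request for 300 bytes is capped at 250, while a request for 200 bytes when only 100 are free makes the task wait.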
4.2 Shuffle Memory Usage
Shuffle write uses either ExternalSorter (heap execution memory) or ShuffleExternalSorter (heap or off‑heap depending on Tungsten settings). Shuffle read aggregates data with Aggregator (heap) and may sort results with ExternalSorter (heap).
During shuffle, Spark stores intermediate data in an AppendOnlyMap. When the map grows beyond available execution memory, Spark spills the data to disk and later merges the spill files.
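The spill-and-merge pattern described above can be sketched with ordinary collections. This is not Spark's AppendOnlyMap or its spill format: the in-memory limit is expressed as a number of keys rather than bytes, the spilled runs are kept in a buffer that stands in for files on disk, and summing integers stands in for the aggregation function.

```scala
import scala.collection.mutable

object SpillSketch {
  def aggregate(records: Iterator[(String, Int)], maxInMemoryKeys: Int): Map[String, Int] = {
    val current = mutable.HashMap.empty[String, Int]
    // Each element stands in for one sorted spill file.
    val spills = mutable.ArrayBuffer.empty[List[(String, Int)]]

    records.foreach { case (k, v) =>
      current(k) = current.getOrElse(k, 0) + v
      if (current.size > maxInMemoryKeys) {
        // Out of "memory": write a sorted run and start over with an empty map.
        spills += current.toList.sortBy(_._1)
        current.clear()
      }
    }

    // Merge all spilled runs with whatever is still in memory.
    val merged = mutable.HashMap.empty[String, Int]
    (spills.toList :+ current.toList).foreach { run =>
      run.foreach { case (k, v) => merged(k) = merged.getOrElse(k, 0) + v }
    }
    merged.toMap
  }
}
```

The same key can appear in several runs, which is why the final merge has to combine values again rather than just concatenate the runs.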
4.3 Tungsten Page‑based Management
Tungsten abstracts a memory page as a MemoryBlock, which can describe either on‑heap memory (array + offset) or off‑heap memory (direct address). A location inside a page is identified by a 64‑bit logical address composed of a 13‑bit page number and a 51‑bit offset within that page. This uniform addressing enables efficient sorting without deserialization, improving both memory access speed and CPU utilization.
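The 13-bit plus 51-bit layout can be illustrated with plain bit operations. The object and method names are hypothetical; only the bit widths come from the text above.

```scala
object PageAddressSketch {
  val PageNumberBits = 13
  val OffsetBits = 64 - PageNumberBits          // 51 bits for the offset
  val OffsetMask: Long = (1L << OffsetBits) - 1 // lower 51 bits set

  // Packs a page number into the upper 13 bits and an offset into the lower 51.
  def encode(pageNumber: Int, offset: Long): Long = {
    require(pageNumber >= 0 && pageNumber < (1 << PageNumberBits), "page number out of range")
    require(offset >= 0 && offset <= OffsetMask, "offset out of range")
    (pageNumber.toLong << OffsetBits) | (offset & OffsetMask)
  }

  // Unsigned shift, so page numbers that set the top bit decode correctly.
  def decodePage(address: Long): Int = (address >>> OffsetBits).toInt

  def decodeOffset(address: Long): Long = address & OffsetMask
}
```

With 13 bits for the page number, a single address space of this shape can refer to at most 8192 pages.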
Conclusion
Spark’s memory management combines on‑heap, off‑heap, static, and unified strategies to balance storage and execution needs. Understanding the allocation formulas, eviction policies, and Tungsten’s page‑level handling helps developers tune performance and avoid common pitfalls such as excessive garbage collection or OOM errors.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
