Understanding Spark Memory Management: On‑heap, Off‑heap, and Unified Memory
This article provides a comprehensive overview of Spark's memory management, covering executor memory architecture, the differences between on‑heap and off‑heap memory, static versus unified memory managers, storage and execution memory handling, and practical guidelines for optimizing Spark applications.
Spark, an in‑memory distributed computing engine, relies heavily on its memory management module to achieve high performance. This article explains the fundamentals of Spark memory management, focusing on executor memory, the distinction between on‑heap and off‑heap memory, and the evolution from static to unified memory managers.
1. On‑heap and Off‑heap Memory Planning
An executor runs as a JVM process, so executor memory management is built on top of JVM memory management. Spark plans the on‑heap space in finer detail and also introduces off‑heap memory, which it allocates directly from the worker node's system memory to improve utilization.
1.1 On‑heap Memory
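The size of on‑heap memory is configured via --executor-memory or spark.executor.memory. All concurrent tasks in an executor share this heap: storage memory holds cached RDDs and broadcast data, execution memory holds shuffle data, and the remaining space holds other objects. Because the JVM performs the actual allocation and garbage collection, Spark's on‑heap management is logical bookkeeping: it records a request when an object reference is created and a release when the reference is dropped.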
1.2 Off‑heap Memory
Off‑heap memory stores serialized binary data directly in system memory, outside the reach of the garbage collector, which reduces GC overhead. It is enabled with spark.memory.offHeap.enabled and sized with spark.memory.offHeap.size. Off‑heap memory is divided into storage and execution regions in the same way as on‑heap memory.
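The settings from sections 1.1 and 1.2 can be collected in one place. The following is a minimal sketch, assuming spark-core is on the classpath; the object name, the application name, and the sizes 4g and 2g are illustrative values, not recommendations from the source.

```scala
import org.apache.spark.SparkConf

object ExecutorMemoryConfSketch {
  // Builds a configuration that sizes the on-heap executor memory and
  // enables a separate off-heap region. All values are hypothetical.
  def buildConf(): SparkConf =
    new SparkConf()
      .setAppName("memory-config-sketch")           // hypothetical name
      .set("spark.executor.memory", "4g")           // on-heap size per executor
      .set("spark.memory.offHeap.enabled", "true")  // turn off-heap memory on
      .set("spark.memory.offHeap.size", "2g")       // off-heap size per executor

  def main(args: Array[String]): Unit = {
    val conf = buildConf()
    println(conf.get("spark.executor.memory"))
    println(conf.get("spark.memory.offHeap.size"))
  }
}
```

Executor memory settings only take effect if they are in place before the executors are launched, so in practice they are often passed on the spark-submit command line instead.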
1.3 Memory Management Interface
Spark provides a unified MemoryManager interface for storage and execution memory. Key methods include:
def acquireStorageMemory(blockId: BlockId, numBytes: Long, memoryMode: MemoryMode): Boolean
def acquireUnrollMemory(blockId: BlockId, numBytes: Long, memoryMode: MemoryMode): Boolean
def acquireExecutionMemory(numBytes: Long, taskAttemptId: Long, memoryMode: MemoryMode): Long
def releaseStorageMemory(numBytes: Long, memoryMode: MemoryMode): Unit
def releaseExecutionMemory(numBytes: Long, taskAttemptId: Long, memoryMode: MemoryMode): Unit
def releaseUnrollMemory(numBytes: Long, memoryMode: MemoryMode): Unit
Since Spark 1.6 the default implementation is the unified memory manager. In versions that still ship it, the legacy static manager can be selected by setting spark.memory.useLegacyMode to true; the legacy mode was removed in Spark 3.0.
2. Memory Space Allocation
2.1 Static Memory Management
In the original static model, storage, execution, and other memory sizes are fixed for the lifetime of the application. Available storage memory is calculated as:
availableStorageMemory = systemMaxMemory * spark.storage.memoryFraction * spark.storage.safetyFraction
Available execution memory follows the same formula with spark.shuffle.memoryFraction and spark.shuffle.safetyFraction. Because the boundaries never move, this rigid partitioning can leave one region nearly full while the other sits largely idle.
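As a worked example of the static formulas, the sketch below plugs in the default fractions of the legacy manager as I recall them (0.6 and 0.9 for storage, 0.2 and 0.8 for shuffle); treat those defaults, the object name, and the 4 GiB heap as assumptions rather than values stated in the source.

```scala
object StaticMemorySketch {
  // Assumed legacy defaults; each can be overridden in the Spark configuration.
  val StorageMemoryFraction = 0.6 // spark.storage.memoryFraction
  val StorageSafetyFraction = 0.9 // spark.storage.safetyFraction
  val ShuffleMemoryFraction = 0.2 // spark.shuffle.memoryFraction
  val ShuffleSafetyFraction = 0.8 // spark.shuffle.safetyFraction

  // Storage region: fixed share of the heap, reduced by a safety margin.
  def availableStorageMemory(systemMaxMemory: Long): Long =
    (systemMaxMemory * StorageMemoryFraction * StorageSafetyFraction).toLong

  // Execution region: same shape of formula with the shuffle fractions.
  def availableExecutionMemory(systemMaxMemory: Long): Long =
    (systemMaxMemory * ShuffleMemoryFraction * ShuffleSafetyFraction).toLong

  def main(args: Array[String]): Unit = {
    val systemMaxMemory = 4L * 1024 * 1024 * 1024 // hypothetical 4 GiB heap
    val mib = 1024L * 1024
    println(s"storage:   ${availableStorageMemory(systemMaxMemory) / mib} MiB")
    println(s"execution: ${availableExecutionMemory(systemMaxMemory) / mib} MiB")
  }
}
```

With these fractions roughly 54% of the heap is usable for storage and 16% for execution, and neither region can grow into the other's unused space.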
2.2 Unified Memory Management
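Unified memory management lets storage and execution memory share a common pool, and each side can borrow idle space from the other, which improves overall utilization. When neither side has space left, data is spilled to disk. The borrowing is not symmetric: if storage has occupied space that belongs to execution, execution can force it back by evicting cached blocks, whereas storage cannot reclaim space that execution has borrowed and must wait until the tasks release it.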
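The borrowing rules can be illustrated with a toy model. This is a sketch under simplifying assumptions, not Spark's UnifiedMemoryManager: the class name and its methods are hypothetical, sizes are plain byte counts, and "eviction" is reduced to lowering a counter.

```scala
// Toy model of one pool shared by storage and execution.
// storageRegion is the part of the pool nominally reserved for storage.
final class UnifiedPoolSketch(val total: Long, val storageRegion: Long) {
  require(storageRegion >= 0 && storageRegion <= total)

  private var storageUsed = 0L
  private var executionUsed = 0L

  def storage: Long = storageUsed
  def execution: Long = executionUsed
  private def free: Long = total - storageUsed - executionUsed

  // Storage may use any idle space, but it never evicts execution memory.
  def acquireStorage(numBytes: Long): Boolean =
    if (numBytes <= free) { storageUsed += numBytes; true } else false

  // Execution may take back space that storage holds beyond storageRegion.
  // Returns the number of bytes actually granted.
  def acquireExecution(numBytes: Long): Long = {
    if (numBytes > free) {
      val borrowedByStorage = math.max(0L, storageUsed - storageRegion)
      val toEvict = math.min(borrowedByStorage, numBytes - free)
      storageUsed -= toEvict // stands in for evicting cached blocks
    }
    val granted = math.min(numBytes, free)
    executionUsed += granted
    granted
  }
}

object UnifiedPoolDemo {
  def main(args: Array[String]): Unit = {
    val pool = new UnifiedPoolSketch(total = 100L, storageRegion = 50L)
    println(pool.acquireStorage(80L))   // storage borrows idle execution space
    println(pool.acquireExecution(40L)) // execution reclaims part of it
    println(s"storage=${pool.storage} execution=${pool.execution}")
  }
}
```

In the demo, storage first grows past its nominal region, and the later execution request pushes it back toward that region instead of failing.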
3. Storage Memory Management
3.1 RDD Persistence
RDDs are persisted using the Storage module, which decouples logical RDDs from physical storage. Persistence levels (e.g., MEMORY_ONLY, MEMORY_AND_DISK) are defined by the StorageLevel class with flags for disk, on‑heap, off‑heap, serialization, and replication.
3.2 Caching Process
When an RDD is cached, each Partition becomes a Block. Unrolling converts the partition's non‑contiguous iterator data into a contiguous memory region, stored either serialized (SerializedMemoryEntry) or deserialized (DeserializedMemoryEntry). Unroll space is requested from the MemoryManager; if too little space is free, the request triggers eviction of other cached blocks.
3.3 Eviction and Disk Spill
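Blocks are evicted in least‑recently‑used (LRU) order, subject to three conditions: the evicted block must use the same memory mode (on‑heap or off‑heap) as the new block, it must not belong to the same RDD as the new block, and it must not currently be being read. If the evicted block's storage level includes disk, it is written to disk; otherwise it is simply discarded.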
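The selection rule above can be sketched with an access-ordered java.util.LinkedHashMap, whose iteration order runs from least to most recently used. This is an illustrative model only: CachedBlock, its fields, and selectVictims are hypothetical names, and real block metadata is richer than this.

```scala
import java.util.{LinkedHashMap => JLinkedHashMap}
import scala.collection.mutable.ArrayBuffer

// Minimal stand-in for the metadata of one cached block.
final case class CachedBlock(
    rddId: Int,
    size: Long,
    offHeap: Boolean = false,
    beingRead: Boolean = false)

object LruEvictionSketch {
  // Walks the map in LRU order and collects block ids until `needed` bytes
  // would be freed. Returns an empty Seq if not enough space can be freed.
  def selectVictims(
      entries: JLinkedHashMap[String, CachedBlock],
      needed: Long,
      incomingRddId: Int,
      offHeap: Boolean): Seq[String] = {
    val victims = ArrayBuffer.empty[String]
    var freed = 0L
    val it = entries.entrySet().iterator()
    while (freed < needed && it.hasNext) {
      val e = it.next()
      val block = e.getValue
      val evictable =
        block.offHeap == offHeap &&      // same memory mode as the new block
        block.rddId != incomingRddId &&  // never evict the new block's own RDD
        !block.beingRead                 // never evict a block that is in use
      if (evictable) {
        victims += e.getKey
        freed += block.size
      }
    }
    if (freed >= needed) victims.toSeq else Seq.empty
  }

  def main(args: Array[String]): Unit = {
    // accessOrder = true makes get() move an entry to the most-recent end.
    val entries = new JLinkedHashMap[String, CachedBlock](16, 0.75f, true)
    entries.put("a", CachedBlock(rddId = 1, size = 10L))
    entries.put("b", CachedBlock(rddId = 2, size = 10L))
    entries.put("c", CachedBlock(rddId = 1, size = 10L))
    entries.get("a") // "a" is now the most recently used block
    println(selectVictims(entries, needed = 15L, incomingRddId = 3, offHeap = false))
  }
}
```

Returning an empty result when the target cannot be met reflects the idea that evicting blocks is pointless if the new block still would not fit.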
4. Execution Memory Management
4.1 Multi‑Task Allocation
Concurrent tasks in an executor share execution memory. With N tasks running, each task can hold between 1/2N and 1/N of the pool. A task whose request cannot bring its holdings up to at least 1/2N of the pool blocks until other tasks release enough memory.
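The 1/2N and 1/N bounds translate into a small amount of arithmetic. The sketch below is a simplified reading of that rule, assuming a fixed pool size and a known number of active tasks; the object and function names are hypothetical.

```scala
object ExecutionShareSketch {
  // Lower and upper bound on what one task may hold: (pool / 2N, pool / N).
  def shareBounds(poolSize: Long, activeTasks: Int): (Long, Long) = {
    require(activeTasks > 0, "need at least one active task")
    (poolSize / (2L * activeTasks), poolSize / activeTasks)
  }

  // Bytes that can be granted now: limited by the request, by the room left
  // under the task's 1/N cap, and by the free space in the pool.
  def grant(requested: Long, alreadyHeld: Long, free: Long,
            poolSize: Long, activeTasks: Int): Long = {
    val (_, maxShare) = shareBounds(poolSize, activeTasks)
    val room = math.max(0L, maxShare - alreadyHeld)
    math.min(requested, math.min(room, free))
  }

  // A task waits if the grant is partial and would still leave it
  // below the guaranteed 1/2N minimum.
  def wouldBlock(requested: Long, alreadyHeld: Long, free: Long,
                 poolSize: Long, activeTasks: Int): Boolean = {
    val (minShare, _) = shareBounds(poolSize, activeTasks)
    val g = grant(requested, alreadyHeld, free, poolSize, activeTasks)
    g < requested && alreadyHeld + g < minShare
  }

  def main(args: Array[String]): Unit = {
    println(shareBounds(poolSize = 1000L, activeTasks = 4))
    println(grant(requested = 300L, alreadyHeld = 0L, free = 1000L,
      poolSize = 1000L, activeTasks = 4))
    println(wouldBlock(requested = 200L, alreadyHeld = 0L, free = 50L,
      poolSize = 1000L, activeTasks = 4))
  }
}
```

The upper bound keeps one task from starving the others, while the lower bound guarantees each task enough memory to make progress before it has to spill.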
4.2 Shuffle Memory Usage
Shuffle write uses either ExternalSorter, which works in on‑heap memory, or ShuffleExternalSorter on the Tungsten sort path, which can use off‑heap memory when it is enabled. Shuffle read aggregates data in the executor's on‑heap memory. When execution memory is exhausted, Spark spills the in‑memory data to disk and merges the spilled files later.
4.3 Tungsten Page‑Based Memory
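Tungsten abstracts memory as pages addressed by 64‑bit logical addresses, so on‑heap and off‑heap pages are handled uniformly. Each page is represented by a MemoryBlock, which locates its data with an object reference plus an offset for on‑heap memory, or with a null reference plus an absolute address for off‑heap memory. The pages used by a task are tracked in a page table.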
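A 64‑bit logical address can carry both a page number and an offset within that page. The sketch below assumes the split I believe Spark's TaskMemoryManager uses, 13 bits for the page number and 51 bits for the offset; that split and all names in the code are assumptions, since the text above only states that addresses are 64 bits wide.

```scala
object TungstenAddressSketch {
  val PageNumberBits = 13                  // assumed width of the page number
  val OffsetBits = 64 - PageNumberBits     // remaining 51 bits for the offset
  val OffsetMask: Long = (1L << OffsetBits) - 1

  // Packs a page number (upper bits) and an in-page offset (lower bits).
  def encode(pageNumber: Int, offsetInPage: Long): Long = {
    require(pageNumber >= 0 && pageNumber < (1 << PageNumberBits), "page number out of range")
    require(offsetInPage >= 0 && offsetInPage <= OffsetMask, "offset out of range")
    (pageNumber.toLong << OffsetBits) | offsetInPage
  }

  // Unsigned shift, so a set top bit does not turn into a negative page number.
  def pageNumber(address: Long): Int = (address >>> OffsetBits).toInt

  def offsetInPage(address: Long): Long = address & OffsetMask

  def main(args: Array[String]): Unit = {
    val address = encode(pageNumber = 5, offsetInPage = 4096L)
    println(s"page=${pageNumber(address)} offset=${offsetInPage(address)}")
  }
}
```

Decoding the page number gives the index into the page table, and the offset is then resolved against that page's MemoryBlock, which is what lets the same address format serve both memory modes.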
Overall, Spark’s storage and execution memory employ distinct management strategies: storage uses a LinkedHashMap of Blocks, while execution leverages AppendOnlyMap and Tungsten’s page‑based system to optimize shuffle and task performance.
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
