Understanding Flink's Memory Model: On‑Heap, Off‑Heap, and Memory Management
This article explains Flink's memory architecture, covering on‑heap and off‑heap memory concepts, garbage collection, allocation strategies, memory segments, buffers, the memory manager, and how network transmission and back‑pressure are handled to achieve efficient streaming processing.
Introduction
Before diving into the memory model, some basic knowledge is required.
1. On‑Heap Memory
1.1 What is on‑heap memory
The JVM divides the memory it manages into several regions; the largest region is the heap, which stores all object instances. The heap is created at JVM startup, shared by all threads, and is the main area for garbage collection, often called the GC heap.
1.2 Garbage collection of on‑heap memory
The heap is divided into Young, Old, and Permanent generations. The Young generation follows an 8:1:1 ratio (Eden, Survivor1, Survivor2). During garbage collection, algorithms may cause physical memory fragmentation. Full GC scans the entire heap, pausing most Java threads; the impact of GC is proportional to the amount of data stored in the heap.
2. Off‑Heap Memory
2.1 Origin of off‑heap memory
To reduce long GC pauses and make memory visible to the operating system, Flink uses off‑heap memory, which is managed directly by the OS and can be accessed by other processes or devices (e.g., GPU).
2.2 Off‑heap allocation
Java NIO provides ByteBuffer for off‑heap access. The following code allocates a 10 MB off‑heap buffer:
import sun.nio.ch.DirectBuffer;
import java.nio.ByteBuffer;
public class TestDirectByteBuffer {
public static void main(String[] args) throws Exception {
while (true) {
ByteBuffer buffer = ByteBuffer.allocateDirect(10 * 1024 * 1024);
}
}
}This creates a 10 MB off‑heap memory region.
3. Pros, Cons and Relation to On‑Heap Memory
3.1 Advantages of off‑heap
Can easily allocate large memory blocks, good scalability for big memory.
Reduces GC‑induced pause time.
Managed directly by OS, accessible by other processes/devices, avoiding extra copies.
Ideal for scenarios with few allocations but frequent reads/writes.3.2 Disadvantages of off‑heap
Prone to memory leaks, hard to debug.
Complex object structures require serialization, which can be costly.3.3 Relationship between on‑heap and off‑heap
Although off‑heap memory is not subject to GC, the ByteBuffer object that represents it lives on the heap, storing metadata such as the off‑heap address. When the buffer object is reclaimed, the corresponding off‑heap region is released as well.
0 Overview
Flink uses its own memory management to overcome JVM limitations:
Low object storage density in the JVM.
Full GC can cause seconds‑to‑minutes pauses.
OutOfMemoryError threatens stability.
Cache misses due to non‑contiguous object layout.
Flink serializes objects into pre‑allocated MemorySegment blocks (default 32 KB), allowing direct binary read/write without deserialization.
1 Memory Model
1.1 JobManager Memory Model
Configuration files such as JobManagerFlinkMemory.java unify TM memory management. In Flink 1.11 the JM options align with TM options.
1.2 TaskManager Memory Model
Since Flink 1.10, TM memory model and configuration options have been overhauled, allowing stricter control of memory consumption.
Key memory categories (all configurable):
Framework Heap Memory (default 128 MB)
Task Heap Memory (configurable)
Off‑Heap Memory: Framework Off‑Heap, Task Off‑Heap, Network Memory
Managed Memory (used by sort, hash tables, RocksDB, etc.)
JVM specific memory and overhead
Example configuration snippets:
taskmanager.memory.network.fraction: 0.1
taskmanager.memory.network.min: 64mb
taskmanager.memory.network.max: 1gb taskmanager.memory.managed.fraction=0.4
taskmanager.memory.managed.size taskmanager.memory.jvm-overhead.min=192mb
taskmanager.memory.jvm-overhead.max=1gb
taskmanager.memory.jvm-overhead.fraction=0.12 Memory Data Structures
MemorySegment : the smallest allocation unit (default 32 KB). It can be a heap byte array or an off‑heap DirectByteBuffer. Provides efficient binary read/write.
MemoryPage : a view on top of MemorySegment, exposing DataInputView and DataOutputView for seamless cross‑segment reads/writes.
Buffer : wraps a MemorySegment for network transmission. Implemented by NetworkBuffer, each containing one MemorySegment.
BufferPool : manages buffers, providing allocation, release, and destruction. Implemented by LocalBufferPool; each Task has its own pool. BufferPoolFactory creates a single NetworkBufferPool per TaskManager.
3 Memory Manager
The MemoryManager handles memory used for sorting, hash tables, intermediate results, and off‑heap state back‑ends (e.g., RocksDB). Prior to Flink 1.10 it managed all TM memory; from 1.10 onward it operates at the slot level.
4 Memory Management in Network Transmission
Data transmitted over the network is written to a Task's InputGate, processed, and then written to a ResultPartition. Both input and output buffers are backed by MemorySegment objects.
During startup, the TaskManager creates a NetworkEnvironment which holds a NetworkBufferPool (default 2048 buffers). Each Task gets a LocalBufferPool for its InputGate and ResultPartition. Buffers are allocated on demand; if a pool runs out, the Task is back‑pressured, causing upstream tasks to pause.
When a buffer is consumed, Buffer.recycle() returns it to the local pool, which may return it to the global pool if the local pool exceeds its capacity. This mechanism, together with Netty's high/low watermarks, implements Flink's robust back‑pressure system.
Overall, the fixed‑size buffer pools act like bounded queues, ensuring that the production rate of data never exceeds the consumption rate, and that back‑pressure propagates throughout the pipeline.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
