Deploying Apache Flink 1.12 on Kubernetes: High‑Availability Architecture and DataStream Batch Execution

This article explains how Flink 1.12 introduces production‑grade Kubernetes high‑availability, details the underlying architecture and deployment steps, and shows how the DataStream API can run in batch mode using runtime‑mode configuration and example commands.

Big Data Technology & Architecture

Feb 1, 2021

Earlier Flink releases lacked some features that production deployments need. Version 1.12 closes two of these gaps: it provides a production‑grade high‑availability solution on Kubernetes that allows JobManager failover without ZooKeeper, and it lets the DataStream API run in batch mode.

High‑Availability on Kubernetes

Flink 1.12 integrates with Kubernetes (K8s) to manage resources and uses ConfigMap objects to store the metadata needed for JobManager recovery. The client submits resource descriptions (ConfigMap, Services, Deployments) to the K8s API server, which creates the necessary pods.

The workflow includes:

Flink client connects to the K8s API server and submits the cluster description.

K8s creates JobManager and TaskManager pods based on the deployments.

The user submits a job via the Flink client; the job graph is uploaded through the REST client.

JobMaster requests slots from the KubernetesResourceManager, which allocates TaskManager pods.

TaskManagers register with the SlotManager and receive slots to execute the job.
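The slot request and registration steps can be pictured with a toy model. The sketch below is plain Java and uses no Flink classes: SlotAllocationSketch, TaskManagerPod, and allocate are hypothetical names introduced only for illustration. It assumes every TaskManager pod offers the same fixed number of slots (in a real cluster that number comes from taskmanager.numberOfTaskSlots) and that pods are started until the registered slots cover the request.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the slot-request workflow; none of these types are Flink classes.
final class SlotAllocationSketch {

    // Hypothetical stand-in for a TaskManager pod that offers a fixed number of slots.
    static final class TaskManagerPod {
        final String name;
        final int slots;

        TaskManagerPod(String name, int slots) {
            this.name = name;
            this.slots = slots;
        }
    }

    // Hypothetical stand-in for the resource manager: it starts pods one by one
    // until the slots they register cover what the JobMaster asked for.
    static List<TaskManagerPod> allocate(int requestedSlots, int slotsPerTaskManager) {
        if (requestedSlots < 0 || slotsPerTaskManager <= 0) {
            throw new IllegalArgumentException("slot counts must be positive");
        }
        List<TaskManagerPod> pods = new ArrayList<>();
        int registeredSlots = 0;
        while (registeredSlots < requestedSlots) {
            TaskManagerPod pod =
                new TaskManagerPod("taskmanager-" + (pods.size() + 1), slotsPerTaskManager);
            pods.add(pod);                 // the pod is started
            registeredSlots += pod.slots;  // and registers its slots
        }
        return pods;
    }

    public static void main(String[] args) {
        // A job that needs 5 slots, with 2 slots per TaskManager pod.
        for (TaskManagerPod pod : allocate(5, 2)) {
            System.out.println(pod.name + " offers " + pod.slots + " slots");
        }
    }
}
```

With these assumptions, a request for 5 slots at 2 slots per pod results in three pods, one of which has a spare slot.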

Key advantages of K8s over YARN include native container isolation, integration with cloud‑native monitoring systems such as Prometheus, and better multi‑tenant support, though K8s may add operational complexity.

1. K8s as the de‑facto container management standard offers resource and network isolation, security, and multi‑tenant advantages
2. Seamless integration with cloud‑native monitoring systems like Prometheus
3. YARN lacks load‑balance and mixed‑deployment features, resulting in lower resource utilization

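The deployment example below is a session cluster created with kubectl. It uses a ConfigMap for configuration files such as flink-conf.yaml and defines a Service and Deployments for the JobManager and TaskManager. High availability is enabled with the following flink-conf.yaml entries: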

high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
high-availability.storageDir: s3:///flink/recovery
kubernetes.cluster-id: cluster1337

# Create ConfigMap and services
$ kubectl create -f flink-configuration-configmap.yaml
$ kubectl create -f jobmanager-service.yaml
# Deploy JobManager and TaskManager
$ kubectl create -f jobmanager-session-deployment.yaml
$ kubectl create -f taskmanager-session-deployment.yaml

The sample JobManager deployment (YAML) below shows the container image, ports, liveness probe, and the volume mount for configuration. The TaskManager deployment is defined analogously.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-jobmanager
spec:
  replicas: 1
  selector:
    matchLabels:
      app: flink
      component: jobmanager
  template:
    metadata:
      labels:
        app: flink
        component: jobmanager
    spec:
      containers:
      - name: jobmanager
        image: flink:1.12.0-scala_2.11
        args: ["jobmanager"]
        ports:
        - containerPort: 6123
          name: rpc
        - containerPort: 6124
          name: blob-server
        - containerPort: 8081
          name: webui
        livenessProbe:
          tcpSocket:
            port: 6123
          initialDelaySeconds: 30
          periodSeconds: 60
        volumeMounts:
        - name: flink-config-volume
          mountPath: /opt/flink/conf
        securityContext:
          runAsUser: 9999
      volumes:
      - name: flink-config-volume
        configMap:
          name: flink-config
          items:
          - key: flink-conf.yaml
            path: flink-conf.yaml
          - key: log4j-console.properties
            path: log4j-console.properties

DataStream API Batch Execution

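Flink 1.12 adds a runtime mode setting (execution.runtime-mode) that lets the DataStream API run in STREAMING, BATCH, or AUTOMATIC mode, unifying batch and stream processing. STREAMING is the default; AUTOMATIC lets Flink choose the mode based on whether all sources are bounded. The mode can be passed on the command line when the job is submitted: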

$ bin/flink run -Dexecution.runtime-mode=BATCH examples/streaming/WordCount.jar

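The mode can also be set in code, although the Flink documentation recommends setting it at submission time so that the same program can run in either mode: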

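import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setRuntimeMode(RuntimeExecutionMode.BATCH);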
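How AUTOMATIC arrives at an effective mode can be sketched in a few lines. This is a plain Java illustration, not Flink's implementation: RuntimeModeSketch, Mode, and resolve are hypothetical names, and each source is reduced to a single boolean that says whether it is bounded.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: mirrors the idea behind AUTOMATIC, not Flink's own code.
final class RuntimeModeSketch {

    enum Mode { STREAMING, BATCH, AUTOMATIC }

    // Resolve the mode a job would effectively run in.
    // Each entry of sourceIsBounded says whether one source is bounded.
    static Mode resolve(Mode configured, List<Boolean> sourceIsBounded) {
        if (configured != Mode.AUTOMATIC) {
            // An explicit setting wins. Real Flink additionally rejects BATCH
            // for jobs with unbounded sources; that check is not modelled here.
            return configured;
        }
        boolean allBounded = sourceIsBounded.stream().allMatch(bounded -> bounded);
        return allBounded ? Mode.BATCH : Mode.STREAMING;
    }

    public static void main(String[] args) {
        // Two bounded sources (for example, files).
        System.out.println(resolve(Mode.AUTOMATIC, Arrays.asList(true, true)));
        // One bounded and one unbounded source (for example, a file and a message queue).
        System.out.println(resolve(Mode.AUTOMATIC, Arrays.asList(true, false)));
    }
}
```

The first call resolves to BATCH and the second to STREAMING, because a single unbounded source is enough to require streaming execution.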

Streaming and batch execution also differ in trigger behavior, state backend usage, watermark relevance, failure handling, and the set of supported operators.
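One visible consequence of the execution mode is what a keyed aggregation emits. According to the Flink documentation, rolling operations such as sum() or reduce() emit an update for every record in STREAMING mode and only the final result per key in BATCH mode. The sketch below imitates that behavior for a word count in plain Java. It does not use the DataStream API, and RollingVsFinalSketch, streamingCounts, and batchCounts are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simulation of the output difference only; no Flink classes are involved.
final class RollingVsFinalSketch {

    // Streaming-style: emit the updated count after every incoming word.
    static List<String> streamingCounts(List<String> words) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        List<String> emitted = new ArrayList<>();
        for (String word : words) {
            int updated = counts.merge(word, 1, Integer::sum);
            emitted.add(word + "=" + updated);
        }
        return emitted;
    }

    // Batch-style: consume the whole bounded input, then emit one result per word.
    static List<String> batchCounts(List<String> words) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        List<String> emitted = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            emitted.add(entry.getKey() + "=" + entry.getValue());
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a");
        System.out.println(streamingCounts(words)); // [a=1, b=1, a=2]
        System.out.println(batchCounts(words));     // [a=2, b=1]
    }
}
```

The batch variant can skip the intermediate updates only because the input is bounded, so the final count for each key is known once all records have been read.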

Overall, the guide demonstrates how to configure, deploy, and manage a Flink 1.12 session cluster on Kubernetes with high availability and how to leverage the new batch capabilities of the DataStream API.

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
