
Deploying Apache Flink on Kubernetes: A Step‑by‑Step Guide

This tutorial explains how to run Apache Flink jobs on Kubernetes: building Docker images, deploying the JobManager and TaskManager components with Kubernetes manifests, configuring high availability with ZooKeeper and HDFS, and using savepoints and scaling techniques to manage and extend Flink streaming applications.

Big Data Technology & Architecture

Sep 21, 2019


Overview: Kubernetes is a popular container orchestration system that can run web services and big‑data processing applications such as Apache Flink. Combining the two yields a robust, scalable data‑processing platform.

The key steps for running a Flink job on Kubernetes in job cluster mode, where the cluster is dedicated to a single job, are:

Compile and package the Flink job JAR.

Build a Docker image containing Flink runtime and the JAR.

Deploy the JobManager using a Kubernetes Job object.

Expose the JobManager via a Service (NodePort) for UI and REST API.

Deploy TaskManagers with a Deployment, adjusting replicas for scaling.

Configure high availability with ZooKeeper and HDFS if needed.

Use Flink savepoints to stop, resume, or scale jobs.
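The three deployment steps above can be sketched as a small Python script that emits the manifests as JSON, which kubectl accepts alongside YAML. This is a minimal sketch under stated assumptions, not the original article's manifests: the job name, image tag, and labels are hypothetical, and the start scripts (standalone-job.sh, taskmanager.sh) and their -D options should be checked against the Flink version in your image. Depending on that version, the job's entry class may also need to be passed with --job-classname.

```python
import json

JOB_NAME = "flink-demo"               # hypothetical job name
IMAGE = "flink-on-kubernetes:0.0.1"   # hypothetical tag of the image built in step 2
JM_HOST = f"{JOB_NAME}-jobmanager"    # Service name doubles as the RPC address
PORTS = {"rpc": 6123, "blob": 6124, "ui": 8081}


def jobmanager_job():
    """Kubernetes Job that runs the JobManager together with the job JAR."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": JM_HOST},
        "spec": {"template": {
            "metadata": {"labels": {"app": "flink", "instance": JM_HOST}},
            "spec": {
                "restartPolicy": "OnFailure",
                "containers": [{
                    "name": "jobmanager",
                    "image": IMAGE,
                    "command": ["/opt/flink/bin/standalone-job.sh"],
                    "args": ["start-foreground",
                             f"-Djobmanager.rpc.address={JM_HOST}",
                             "-Dblob.server.port=6124"],
                    "ports": [{"name": n, "containerPort": p}
                              for n, p in PORTS.items()],
                }],
            },
        }},
    }


def jobmanager_service():
    """NodePort Service exposing the JobManager RPC, blob, and UI/REST ports."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": JM_HOST},
        "spec": {
            "type": "NodePort",
            "selector": {"app": "flink", "instance": JM_HOST},
            "ports": [{"name": n, "port": p} for n, p in PORTS.items()],
        },
    }


def taskmanager_deployment(replicas=1):
    """Deployment of TaskManagers; `replicas` is the scaling knob."""
    labels = {"app": "flink", "instance": f"{JOB_NAME}-taskmanager"}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f"{JOB_NAME}-taskmanager"},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{
                    "name": "taskmanager",
                    "image": IMAGE,
                    "command": ["/opt/flink/bin/taskmanager.sh"],
                    "args": ["start-foreground",
                             f"-Djobmanager.rpc.address={JM_HOST}"],
                }]},
            },
        },
    }


def render(replicas=1):
    """Bundle all three objects into one `kind: List` document."""
    items = [jobmanager_job(), jobmanager_service(),
             taskmanager_deployment(replicas)]
    return json.dumps({"apiVersion": "v1", "kind": "List", "items": items},
                      indent=2)


if __name__ == "__main__":
    print(render(replicas=2))
```

Saved as, say, manifests.py (a hypothetical file name), the output can be piped into kubectl apply -f - to create all three objects at once.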

The original article provides an example Dockerfile for the base Flink image and a custom image that copies in the Hadoop and job JARs. It also shows the commands for Minikube setup, Docker build, and kubectl operations, along with YAML manifests for the JobManager, Service, TaskManager, and HA configurations.
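As an illustration of what the HA configuration involves, the sketch below assembles Flink's ZooKeeper high-availability options as -D flags that could be appended to the JobManager and TaskManager start commands. The option keys are standard Flink configuration names; the ZooKeeper address, HDFS path, and cluster id are hypothetical placeholders, and the helper itself is not taken from the original article.

```python
def ha_args(zk_quorum, storage_dir, cluster_id, zk_root="/flink"):
    """Return -D flags that switch a Flink process to ZooKeeper-based HA.

    zk_quorum   -- comma-separated ZooKeeper host:port list
    storage_dir -- durable directory (for example on HDFS) for JobManager metadata
    cluster_id  -- ZooKeeper node under zk_root that isolates this job's state
    """
    options = {
        "high-availability": "zookeeper",
        "high-availability.zookeeper.quorum": zk_quorum,
        "high-availability.zookeeper.path.root": zk_root,
        "high-availability.cluster-id": cluster_id,
        "high-availability.storageDir": storage_dir,
    }
    return [f"-D{key}={value}" for key, value in options.items()]


if __name__ == "__main__":
    # All three values below are placeholders for illustration only.
    for flag in ha_args("zookeeper:2181",
                        "hdfs://namenode:8020/flink/recovery",
                        "/flink-demo"):
        print(flag)
```

Because the JobManager and every TaskManager must agree on these values, generating them from one function keeps the two manifests in sync.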

The guide also discusses two scaling strategies, changing the TaskManager replica count and running the flink modify command, and notes that scaling under the HA setup had limitations at the time of writing. Finally, it mentions that native Kubernetes support was on Flink's roadmap.
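Scaling through a savepoint starts by stopping the job with a savepoint, which can be done through the JobManager's REST API. The sketch below only builds the request for Flink's POST /jobs/:jobid/savepoints endpoint and does not send it. The REST address, job id, and target directory are hypothetical values, and the helper is an illustration rather than the original article's procedure.

```python
import json
import urllib.request


def savepoint_request(rest_url, job_id, target_dir, cancel_job=True):
    """Build (but do not send) the REST call that triggers a savepoint.

    With cancel_job=True the job is cancelled once the savepoint completes,
    which is the usual first step before restarting at a new parallelism.
    """
    body = json.dumps({"target-directory": target_dir,
                       "cancel-job": cancel_job})
    return urllib.request.Request(
        url=f"{rest_url.rstrip('/')}/jobs/{job_id}/savepoints",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder address, job id, and directory for illustration only.
    req = savepoint_request("http://localhost:8081", "<job-id>",
                            "hdfs:///flink/savepoints")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen sends it; the response carries a request-id that can be polled at /jobs/:jobid/savepoints/:triggerid until the savepoint path is reported. The job is then redeployed from that path (for example with the --fromSavepoint argument, if your Flink version's start script supports it) using the new parallelism and TaskManager replica count.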



Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
