
Apache Flink Release History and Key Features from 1.7 to 1.12

This article provides a comprehensive overview of Apache Flink's major releases from version 1.7 through 1.12, detailing new functionalities such as Scala 2.12 support, state schema evolution, Blink planner integration, Kubernetes native deployment, Python (PyFlink) enhancements, and numerous performance and stability improvements for stream and batch processing.

Big Data Technology & Architecture

Aug 25, 2021


Flink 1.7

Introduced full Scala 2.12 support, exactly‑once semantics for the S3 StreamingFileSink, state schema evolution, MATCH_RECOGNIZE in Streaming SQL, temporal tables and joins, a versioned REST API, a Kafka 2.0 connector, and local recovery to speed up failover.
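The idea behind a temporal table join can be modeled in a few lines of plain Python. This is only a conceptual sketch, not Flink code: the `RATE_HISTORY` data, the function names, and the integer timestamps are all made up for illustration. Each order row is joined with the version of the rates table that was valid at the order's own time, rather than with the latest version.

```python
from bisect import bisect_right

# Hypothetical version history of a currency-rates table: (valid_from, rate),
# sorted by valid_from. Names and values are illustrative only.
RATE_HISTORY = {
    "EUR": [(0, 1.10), (10, 1.15), (20, 1.20)],
    "GBP": [(0, 1.30), (15, 1.25)],
}


def rate_as_of(currency, ts):
    """Return the rate version that was valid at time ts, or None."""
    versions = RATE_HISTORY.get(currency, [])
    # Index of the last version whose valid_from <= ts.
    idx = bisect_right([valid_from for valid_from, _ in versions], ts) - 1
    return versions[idx][1] if idx >= 0 else None


def temporal_join(orders):
    """Join each (order_time, currency, amount) row with the rate valid at order_time."""
    joined = []
    for order_time, currency, amount in orders:
        rate = rate_as_of(currency, order_time)
        if rate is not None:  # inner-join behaviour: drop rows with no valid version
            joined.append((order_time, currency, amount, rate))
    return joined


orders = [(5, "EUR", 100), (12, "EUR", 100), (16, "GBP", 50)]
result = temporal_join(orders)
print(result)
```

The two EUR orders pick up different rates because a new version became valid between them, which is the property that distinguishes a temporal join from a regular join against the current table contents.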

Flink 1.8

Completed the state schema evolution work begun in 1.7 and added TTL‑based continuous cleanup of old state, user‑defined functions and aggregations in SQL pattern detection (MATCH_RECOGNIZE), an RFC‑compliant CSV format, a new KafkaDeserializationSchema exposing the ConsumerRecord, watermark options for the Kinesis consumer, a consumer for DynamoDB Streams, and global aggregation for sub‑task coordination.
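To make the TTL idea concrete, here is a small plain-Python model of keyed state whose entries expire a fixed time after their last write. It is a simplified sketch and not Flink's implementation: the `TtlState` class is hypothetical, time is passed in explicitly as an integer, and cleanup is a full scan rather than the incremental or compaction-based cleanup a real state backend would use.

```python
class TtlState:
    """Keyed state whose entries expire `ttl` time units after their last write."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._entries = {}  # key -> (value, last_write_time)

    def put(self, key, value, now):
        self._entries[key] = (value, now)

    def get(self, key, now):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, written_at = entry
        if now - written_at >= self.ttl:
            # Expired entries are never returned, even before cleanup runs.
            return None
        return value

    def cleanup(self, now):
        """Drop all expired entries and return how many were removed."""
        expired = [k for k, (_, t) in self._entries.items() if now - t >= self.ttl]
        for k in expired:
            del self._entries[k]
        return len(expired)

    def size(self):
        return len(self._entries)


state = TtlState(ttl=10)
state.put("a", 1, now=0)
state.put("b", 2, now=8)
visible_a = state.get("a", now=10)   # expired, so hidden from reads
visible_b = state.get("b", now=10)   # still within its TTL
size_before = state.size()           # expired entry still occupies space
removed = state.cleanup(now=12)
size_after = state.size()
```

The gap between `size_before` and `size_after` is the point of continuous cleanup: without it, expired entries stay invisible to reads but keep consuming storage until something removes them.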

Flink 1.9

Released in August 2019 with batch‑oriented fine‑grained recovery, a Blink‑based query engine for Table API/SQL (preview), State Processor API for flexible savepoint manipulation, stop‑with‑savepoint semantics, a redesigned Web UI, and preview Hive integration.

Flink 1.10

Billed as the largest release up to that point, bringing extensive performance and stability optimizations, native Kubernetes integration (beta), PyFlink enhancements, full Blink integration, production‑ready Hive support, expanded SQL DDL (watermarks, temporal tables), and a pluggable module system for built‑in functions.
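A watermark declared in DDL, for example with a clause such as `WATERMARK FOR ts AS ts - INTERVAL '5' SECOND`, follows a bounded-out-of-orderness strategy: the watermark trails the largest event time seen so far by a fixed delay. The plain-Python sketch below models that rule under simplifying assumptions (integer timestamps, a watermark update on every event, and a hypothetical `assign_watermarks` helper); it is not Flink code.

```python
def assign_watermarks(event_times, max_delay):
    """Return (event_time, watermark_after_event, is_late) for each event in arrival order."""
    max_seen = None
    out = []
    for ts in event_times:
        # Watermark in effect before this event is processed.
        current_wm = None if max_seen is None else max_seen - max_delay
        # An event is treated as late if its timestamp is not greater than that watermark.
        is_late = current_wm is not None and ts <= current_wm
        max_seen = ts if max_seen is None else max(max_seen, ts)
        out.append((ts, max_seen - max_delay, is_late))
    return out


events = [10, 12, 11, 20, 14, 16]
rows = assign_watermarks(events, max_delay=5)
for row in rows:
    print(row)
```

In this run the event with timestamp 11 arrives out of order but is still accepted, while the event with timestamp 14 arrives after the watermark has already advanced to 15 and is flagged as late.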

Flink 1.11

Introduced unaligned checkpoints, a new unified Source API, CDC support in Table & SQL, JDBC catalog for relational databases, enhanced Hive real‑time warehousing, application‑mode deployment, and major Python improvements including vectorized Pandas UDFs and Cython‑accelerated UDFs.

Flink 1.12

Added a batch execution mode to the unified DataStream API, Kubernetes‑based high availability without ZooKeeper, the upsert‑kafka connector, metadata columns in SQL, temporal table joins via FOR SYSTEM_TIME AS OF, a new Kinesis connector, an experimental sort‑merge shuffle, and extensive Table API/SQL type‑system and optimizer upgrades.
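The upsert semantics behind an upsert-style Kafka source can be sketched without Flink: records are interpreted per key, a non-null value inserts or updates the row for that key, and a null value (a tombstone) deletes it. The helper name and sample records below are hypothetical and serve only to illustrate that folding rule.

```python
def materialize_upserts(records):
    """Fold a keyed changelog into a table: last value per key wins, None deletes."""
    table = {}
    for key, value in records:
        if value is None:
            table.pop(key, None)  # tombstone: delete the key if present
        else:
            table[key] = value    # insert a new key or update an existing one
    return table


changelog = [
    ("u1", {"clicks": 1}),
    ("u2", {"clicks": 3}),
    ("u1", {"clicks": 2}),  # update for u1
    ("u2", None),           # tombstone for u2
]
table = materialize_upserts(changelog)
print(table)
```

Only `u1` survives, with its most recent value, which is why a topic read this way behaves like a continuously updated table rather than an append-only log.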

Command‑line and code examples

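Submit a job to a native Kubernetes session cluster (Flink 1.10), where <ClusterId> is a placeholder for the ID of your session cluster:

./bin/flink run -d -e kubernetes-session -Dkubernetes.cluster-id=<ClusterId> examples/streaming/WindowJoin.jar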

Stop a job with a savepoint (Flink 1.9); the target savepoint directory is optional:

bin/flink stop -p [:targetSavepointDirectory] :jobId

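Run a DataStream program in batch execution mode (Flink 1.12):

import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setRuntimeMode(RuntimeExecutionMode.BATCH);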

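Define and run a simple job with the Python DataStream API (Flink 1.12):

from pyflink.datastream import StreamExecutionEnvironment, MapFunction


class MyMapFunction(MapFunction):
    # Increment every element by one.
    def map(self, value):
        return value + 1


env = StreamExecutionEnvironment.get_execution_environment()
# Build the pipeline: source collection -> map -> print sink.
env.from_collection([1, 2, 3, 4, 5]).map(MyMapFunction()).print()
# Nothing runs until execute() is called.
env.execute("datastream job")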


Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
