Apache Flink Release History and Key Features from 1.7 to 1.12
This article provides a comprehensive overview of Apache Flink's major releases from version 1.7 through 1.12, detailing new functionalities such as Scala 2.12 support, state schema evolution, Blink planner integration, Kubernetes native deployment, Python (PyFlink) enhancements, and numerous performance and stability improvements for stream and batch processing.
Flink 1.7
Introduced full Scala 2.12 support, exactly‑once semantics for the S3 StreamingFileSink, state schema evolution, MATCH_RECOGNIZE in streaming SQL, temporal tables and temporal joins, a versioned REST API, a Kafka 2.0 connector, and local recovery to speed up failover.
Flink 1.8
Finalized the state schema evolution story and added TTL‑based continuous cleanup of expired state, user‑defined functions and aggregations in SQL pattern detection (MATCH_RECOGNIZE), an RFC‑compliant CSV format, a new KafkaDeserializationSchema that exposes the Kafka ConsumerRecord, watermarking options for the Kinesis consumer, DynamoDB Streams support, and global aggregates for subtask coordination.
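To make the TTL‑based cleanup more concrete, the following is a minimal pure‑Python sketch of the general idea. It is a toy model, not Flink's implementation or API: each entry remembers when it was last written, readers never see expired entries, and a separate cleanup pass physically removes them. The class TtlState, its methods, and the explicit now parameter are hypothetical names chosen for illustration.

```python
class TtlState:
    """Toy keyed state with time-to-live; a model of the idea, not Flink code."""

    def __init__(self, ttl):
        self.ttl = ttl        # entries expire ttl time units after the last write
        self._entries = {}    # key -> (value, last_update_time)

    def _expired(self, updated, now):
        return now - updated >= self.ttl

    def put(self, key, value, now):
        self._entries[key] = (value, now)

    def get(self, key, now):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, updated = entry
        # Expired entries are hidden from readers even before cleanup runs.
        return None if self._expired(updated, now) else value

    def cleanup(self, now):
        """Remove every expired entry and return how many were removed."""
        expired = [k for k, (_, updated) in self._entries.items()
                   if self._expired(updated, now)]
        for key in expired:
            del self._entries[key]
        return len(expired)

    def size(self):
        return len(self._entries)


state = TtlState(ttl=10)
state.put("a", 1, now=0)
state.put("b", 2, now=5)
print(state.get("a", now=9))     # "a" is still live at time 9
print(state.get("a", now=12))    # "a" has expired and is hidden
print(state.cleanup(now=12))     # number of entries physically removed
```

The point of the sketch is the separation between visibility (get) and reclamation (cleanup): without a continuous cleanup pass, expired entries that are never read again would keep occupying state.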
Flink 1.9
Released in August 2019 with batch‑oriented fine‑grained recovery, a Blink‑based query engine for Table API/SQL (preview), State Processor API for flexible savepoint manipulation, stop‑with‑savepoint semantics, a redesigned Web UI, and preview Hive integration.
Flink 1.10
Marked as the largest release to date, bringing extensive performance and stability optimizations, native Kubernetes integration (beta), PyFlink enhancements, full Blink integration, production‑ready Hive support, expanded SQL DDL (watermarks, temporal tables), and a pluggable module system for built‑in functions.
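A watermark declared in DDL, for example WATERMARK FOR ts AS ts - INTERVAL '5' SECOND, states how far event time has progressed: the largest timestamp seen so far minus a fixed allowance for out‑of‑order rows. The sketch below models only that arithmetic in plain Python, under the assumption of integer timestamps; it ignores periodic emission and parallelism, and the function name assign_watermarks and the sample data are hypothetical.

```python
def assign_watermarks(timestamps, max_delay):
    """Return (timestamp, watermark_after_event, is_late) for each event.

    The watermark trails the largest timestamp seen so far by max_delay.
    An event counts as late when its timestamp is not above the watermark
    that was in effect before the event arrived.
    """
    results = []
    max_seen = None
    for ts in timestamps:
        watermark_before = None if max_seen is None else max_seen - max_delay
        is_late = watermark_before is not None and ts <= watermark_before
        max_seen = ts if max_seen is None else max(max_seen, ts)
        results.append((ts, max_seen - max_delay, is_late))
    return results


events = [10, 12, 11, 20, 14, 21]   # made-up event times, slightly out of order
for ts, watermark, late in assign_watermarks(events, max_delay=5):
    print(ts, watermark, late)
```

With a delay of 5, the event at time 11 is tolerated because the watermark is only 7 when it arrives, while the event at time 14 arrives after the watermark has reached 15 and is flagged as late.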
Flink 1.11
Introduced unaligned checkpoints, a new unified Source API, CDC support in Table & SQL, JDBC catalog for relational databases, enhanced Hive real‑time warehousing, application‑mode deployment, and major Python improvements including vectorized Pandas UDFs and Cython‑accelerated UDFs.
Flink 1.12
Added a batch execution mode to the unified DataStream API, Kubernetes‑based high availability without ZooKeeper, the upsert‑kafka connector, metadata columns in SQL, temporal table joins via FOR SYSTEM_TIME AS OF, a new Kinesis connector, a sort‑merge shuffle (experimental), and extensive Table API/SQL type‑system and optimizer upgrades.
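A temporal table join with FOR SYSTEM_TIME AS OF pairs each row with the version of the other table that was valid at that row's time. The following plain‑Python sketch models those semantics only; it is not Flink code, and the names rate_versions, value_as_of, temporal_join, and all sample data are hypothetical.

```python
from bisect import bisect_right

# Version history per key: list of (valid_from, value), sorted by valid_from.
# Currencies, timestamps, and rates are made-up sample data.
rate_versions = {
    "EUR": [(0, 1.10), (10, 1.15), (20, 1.20)],
    "GBP": [(5, 1.30)],
}


def value_as_of(versions, key, time):
    """Return the value of key that was valid at time, or None if none was."""
    history = versions.get(key, [])
    valid_from = [start for start, _ in history]
    # Index of the last version whose valid_from is <= time.
    index = bisect_right(valid_from, time) - 1
    return history[index][1] if index >= 0 else None


def temporal_join(orders, versions):
    """Join each (time, currency, amount) order with the rate valid at its time."""
    joined = []
    for time, currency, amount in orders:
        rate = value_as_of(versions, currency, time)
        if rate is not None:  # inner-join behaviour: drop unmatched orders
            joined.append((time, currency, amount, rate))
    return joined


orders = [(3, "EUR", 100), (12, "EUR", 50), (4, "GBP", 70), (25, "EUR", 10)]
for row in temporal_join(orders, rate_versions):
    print(row)
```

The order at time 12 is matched with the EUR rate that became valid at time 10, not the latest one, and the GBP order at time 4 is dropped because no GBP version existed yet. That is the difference from a regular join against the current table contents.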
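Command‑line and code examples
Submit a job to a native Kubernetes session cluster (Flink 1.10), where <ClusterId> is a placeholder for the session's cluster ID:
    ./bin/flink run -d -e kubernetes-session -Dkubernetes.cluster-id=<ClusterId> examples/streaming/WindowJoin.jar
Stop a job with a savepoint (Flink 1.9):
    bin/flink stop -p [:targetSavepointDirectory] :jobId
Switch a DataStream program to batch execution mode (Flink 1.12, Java):
    import org.apache.flink.api.common.RuntimeExecutionMode;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    // Run the pipeline with batch semantics instead of the default streaming mode.
    env.setRuntimeMode(RuntimeExecutionMode.BATCH);
Run a DataStream job with a user-defined map function in PyFlink (Flink 1.12):
    from pyflink.datastream import StreamExecutionEnvironment, MapFunction

    class MyMapFunction(MapFunction):
        # Add 1 to every element of the stream.
        def map(self, value):
            return value + 1

    env = StreamExecutionEnvironment.get_execution_environment()
    env.from_collection([1, 2, 3, 4, 5]).map(MyMapFunction()).print()
    env.execute("datastream job")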
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
