What’s New in Apache Flink 1.17? Key Features, Performance Gains, and Streaming Warehouse Advances
Apache Flink 1.17 brings a set of batch and streaming enhancements: a new Streaming Warehouse API, significant TPC‑DS performance gains, adaptive batch scheduling, improved checkpointing, expanded SQL capabilities, Hive connector upgrades, and broader filesystem support. It also upgrades FRocksDB and Calcite and generalizes the delegation token framework, strengthening Flink's position as a unified data‑processing engine.
Apache Flink 1.17.0 Release
The Apache Flink Project Management Committee released Flink 1.17.0, to which 172 developers contributed, completing 7 FLIPs and more than 600 issues. The release focuses on a streaming‑warehouse model, adding batch‑mode row‑level updates, performance‑oriented optimizer changes, and new runtime features.
Batch Processing Enhancements
Streaming Warehouse API (FLIP‑282): Introduces DELETE and UPDATE statements for batch tables, enabling row‑level modifications in external stores (e.g., Flink Table Store). The ALTER TABLE syntax is extended to support adding, modifying, and dropping columns, primary keys, and watermarks.
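As a rough illustration of the statements described above, the sketch below collects a few batch‑mode DML and DDL statements as Python strings and hands them to any callable that executes SQL. The table `orders`, its columns, and the helper `run_statements` are hypothetical. With PyFlink, one might pass the `execute_sql` method of a batch‑mode `TableEnvironment` as the executor, and the target connector must itself support row‑level changes.

```python
from typing import Callable, Iterable, List

# Hypothetical table and column names, for illustration only.
ROW_LEVEL_STATEMENTS: List[str] = [
    "UPDATE orders SET status = 'CANCELLED' WHERE order_id = 42",
    "DELETE FROM orders WHERE status = 'CANCELLED'",
    "ALTER TABLE orders ADD note STRING",
    "ALTER TABLE orders MODIFY note STRING COMMENT 'free-form remark'",
    "ALTER TABLE orders DROP note",
]


def run_statements(execute_sql: Callable[[str], object],
                   statements: Iterable[str]) -> int:
    """Send each statement to the executor in order; return how many ran."""
    count = 0
    for stmt in statements:
        execute_sql(stmt)
        count += 1
    return count


if __name__ == "__main__":
    # Dry run: print the statements instead of executing them on a cluster.
    run_statements(print, ROW_LEVEL_STATEMENTS)
```

Keeping the executor as a parameter lets the same list be dry‑run with `print` or sent to a real session without changing the statements.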
Performance Optimizations: A new join‑reorder algorithm, adaptive local hash aggregation, Hive aggregation improvements, and hybrid shuffle mode together deliver up to a 26 % TPC‑DS speedup on a 10 TB dataset compared with Flink 1.16. Adaptive batch scheduling is now enabled by default, automatically deriving the parallelism of each job vertex from the volume of data it has to process.
Hybrid Shuffle Mode: Reuses intermediate data, works with the adaptive batch scheduler and speculative execution, and improves stability for large‑scale production workloads.
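The idea of deriving parallelism from data volume can be sketched as follows. This is a deliberately simplified model, not the scheduler's actual algorithm: the function name, the parameter names, and the bounds used here are assumptions for illustration, and the real scheduler takes more factors into account.

```python
import math


def derive_parallelism(input_bytes: int,
                       avg_bytes_per_task: int,
                       min_parallelism: int = 1,
                       max_parallelism: int = 128) -> int:
    """Pick enough subtasks that each handles roughly avg_bytes_per_task.

    All names and default bounds are hypothetical, chosen for the example.
    """
    if avg_bytes_per_task <= 0:
        raise ValueError("avg_bytes_per_task must be positive")
    wanted = math.ceil(input_bytes / avg_bytes_per_task)
    # Clamp the result so tiny inputs still get one task and huge inputs
    # do not exceed the configured upper bound.
    return max(min_parallelism, min(max_parallelism, wanted))


if __name__ == "__main__":
    mib = 1024 * 1024
    for size in (4 * mib, 160 * mib, 10_000 * mib):
        print(size // mib, "MiB ->", derive_parallelism(size, 16 * mib))
```

With a 16 MiB target per task, a vertex consuming 160 MiB would get 10 subtasks under this model, while a very large input is capped at the upper bound.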
Streaming Processing Enhancements
Streaming SQL Semantic Fixes: Resolves nondeterministic operation issues and adds the experimental PLAN_ADVICE feature, which warns about correctness risks and suggests optimizer improvements. Sample output appears under "Example: PLAN_ADVICE Output" near the end of this article.
Checkpoint Improvements: Generic Incremental Checkpoint (GIC) reduces checkpoint duration by ~79.5 % and incremental checkpoint size by ~95 %. Checkpoints can also be triggered manually through a REST API. Unaligned Checkpoint (UC) is production‑ready, lowering checkpoint latency under back‑pressure.
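For manual triggering over REST, a request could be assembled roughly as below. The endpoint path `/jobs/<jobid>/checkpoints` and the `checkpointType` field reflect my reading of the Flink REST API and should be verified against the official REST documentation for your version. The helper name, host, and job id are hypothetical, and the sketch only builds the request without sending it.

```python
import json
import urllib.request


def build_checkpoint_request(rest_url: str, job_id: str,
                             checkpoint_type: str = "FULL") -> urllib.request.Request:
    """Build (but do not send) a POST request that asks for a checkpoint.

    Assumption: the JobManager exposes POST /jobs/<job_id>/checkpoints and
    accepts a JSON body with a "checkpointType" field.
    """
    body = json.dumps({"checkpointType": checkpoint_type}).encode("utf-8")
    return urllib.request.Request(
        url=f"{rest_url.rstrip('/')}/jobs/{job_id}/checkpoints",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical address and job id; nothing is sent over the network here.
    req = build_checkpoint_request("http://localhost:8081", "my-job-id")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` against a running cluster.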
Watermark Alignment (FLIP‑217): Aligns watermark emission across source splits, reducing downstream buffering and improving overall stream efficiency.
State Backend Upgrade: FRocksDB upgraded to version 6.20.3‑ververica‑2.0, adding Apple Silicon support, shared memory between TaskManager slots, a new periodic_compaction_seconds option, and performance gains by avoiding expensive toString() calls in compaction filters.
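The core decision behind split‑level alignment can be sketched as: find the slowest split, then pause any split whose watermark has run too far ahead of it. The function below is a minimal model under that assumption. The names `splits_to_pause` and `max_allowed_drift_ms` are hypothetical, and the real implementation coordinates this across sources and subtasks rather than in a single function.

```python
from typing import Dict, List


def splits_to_pause(split_watermarks: Dict[str, int],
                    max_allowed_drift_ms: int) -> List[str]:
    """Return the splits that are too far ahead of the slowest split."""
    if not split_watermarks:
        return []
    slowest = min(split_watermarks.values())
    limit = slowest + max_allowed_drift_ms
    # Splits beyond the limit stop reading until the slow ones catch up.
    return sorted(s for s, wm in split_watermarks.items() if wm > limit)


if __name__ == "__main__":
    # Hypothetical watermarks in milliseconds of event time.
    watermarks = {"split-0": 1_000, "split-1": 1_500, "split-2": 9_000}
    print(splits_to_pause(watermarks, max_allowed_drift_ms=2_000))
```

Pausing the fast split is what keeps downstream operators from buffering large amounts of data while they wait for the slowest input.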
Speculative Execution for Sinks
Sink operators now support speculative execution. Built‑in sinks (DiscardingSink, PrintSink, FileSink, HiveTableSink, etc.) can obtain the attempt number of the current sub‑task and isolate output data from concurrent attempts. The slow‑task detector also considers input data volume, mitigating data‑skew effects.
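To show why the attempt number matters, the sketch below gives every attempt of a sub‑task its own output location and keeps only the output of the attempt that finished. The path layout and both helper names are hypothetical and are not how any built‑in sink names its files.

```python
from typing import Dict, Tuple


def attempt_path(base_dir: str, subtask_index: int, attempt_number: int) -> str:
    """Give each attempt of a sub-task its own output location."""
    return f"{base_dir}/subtask-{subtask_index}/attempt-{attempt_number}/part.data"


def keep_finished(outputs: Dict[Tuple[int, int], str],
                  finished: Dict[int, int]) -> Dict[int, str]:
    """Keep, per sub-task, only the output of the attempt that finished.

    outputs maps (subtask_index, attempt_number) to a path;
    finished maps subtask_index to the winning attempt_number.
    """
    return {subtask: outputs[(subtask, attempt)]
            for subtask, attempt in finished.items()}


if __name__ == "__main__":
    # Sub-task 0 was slow, so a second (speculative) attempt ran concurrently.
    outputs = {(0, a): attempt_path("/tmp/out", 0, a) for a in (0, 1)}
    outputs[(1, 0)] = attempt_path("/tmp/out", 1, 0)
    print(keep_finished(outputs, finished={0: 1, 1: 0}))
```

Because the two attempts of sub‑task 0 write to different paths, the losing attempt's partial output can be discarded without touching the winner's data.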
SQL Client / Gateway
A new gateway mode allows users to submit SQL statements to a remote SQL Gateway and manage job lifecycles (list, stop) via SQL, providing functionality comparable to the Flink CLI.
Hive Connector Improvements
Automatic file merging is now available in batch mode, reducing the number of small files.
Native Hive aggregation functions (SUM, COUNT, AVG, MIN, MAX) are executed on hash‑based aggregation operators for better performance.
Streaming FileSink Extension
The FileSink now supports five filesystems: HDFS, S3, OSS, ABFS and local, broadening storage options for streaming jobs.
Calcite Upgrade
Calcite upgraded to 1.29.0, fixing bugs (CALCITE‑4325, CALCITE‑4352) and improving SQL optimizer performance.
Other Notable Changes
PyFlink now runs on Python 3.10 and Apple Silicon, with improved cross‑process communication and UDF type handling.
Task‑level flame graphs provide detailed performance visualisation per sub‑task.
Generalized delegation token framework (FLIP‑272) and Kerberos token improvements (FLIP‑211) extend authentication support beyond Hadoop.
Upgrade Guidance
When migrating to Flink 1.17, adjust configuration parameters such as state.backend.rocksdb.memory.fixed-per-tm to control shared memory allocation. Refer to the official release notes for a complete list of required changes.
Example: PLAN_ADVICE Output
== Optimized Physical Plan With Advice ==
...advice[1]: [WARNING] The column(s): day(generated by non-deterministic function: CURRENT_TIMESTAMP) cannot satisfy the determinism requirement for correctly processing update message('UB'/'UA'/'D' in changelogMode, not 'I' only)...
References
FLIP‑282: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=235838061
FLIP‑217: https://cwiki.apache.org/confluence/display/FLINK/FLIP-217%3A+Support+watermark+alignment+of+source+splits
FRocksDB repository: https://github.com/ververica/frocksdb
Sink interface (new API): https://github.com/apache/flink/blob/release-1.17/flink-core/src/main/java/org/apache/flink/api/connector/sink2/Sink.java
OutputFormat sink example: https://github.com/apache/flink/blob/release-1.17/flink-core/src/main/java/org/apache/flink/api/common/io/OutputFormat.java
Hybrid Shuffle documentation: https://nightlies.apache.org/flink/flink-docs-release-1.17/zh/docs/ops/batch/batch_shuffle/#hybrid-shuffle
HiveModule functions: https://nightlies.apache.org/flink/flink-docs-release-1.17/zh/docs/connectors/table/hive/hive_functions/
PLAN_ADVICE documentation: https://nightlies.apache.org/flink/flink-docs-release-1.17/zh/docs/dev/table/sql/explain/#explaindetails
CALCITE‑4325 issue: https://issues.apache.org/jira/browse/CALCITE-4325
CALCITE‑4352 issue: https://issues.apache.org/jira/browse/CALCITE-4352
FLINK‑29849, FLINK‑30006, FLINK‑30841 (streaming SQL plan optimization fixes): https://issues.apache.org/jira/browse/FLINK-29849, https://issues.apache.org/jira/browse/FLINK-30006, https://issues.apache.org/jira/browse/FLINK-30841
FLINK‑30836 (RocksDBStateBackend memory config): https://issues.apache.org/jira/browse/FLINK-30836
Release notes: https://nightlies.apache.org/flink/flink-docs-release-1.17/release-notes/flink-1.17/
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
