
How Alibaba’s Blink Supercharges Flink for Massive Stream and Batch Processing

Alibaba’s Blink, an internal enhancement of Apache Flink, is now open‑sourced, bringing advanced runtime, SQL/TableAPI, Hive compatibility, Zeppelin integration, and a revamped Flink Web UI to dramatically boost performance and scalability for both streaming and batch workloads.

Alibaba Cloud Developer

Blink Overview

Alibaba announced that Blink, its internal Flink variant, would be open‑sourced at the end of January 2019, fulfilling the promise made at the Flink Forward China summit.

Blink has been used internally since 2015 to address Alibaba’s massive scale and stability challenges, evolving into a robust platform for both stream and batch processing, handling billions of messages per second and petabytes of data.

Background of Open‑sourcing Blink

Alibaba has contributed many Flink improvements back to the community since 2016, but Blink evolved faster than those changes could be upstreamed, and a gap opened between the two codebases. To accelerate community adoption, Alibaba decided to open‑source the full Blink codebase rather than continue with incremental contributions alone.

Open‑source Approach

Blink will not become a separate project; it will be contributed as a branch of Apache Flink and merged into the main codebase from there. The community has agreed to integrate Blink via FLIP‑32, aiming for a quick merge so that future features such as machine learning can build on it.

Main Features and Optimizations

The open‑sourced Blink builds on Flink 1.5.1 and adds numerous new capabilities and performance/stability enhancements, including a high‑performance batch SQL engine, interactive programming support, tighter Zeppelin integration, and an improved Flink Web UI.

Runtime

Blink introduces a pluggable shuffle architecture, new scheduling options, operator chaining, zero‑copy pipeline shuffle, and broadcast shuffle optimizations. It also adds a JobManager (JM) failover mechanism and native Kubernetes support that allocates pods dynamically.
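
To make the idea of a pluggable shuffle concrete, here is a minimal conceptual sketch in Python. It is not Blink's implementation or API: the `ShuffleService` interface, both implementations, and `shuffle_by_key` are hypothetical names invented for illustration. The point it shows is that the stage logic stays the same while the way records move between stages is swapped out.

```python
import json
import os
import tempfile
import zlib
from abc import ABC, abstractmethod
from collections import defaultdict


class ShuffleService(ABC):
    """Hypothetical plug-in interface: how records travel between two stages."""

    @abstractmethod
    def write(self, partition, record):
        ...

    @abstractmethod
    def read(self, partition):
        ...


class InMemoryShuffle(ShuffleService):
    """Keeps shuffled records in memory, loosely like a pipelined shuffle."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def write(self, partition, record):
        self._partitions[partition].append(record)

    def read(self, partition):
        return list(self._partitions[partition])


class FileShuffle(ShuffleService):
    """Persists shuffled records to disk, loosely like a blocking batch shuffle."""

    def __init__(self, directory):
        self._directory = directory

    def _path(self, partition):
        return os.path.join(self._directory, f"partition-{partition}.jsonl")

    def write(self, partition, record):
        with open(self._path(partition), "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def read(self, partition):
        path = self._path(partition)
        if not os.path.exists(path):
            return []
        with open(path, encoding="utf-8") as f:
            return [tuple(json.loads(line)) for line in f]


def shuffle_by_key(records, num_partitions, service):
    """Route each (key, value) record to a partition chosen by a stable hash."""
    for key, value in records:
        partition = zlib.crc32(key.encode("utf-8")) % num_partitions
        service.write(partition, (key, value))
    return [service.read(p) for p in range(num_partitions)]


if __name__ == "__main__":
    records = [("user-1", 10), ("user-2", 20), ("user-1", 30)]
    in_memory = shuffle_by_key(records, 2, InMemoryShuffle())
    with tempfile.TemporaryDirectory() as tmp:
        on_disk = shuffle_by_key(records, 2, FileShuffle(tmp))
    # Both shuffle back ends should produce the same partitioning.
    print(in_memory == on_disk)
```

Because `shuffle_by_key` only depends on the interface, a streaming job could favor the in-memory variant for latency while a batch job could favor the on-disk variant for recoverability, without either job changing its own logic.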

SQL/TableAPI

The SQL engine has been refactored with a new Query Processor comprising an optimizer and executor, enabling unified execution for stream and batch workloads. BinaryRow data structures and extensive code‑generation reduce serialization overhead and boost performance. New features include mini‑batch execution, cache support, and full SQL DDL/DML capabilities.
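
The benefit of mini‑batch execution can be illustrated with a small conceptual sketch. This is not Blink's code: `CountingState`, `per_record_sum`, and `mini_batch_sum` are hypothetical names, and the state object is a plain dictionary standing in for a keyed state backend. The sketch assumes the goal is a per‑key running sum and shows how buffering a few records before touching state reduces the number of state accesses.

```python
from collections import defaultdict


class CountingState:
    """Stand-in for a keyed state backend that counts how often it is accessed."""

    def __init__(self):
        self.values = defaultdict(int)
        self.accesses = 0

    def add(self, key, delta):
        self.accesses += 1
        self.values[key] += delta


def per_record_sum(records, state):
    """Baseline: one state access for every incoming record."""
    for key, value in records:
        state.add(key, value)


def mini_batch_sum(records, state, batch_size):
    """Buffer up to batch_size records, pre-aggregate them per key in memory,
    then touch state once per distinct key in the batch."""
    buffer = defaultdict(int)
    buffered = 0

    def flush():
        for key, partial in buffer.items():
            state.add(key, partial)
        buffer.clear()

    for key, value in records:
        buffer[key] += value
        buffered += 1
        if buffered >= batch_size:
            flush()
            buffered = 0
    flush()  # emit whatever is left in the final, possibly partial, batch


if __name__ == "__main__":
    records = [("a", 1), ("a", 2), ("b", 3), ("a", 4), ("a", 5), ("b", 6)]
    baseline, batched = CountingState(), CountingState()
    per_record_sum(records, baseline)
    mini_batch_sum(records, batched, batch_size=3)
    print(baseline.accesses, batched.accesses)
```

Both variants end with the same sums; the batched one simply reaches them with fewer state accesses. The trade‑off is latency: results are only updated when a batch is flushed, so the batch size bounds how stale an output can be.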

Hive Compatibility

Blink adds an in‑memory catalog and a HiveCatalog that bridge Flink to Hive metastore, allowing Flink SQL to read Hive metadata and data, facilitating seamless switching between Hive and Flink engines.
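
The catalog idea can be sketched as follows. This is a conceptual illustration only, not the Flink or Blink catalog API: `Catalog`, `InMemoryCatalog`, `MetastoreBackedCatalog`, and `CatalogManager` are hypothetical names, and the "metastore" is a plain lookup function standing in for a real Hive metastore client. It shows how one engine can resolve tables from several metadata sources through a single interface.

```python
from abc import ABC, abstractmethod


class Catalog(ABC):
    """Hypothetical catalog interface: maps a table name to its schema."""

    @abstractmethod
    def get_table(self, name):
        ...


class InMemoryCatalog(Catalog):
    """Holds table metadata for the lifetime of the session only."""

    def __init__(self):
        self._tables = {}

    def create_table(self, name, schema):
        self._tables[name] = dict(schema)

    def get_table(self, name):
        return self._tables[name]


class MetastoreBackedCatalog(Catalog):
    """Delegates lookups to an external metastore (a plain callable here)."""

    def __init__(self, lookup):
        self._lookup = lookup

    def get_table(self, name):
        schema = self._lookup(name)
        if schema is None:
            raise KeyError(name)
        return schema


class CatalogManager:
    """Resolves 'catalog.table' names against the registered catalogs."""

    def __init__(self):
        self._catalogs = {}

    def register(self, catalog_name, catalog):
        self._catalogs[catalog_name] = catalog

    def resolve(self, qualified_name):
        catalog_name, table_name = qualified_name.split(".", 1)
        return self._catalogs[catalog_name].get_table(table_name)


if __name__ == "__main__":
    # A dictionary plays the role of the external metastore in this sketch.
    fake_metastore = {"clicks": {"user_id": "BIGINT", "url": "STRING"}}

    memory = InMemoryCatalog()
    memory.create_table("orders", {"id": "BIGINT", "amount": "DOUBLE"})

    manager = CatalogManager()
    manager.register("mem", memory)
    manager.register("hive", MetastoreBackedCatalog(fake_metastore.get))

    print(manager.resolve("mem.orders"))
    print(manager.resolve("hive.clicks"))
```

Keeping metadata behind one interface is what lets a query refer to tables defined elsewhere without redefining them, which is the property that makes switching engines over the same data practical.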

Zeppelin for Flink

Enhanced Zeppelin integration provides better support for Flink, including savepoint handling, interactive SQL/TableAPI queries, and built‑in tutorials covering streaming ETL as well as batch and streaming examples.

Flink Web

The Flink Web UI has been overhauled with Angular 7, offering detailed resource, task, and job metrics, improved log access, and a more responsive interface, doubling performance for large‑scale deployments.

Future Plans

Alibaba will continue merging Blink’s innovations into Flink, focusing next on machine‑learning support, multi‑language APIs (Python, Go), cluster management, notebooks, and broader ecosystem contributions.
