
How ByteDance Optimized Flink SQL for Real‑World Streaming at Scale

This article details ByteDance's practical experience with Apache Flink, covering SQL extensions, a visual SQL platform, performance tweaks such as window mini‑batching and custom windows, join and checkpoint recovery improvements, stream‑batch integration experiments, and future roadmap plans.

dbaplus Community

Overall Introduction

ByteDance began using Apache Flink after Blink was open‑sourced in December 2018. By October 2019, a streaming SQL platform based on Flink 1.9 and the Blink planner was deployed internally, revealing several interesting use cases and bugs.

1. Flink 1.9 SQL Extensions

Because Flink 1.9 lacked DDL support, ByteDance added extensions for the following statements:

CREATE TABLE

CREATE VIEW

CREATE FUNCTION

ADD RESOURCE

The extensions also enabled DDL‑based watermark definitions.

To address feedback that “SQL cannot express complex business logic,” a set of RPC‑based dimension tables and sinks was introduced, allowing SQL jobs to read from and write to RPC services, FaaS, Redis, Abase, Bytable, and ByteSQL.

Additional internal connectors were built:

Source: RocketMQ

Sink: RocketMQ, ClickHouse, Doris, LogHouse, Redis, Abase, Bytable, ByteSQL, RPC, Print, Metrics

Corresponding data formats (PB, Binlog, Bytes) were also implemented.

2. Visual SQL Platform

A web‑based SQL platform was launched, offering SQL editing, parsing, debugging, custom UDF/Connector development, version control, and job management.

Practice Optimizations

1. Window Performance Optimizations

Mini‑Batch for Windows: Batching records before accessing state reduced CPU usage by 20‑30% for jobs using the RocksDB state backend.

Custom Window Types: Added user‑defined windows to support scenarios such as hourly UV/GMV calculations for live streaming, which the standard tumbling, sliding, and session windows could not satisfy.

-- my_window is a custom window that partitions time in a business-specific way
SELECT
  room_id,
  COUNT(DISTINCT user_id)
FROM MySource
GROUP BY
  room_id,
  my_window(ts, INTERVAL '1' HOUR)
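
Returning to the Mini‑Batch for Windows item above, the following is a minimal Python sketch of the idea, not ByteDance's operator code. It assumes a simple sum aggregation and uses a hypothetical `CountingState` class as a stand-in for a keyed state backend, so that the number of state accesses can be compared with and without batching. With RocksDB, each avoided access also avoids a serialization round trip, which is where CPU savings would plausibly come from.

```python
from collections import defaultdict


class CountingState:
    """Hypothetical stand-in for a keyed state backend; counts reads and writes."""

    def __init__(self):
        self.data = {}
        self.reads = 0
        self.writes = 0

    def get(self, key):
        self.reads += 1
        return self.data.get(key, 0)

    def put(self, key, value):
        self.writes += 1
        self.data[key] = value


def aggregate_per_record(records, state):
    """Baseline: one state read and one state write for every record."""
    for key, window_start, value in records:
        k = (key, window_start)
        state.put(k, state.get(k) + value)


def aggregate_mini_batch(records, state, batch_size):
    """Mini-batch: pre-aggregate in memory, then touch state once per key per batch."""
    buffer = defaultdict(int)
    buffered = 0

    def flush():
        for k, partial in buffer.items():
            state.put(k, state.get(k) + partial)
        buffer.clear()

    for key, window_start, value in records:
        buffer[(key, window_start)] += value
        buffered += 1
        if buffered >= batch_size:
            flush()
            buffered = 0
    flush()  # flush the tail; a real operator would also flush on a timer or watermark


# 300 records spread over 3 rooms, all in the window starting at 0.
records = [("room_%d" % (i % 3), 0, 1) for i in range(300)]
baseline, batched = CountingState(), CountingState()
aggregate_per_record(records, baseline)
aggregate_mini_batch(records, batched, batch_size=100)
print(baseline.reads, batched.reads)  # 300 9
```

Both variants end with identical state contents; only the number of state accesses differs.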

2. Window Offset Support

Implemented an offset parameter for windows so that a natural‑week window can start on Monday. Without an offset, seven‑day windows are aligned to the Unix epoch (1970‑01‑01), which fell on a Thursday.

SELECT
  room_id,
  COUNT(DISTINCT user_id)
FROM MySource
GROUP BY
  room_id,
  TUMBLE(ts, INTERVAL '7' DAY, INTERVAL '3' DAY)
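
The alignment arithmetic behind a window offset can be checked with a few lines of Python. This sketch assumes the convention that windows are aligned to "epoch plus offset" in UTC, which matches how open-source Flink computes tumbling-window starts. The sign and direction of the offset in the internal extension are not stated in this article, so the helper names and the offsets tried below are illustrative only.

```python
from datetime import datetime, timedelta, timezone

DAY_MS = 24 * 60 * 60 * 1000
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def window_start(ts_ms, size_ms, offset_ms=0):
    """Start of the tumbling window containing ts_ms, aligned to epoch + offset."""
    # Python's % is non-negative for a positive modulus, so negative offsets work too.
    return ts_ms - (ts_ms - offset_ms) % size_ms


def weekday_of(ms):
    """UTC weekday name for a millisecond timestamp."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return WEEKDAYS[(epoch + timedelta(milliseconds=ms)).weekday()]


# 2020-03-18 12:00 UTC is a Wednesday.
ts = int(datetime(2020, 3, 18, 12, 0, tzinfo=timezone.utc).timestamp() * 1000)
for offset_days in (0, 4, -3):
    start = window_start(ts, 7 * DAY_MS, offset_days * DAY_MS)
    print(offset_days, weekday_of(start))
```

Under this convention, an offset of 0 yields weeks that start on Thursday, while an offset of 4 days (or, equivalently, -3 days) yields weeks that start on Monday.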

3. Dimension Table Optimizations

Delayed Join: When a dimension‑table join finds no match, the record is cached locally and retried up to a configurable number of attempts.

Dimension Table Keyby: A configuration option hash‑partitions the input stream by join key before the dimension‑table join, so each subtask caches only its own share of keys, which reduces cache pressure and improves hit rates.
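
The delayed join can be sketched as a retry queue. The code below is an illustration under assumptions, not the internal implementation: `delayed_join`, `make_late_lookup`, and the `max_attempts` parameter are hypothetical names, each pass through the queue stands in for one retry interval, and returning exhausted records separately is a choice made here because the article does not say what happens to them.

```python
from collections import deque


def delayed_join(records, lookup, max_attempts=3):
    """Join records against lookup(key); retry misses up to max_attempts times.

    Returns (joined, unmatched).
    """
    joined, unmatched = [], []
    pending = deque((rec, 1) for rec in records)
    while pending:
        rec, attempt = pending.popleft()
        dim = lookup(rec["key"])
        if dim is not None:
            joined.append({**rec, **dim})
        elif attempt < max_attempts:
            pending.append((rec, attempt + 1))  # cache locally and try again later
        else:
            unmatched.append(rec)  # attempts exhausted
    return joined, unmatched


def make_late_lookup(table, visible_after):
    """Hypothetical dimension lookup: key k returns None for its first visible_after[k] calls."""
    calls = {}

    def lookup(key):
        calls[key] = calls.get(key, 0) + 1
        if calls[key] <= visible_after.get(key, 0):
            return None  # dimension row has not arrived yet
        return table.get(key)

    return lookup


table = {"u1": {"name": "alice"}, "u2": {"name": "bob"}}
lookup = make_late_lookup(table, visible_after={"u2": 2})  # u2 shows up on the 3rd lookup
records = [{"key": "u1", "v": 1}, {"key": "u2", "v": 2}, {"key": "u3", "v": 3}]
joined, unmatched = delayed_join(records, lookup, max_attempts=3)
print([r["key"] for r in joined], [r["key"] for r in unmatched])
```

In this run `u1` joins immediately, `u2` joins on its third attempt, and `u3` is never found and ends up in the unmatched list.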

Planned enhancements include broadcast dimension tables and Mini‑Batch support for high‑QPS dimension joins.
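
To illustrate why partitioning by join key can raise the cache hit rate, here is a small simulation. It is a sketch under stated assumptions: a hypothetical per-subtask LRU cache, integer keys routed with a simple modulo in place of a real hash, and a seeded random router standing in for an un-keyed distribution of records across subtasks.

```python
import random
from collections import OrderedDict


class LruCache:
    """Per-subtask dimension cache with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)
            self.hits += 1
        else:
            self.misses += 1  # a miss would trigger a remote dimension lookup
            self.entries[key] = True
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict the least recently used key


def run(keys, parallelism, capacity, route):
    """Send each key to the subtask chosen by route(key); return total cache hits."""
    caches = [LruCache(capacity) for _ in range(parallelism)]
    for key in keys:
        caches[route(key)].lookup(key)
    return sum(c.hits for c in caches)


keys = [i % 100 for i in range(1000)]  # 100 distinct keys, cycled 10 times
rng = random.Random(0)
random_hits = run(keys, 4, 25, lambda k: rng.randrange(4))  # any subtask may see any key
keyed_hits = run(keys, 4, 25, lambda k: k % 4)              # each key always hits one subtask
print(random_hits, keyed_hits)
```

With keyed routing, each of the four subtasks sees only 25 distinct keys, which fit in its cache, so every lookup after the first pass is a hit (900 of 1,000). With random routing, every subtask eventually sees all 100 keys and its 25-entry cache keeps evicting entries before they are reused.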

4. Join Optimizations

The team analyzed the three Flink join types (Interval, Regular, and Temporal Table Function) and their drawbacks, then applied internal fixes to mitigate out‑of‑order data, retract amplification, and watermark stagnation.

5. Checkpoint Recovery Enhancements

The team identified two main reasons for checkpoint recovery failures: operator ID changes, and state schema mismatches after aggregation metrics change. To address the second, a migration path was implemented by making the old and new serializers aware of each other. This allows state compatibility checks, handles migration of existing state, and properly initializes aggregation results that are null after recovery.
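
The migration path can be pictured with a small sketch. This is an assumption-laden illustration rather than the internal serializer code: accumulator rows are modeled as plain lists, the field names `cnt`, `total`, and `max_v` are hypothetical, the three result labels loosely mirror the categories of Flink's `TypeSerializerSchemaCompatibility`, and only the "metrics were added" case is treated as migratable.

```python
def check_compatibility(old_fields, new_fields):
    """Classify a schema change between the checkpointed and the new accumulator layout."""
    if list(old_fields) == list(new_fields):
        return "compatible_as_is"
    if set(old_fields) <= set(new_fields):  # metrics were only added or reordered
        return "compatible_after_migration"
    return "incompatible"  # e.g. a metric was removed; not handled in this sketch


def migrate_row(old_fields, new_fields, old_row):
    """Rewrite a restored accumulator row into the new layout; new metrics start as None."""
    old = dict(zip(old_fields, old_row))
    return [old.get(name) for name in new_fields]


def accumulate(fields, row, value):
    """Apply one input value, initializing any metric that is still None."""
    acc = dict(zip(fields, row))
    updated = {
        "cnt": (acc.get("cnt") or 0) + 1,
        "total": (acc.get("total") or 0) + value,
        # max_v only reflects values seen after the restore; history is not rebuilt.
        "max_v": value if acc.get("max_v") is None else max(acc["max_v"], value),
    }
    return [updated[name] for name in fields]


OLD = ["cnt", "total"]
NEW = ["cnt", "total", "max_v"]  # the job was changed to also compute MAX

print(check_compatibility(OLD, NEW))
restored = migrate_row(OLD, NEW, [3, 60])
print(restored)
print(accumulate(NEW, restored, 25))
```

The restored row carries the old count and sum forward, and the newly added metric starts from null and is initialized by the first record processed after recovery.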

Stream‑Batch Integration Exploration

ByteDance investigated a unified SQL layer for both streaming and batch workloads. While some session‑window and over‑window features differ between the two, they affect less than 10% of use cases, making a unified approach feasible.

The proposed architecture processes streaming data with Flink, writes results to a message queue for downstream streaming jobs, and writes batch‑ready data to Hive via Flink, eliminating the need for separate engines and duplicated UDFs.

Business Benefits

Unified SQL for stream and batch reduces development and maintenance effort.

Shared UDFs across workloads.

Consistent engine simplifies learning and architecture upkeep.

Optimizations (planner, operator) apply to both modes.

Future Work and Roadmap

1. Optimizing Retract Amplification

Converted retract pairs into single changelog entries to prevent exponential growth of downstream updates.
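
As a rough illustration of folding retract pairs, the sketch below rewrites a changelog in which an update is expressed as a retraction followed by an insertion. The tuple layout `(op, key, value)` and the op codes `"-"`, `"+"`, and `"U"` are assumptions made for this example, not the internal changelog format. The point is that a downstream operator then receives one message per update instead of two, which matters when every stage would otherwise answer each incoming message with its own pair.

```python
def fold_retract_pairs(changelog):
    """Replace each adjacent ('-', key, old), ('+', key, new) pair with ('U', key, new)."""
    out = []
    i = 0
    while i < len(changelog):
        op, key, value = changelog[i]
        nxt = changelog[i + 1] if i + 1 < len(changelog) else None
        if op == "-" and nxt is not None and nxt[0] == "+" and nxt[1] == key:
            out.append(("U", key, nxt[2]))  # single update carrying the new value
            i += 2
        else:
            out.append((op, key, value))  # plain inserts and lone retractions pass through
            i += 1
    return out


changelog = [
    ("+", "room_1", 1),
    ("-", "room_1", 1), ("+", "room_1", 2),
    ("-", "room_1", 2), ("+", "room_1", 3),
]
folded = fold_retract_pairs(changelog)
print(folded)
```

Here five messages shrink to three: the initial insert plus one update per change.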

2. Planned Feature Enhancements

Full checkpoint recovery for all aggregation metric changes.

Local‑global window support.

Fast emit for event‑time processing.

Broadcast dimension tables.

Mini‑Batch support for more operators (dimension tables, TopN, Join).

Complete Hive SQL compatibility.

3. Business Expansion Goals

Push streaming SQL coverage to 80% of use cases.

Explore productization of stream‑batch integration.

Standardize real‑time data warehousing.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
