How ByteDance Optimized Flink SQL for Real‑World Streaming at Scale
This article details ByteDance's practical experience with Apache Flink, covering SQL extensions, a visual SQL platform, performance tweaks such as window mini‑batching and custom windows, join and checkpoint recovery improvements, stream‑batch integration experiments, and future roadmap plans.
Overall Introduction
ByteDance began using Apache Flink after Blink was open‑sourced in December 2018. By October 2019, a streaming SQL platform based on Flink 1.9 and the Blink planner was deployed internally, revealing several interesting use cases and bugs.
1. Flink 1.9 SQL Extensions
Because Flink 1.9 lacked DDL support, ByteDance added extensions for the following statements:
CREATE TABLE
CREATE VIEW
CREATE FUNCTION
ADD RESOURCE
The extensions also enabled DDL‑based watermark definitions.
To address the “SQL cannot express complex business logic” feedback, a set of RPC‑based dimension tables and sinks were introduced, allowing SQL to read/write RPC services, FaaS, Redis, Abase, Bytable, and ByteSQL.
Additional internal connectors were built:
Source: RocketMQ
Sink: RocketMQ, ClickHouse, Doris, LogHouse, Redis, Abase, Bytable, ByteSQL, RPC, Print, Metrics
Corresponding data formats (PB, Binlog, Bytes) were also implemented.
2. Visual SQL Platform
A web‑based SQL platform was launched, offering SQL editing, parsing, debugging, custom UDF/Connector development, version control, and job management.
Practice Optimizations
1. Window Performance Optimizations
Mini‑Batch for Windows : By batching records before state access, CPU usage dropped by 20‑30% in RocksDB‑backed state backends.
Custom Window Types : Added user‑defined windows to support scenarios like hourly UV/GMV calculations for live streaming, which standard tumbling, sliding, and session windows could not satisfy.
-- my_window 为自定义的窗口,满足特定的划分方式
SELECT
room_id,
COUNT(DISTINCT user_id)
FROM MySource
GROUP BY
room_id,
my_window(ts, INTERVAL '1' HOURS)2. Window Offset Support
Implemented offset parameters to define natural week windows starting on Monday rather than the Unix epoch Thursday.
SELECT
room_id,
COUNT(DISTINCT user_id)
FROM MySource
GROUP BY
room_id,
TUMBLE(ts, INTERVAL '7' DAY, INTERVAL '3', DAY)3. Dimension Table Optimizations
Delayed Join : When a dimension table join fails, the record is cached locally and retried according to configurable attempts.
Dimension Table Keyby : Exposed a hash‑based key configuration to partition dimension‑table joins, reducing cache pressure and improving hit rates.
Planned enhancements include broadcast dimension tables and Mini‑Batch support for high‑QPS dimension joins.
4. Join Optimizations
Analyzed three Flink join types (Interval, Regular, Temporal Table Function) and their drawbacks, then applied internal fixes to mitigate out‑of‑order data, retract amplification, and watermark stagnation.
5. Checkpoint Recovery Enhancements
Identified two main reasons for checkpoint recovery failures: operator ID changes and state schema mismatches after aggregation metric changes. Implemented a migration path by making old and new serializers aware of each other, allowing state compatibility checks, migration handling, and proper initialization for null aggregation results.
Stream‑Batch Integration Exploration
ByteDance investigated a unified SQL layer for both streaming and batch workloads. While some session‑window and over‑window features differ between the two, they affect less than 10% of use cases, making a unified approach feasible.
The proposed architecture processes streaming data with Flink, writes results to a message queue for downstream streaming jobs, and writes batch‑ready data to Hive via Flink, eliminating the need for separate engines and duplicated UDFs.
Business Benefits
Unified SQL for stream and batch reduces development and maintenance effort.
Shared UDFs across workloads.
Consistent engine simplifies learning and architecture upkeep.
Optimizations (planner, operator) apply to both modes.
Future Work and Roadmap
1. Optimizing Retract Amplification
Converted retract pairs into single changelog entries to prevent exponential growth of downstream updates.
2. Planned Feature Enhancements
Full checkpoint recovery for all aggregation metric changes.
Local‑global window support.
Fast emit for event‑time processing.
Broadcast dimension tables.
Mini‑Batch support for more operators (dimension tables, TopN, Join).
Complete Hive SQL compatibility.
3. Business Expansion Goals
Push streaming SQL coverage to 80% of use cases.
Explore productization of stream‑batch integration.
Standardize real‑time data warehousing.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
