
Real‑time Materialized View Practices with Apache Flink: System Analysis, Algorithm Design, and Implementation

This article presents Car Home's experience building a real‑time materialized view system on Apache Flink, detailing system analysis, problem decomposition, a global‑version‑based CDC algorithm, its implementation as a Flink connector, practical deployment results, and remaining challenges such as clock dependency and state size.

HomeTech

Materialized views are widely used to pre‑compute complex SQL logic and keep the result set up‑to‑date with incremental updates, thereby avoiding repeated heavy queries. Many databases and OLAP engines support them, and streaming SQL is closely related, making Flink a natural choice for a real‑time materialized‑view solution.

This article shares the practice and exploration of Car Home (referred to as "the company") in building a real‑time materialized view system on Apache Flink, aiming to let users develop Flink Streaming SQL jobs with a batch‑processing mindset.

System Analysis and Problem Decomposition

Flink’s Table & SQL module already provides a mature SQL system, solving most of the real‑time materialized‑view challenges. The remaining core problem is how to generate a semantically complete changelog DataStream (both incremental and full‑history) for each source table without missing or duplicating data.

The problem is broken down into three sub‑problems:

Loading incremental data.

Loading full‑history data.

Integrating incremental data with full‑history data.

Problem Solving and System Implementation

Problem 1 – Incremental Data Reading via Data‑Transfer Platform

Incremental data is read from a unified Kafka topic that the data‑transfer platform writes for MySQL, SQL Server, TiDB, etc. The platform ensures at‑least‑once delivery and provides a total‑order version number for each record.
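The at-least-once guarantee plus the total-order version number is what later makes deduplication possible. As an illustration only (the field names below are assumptions, not the platform's actual schema), an incremental record and the idempotent replay check might look like:

```python
from dataclasses import dataclass, field

# Hypothetical shape of a record on the unified changelog topic.
# Field names are illustrative; the actual platform schema may differ.
@dataclass
class ChangelogRecord:
    key: str        # primary key of the source row
    op: str         # "+I" insert, "-U"/"+U" update before/after, "-D" delete
    version: int    # total-order version assigned by the transfer platform
    ts_ms: int      # commit timestamp, later used for watermarks
    payload: dict = field(default_factory=dict)  # column values

# At-least-once delivery means replays are possible; the total-order
# version lets a consumer discard duplicates idempotently.
def is_duplicate(last_seen_version: int, rec: ChangelogRecord) -> bool:
    return rec.version <= last_seen_version
```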

Problem 2 – Checkpoint‑Supported Full‑History Loading

Two versions of the bulk loader were developed. The first version used a BulkLoadSourceFunction that simply queried the entire source table. It suffered from costly restarts because any failure required re‑loading the whole dataset.

After the release of Flink CDC 2.0, the team adopted its checkpoint-aware approach and built a new BulkLoadSource. This version dramatically improved performance and fault tolerance.

Problem 3 – Global‑Version‑Based Lightweight CDC Integration Algorithm

The naive approach caches all full‑history rows and applies incremental updates based on keys, which works for small tables but becomes prohibitive for large tables (e.g., a 12 billion‑row, 120 GB SQL Server table). To overcome this, the team designed a Global Version Based Pause‑free Change‑Data‑Capture algorithm inspired by DBLog.
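To make the cost of the naive approach concrete, here is a minimal sketch of it: every snapshot row is held in a key-to-row map before changes are applied, so state grows linearly with the full table.

```python
# Naive integration: cache every full-history row, then apply incremental
# changes by primary key. The entire table ends up in state, which is
# exactly what makes this approach prohibitive for very large tables.
def naive_merge(bulk_rows, changes):
    state = {}  # primary key -> latest row value
    for key, row in bulk_rows:
        state[key] = row
    for key, op, row in changes:
        if op == "delete":
            state.pop(key, None)
        else:  # insert / update
            state[key] = row
    return state

snapshot = [("a", 1), ("b", 2)]
incr = [("b", "update", 20), ("c", "insert", 3), ("a", "delete", None)]
merged = naive_merge(snapshot, incr)  # → {"b": 20, "c": 3}
```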

The algorithm guarantees:

Complete changelog streams with fully‑specified RowKind semantics.

Controllable overhead.

High processing performance.

No reliance on systems or features external to Flink.

The core idea is to read both BulkLoadSource (full‑history) and RealtimeChangelogSource (incremental) in parallel, key‑by the primary key, and let a VersionBasedKeyedCoProcess operator merge them. Full‑history rows are cached until the watermark (derived from incremental timestamps) exceeds the row’s search timestamp; then they are either emitted or discarded if newer changelog data has arrived.

Algorithm Principle

The algorithm simultaneously consumes bulk and realtime sources, using three key fields:

SearchTs – the timestamp when the full‑history row was queried from the source.

Watermark – generated from the timestamp embedded in incremental records.

Version – a total‑order version number (full‑history rows have version 0).

The KeyedCoProcess buffers full‑history rows and releases them only after the watermark surpasses their SearchTs. If a matching changelog record arrives earlier, the buffered row is dropped, ensuring a pause‑free, lock‑free integration.
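The per-key rule above can be sketched as follows. This is a simplified illustration of the buffering logic, not the actual Estuary operator; the function and field names are invented for the example.

```python
# Per-key merge rule (simplified sketch). A bulk row carries version 0 and
# its SearchTs; it is emitted only once the watermark passes that SearchTs
# without a higher-version changelog record appearing for the same key.
class KeyState:
    def __init__(self):
        self.buffered = None    # (search_ts, row) awaiting release
        self.max_version = 0    # highest changelog version seen for this key

def on_bulk_row(state, search_ts, row):
    # Drop the snapshot row immediately if the changelog already covers it.
    if state.max_version == 0:
        state.buffered = (search_ts, row)

def on_changelog_row(state, version, row):
    # Incremental data supersedes any buffered snapshot row for the key.
    state.max_version = max(state.max_version, version)
    state.buffered = None
    return row                  # changelog records are forwarded unchanged

def on_watermark(state, watermark):
    # Release the snapshot row once no matching change can still arrive.
    if state.buffered and watermark > state.buffered[0]:
        _, row = state.buffered
        state.buffered = None
        return row
    return None
```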

Algorithm Implementation

The algorithm is packaged as a Flink connector named Estuary. It uses DataStreamScanProvider to chain internal operators:

BulkLoadSource / ChangelogSource – read and normalize raw data.

BulkNormalize / ChangelogNormalize – add runtime metadata and handle primary‑key semantics.

WatermarkGenerator – custom watermark logic for the algorithm.

VersionBasedKeyedCoProcess – core merging logic with full RowKind semantics.

During processing, the bulk stage checks whether a higher‑version record has already been handled; if not, the row is stored in state and an event‑time timer (SearchTs + fixed delay) is registered. If a higher‑version record arrives before the timer fires, the bulk row is discarded. After the bulk phase ends, state is cleared once its TTL expires, keeping overhead bounded.
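The timer-driven release can be simulated end to end with a small event loop. This is a sketch under assumed names (the real operator uses Flink's keyed state and event-time timer service, not an in-memory heap):

```python
import heapq

DELAY_MS = 5_000  # fixed delay after SearchTs; a tuning parameter in practice

def merge(events):
    """Sketch of the timer-based release. `events` is a time-ordered list of
    ("bulk", event_time, key, row), where event_time is the row's SearchTs,
    or ("chg", event_time, key, version, row)."""
    timers, buffered, covered, out = [], {}, set(), []
    for ev in events:
        now = ev[1]
        # Fire timers whose event time has passed: emit still-buffered rows.
        while timers and timers[0][0] <= now:
            _, k = heapq.heappop(timers)
            if k in buffered:
                out.append(buffered.pop(k))
        if ev[0] == "bulk":
            _, _, key, row = ev
            if key not in covered:      # no higher-version record seen yet
                buffered[key] = row
                heapq.heappush(timers, (now + DELAY_MS, key))
        else:
            _, _, key, version, row = ev
            covered.add(key)            # changelog now covers this key
            buffered.pop(key, None)     # discard the stale snapshot row
            out.append(row)
    while timers:                       # drain timers at end of input
        _, k = heapq.heappop(timers)
        if k in buffered:
            out.append(buffered.pop(k))
    return out
```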

For sinks that support upsert, an ultra‑lightweight mode further reduces overhead, enabling near‑zero‑cost full‑history + incremental synchronization.
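The intuition behind that mode can be sketched as a last-writer-wins check on the total-order version, letting the sink itself resolve rows by primary key. The helper below is illustrative, not the connector's actual sink path:

```python
# Sketch of the ultra-lightweight mode for upsert-capable sinks: bulk and
# changelog records are forwarded without buffering, and a last-writer-wins
# check on the total-order version keeps stale snapshot rows (version 0)
# from overwriting newer changes.
def upsert_apply(sink, versions, key, version, row):
    if version >= versions.get(key, -1):
        versions[key] = version
        if row is None:
            sink.pop(key, None)   # delete marker
        else:
            sink[key] = row

sink, versions = {}, {}
upsert_apply(sink, versions, "k", 0, "snapshot-row")   # bulk load, version 0
upsert_apply(sink, versions, "k", 5, "updated-row")    # incremental change
upsert_apply(sink, versions, "k", 0, "late-snapshot")  # stale: ignored
# sink["k"] is now "updated-row"
```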

Real‑time Materialized View Practice

After releasing the MVP, the team piloted the solution with a user who had three tables (TiDB, SQL Server, MySQL) totaling billions of rows and complex join‑deduplication logic. The original batch pipeline produced a T+1 result with day‑level latency; the Flink‑based pipeline reduced latency to about 10 seconds while maintaining accuracy.

Resource usage dropped dramatically: after the initial load, the job consumed only one CPU core on YARN, with typical CPU utilization below 20%.

Data‑Lake Scenario Optimization

By using a dedicated source table, the team enabled one‑click synchronization of historical and incremental data to an Iceberg data lake, greatly improving data freshness.

Limitations and Shortcomings

Implicit Server‑Clock Dependency

Both SearchTs and the watermark are derived from server system clocks rather than a dedicated time oracle. Severe clock skew can make watermarks unreliable and may lead to logical errors.

Consistency and Transactions

The current implementation only guarantees eventual consistency; there is no transactional guarantee across multiple sources. In cases where one source lags by hours while another is up‑to‑date, intermediate join results may become visible to downstream systems.

Achieving stronger consistency would require a cross‑source transaction mechanism, such as an epoch‑based commit coordinated by checkpoints, but this re‑introduces clock dependency.

Update Amplification

When a left‑side row joins with many right‑side rows, a single update can generate a large number of downstream updates, leading to amplification.
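The fan-out is easy to quantify: in a retraction-based streaming join, each previously joined result must be retracted and re-emitted, so one update produces twice the number of matched rows in downstream messages.

```python
# One update to a left-side row that joins n right-side rows retracts each
# old joined result and emits each recomputed one: 2 * n messages downstream.
def downstream_messages(matched_right_rows: int) -> int:
    retractions = matched_right_rows  # -U for every previously joined result
    additions = matched_right_rows    # +U for every recomputed result
    return retractions + additions

# A single update matching 10,000 right rows fans out to 20,000 messages.
```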

State Size

During bulk loading, state can reach 1.5 GB for a 100 million‑row table. Future work includes notifying KeyedCoProcess about completed keys so they can be evicted earlier, reducing state pressure.

Conclusion and Outlook

The article analyzed challenges of building a Flink‑based materialized view system, introduced a global‑version‑based CDC algorithm, described its implementation, and demonstrated substantial latency and resource improvements in production. Although early‑stage, the approach shows great promise, and ongoing iterations will address current limitations, making the solution more robust for broader scenarios.

Tags: Algorithm, Big Data, Flink, CDC, real-time data processing, materialized view, Streaming SQL
Written by HomeTech