State as Database in Apache Flink: QueryableState and Savepoint Processor API
The article examines how Apache Flink's state management features, including QueryableState and the upcoming Savepoint Processor API, can serve as a lightweight database for real‑time applications, discussing their advantages, limitations, and practical usage scenarios.
Stateful computation is essential for fault tolerance and data consistency in real‑time processing. Popular engines such as Google Dataflow, Flink, Spark Structured Streaming and Kafka Streams provide built‑in state support, prompting the question whether state can replace a traditional database.
In the Flink community two lines of work address this: QueryableState, which enables runtime queries of job state, and the upcoming Savepoint Processor API, which allows offline inspection and modification of state dump files (savepoints).
QueryableState was introduced in Flink 1.2 (2017) and lets users query state without relying on external storage, but it remains in beta, has limited functionality and is not production‑ready.
The article lists advantages of using state as a database: lower data latency, stronger exactly‑once consistency guarantees, and resource savings by avoiding external serialization and network transfer.
It also enumerates disadvantages: weaker availability guarantees (SLA) than mature databases, potential job instability under heavy ad‑hoc queries, limited storage capacity that can lead to OOM errors or checkpoint timeouts, only basic query capabilities, and read‑only access at runtime (modifying state requires the Savepoint Processor API).
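The read‑only restriction can be pictured with a small model. The sketch below is plain Python, not the Flink QueryableState client; `KeyedStateStore`, `update` and `queryable_view` are hypothetical names chosen for illustration. The running job writes to its state, while external callers only receive a live, read‑only view.

```python
from types import MappingProxyType


class KeyedStateStore:
    """Toy stand-in for one operator's keyed state (hypothetical, not a Flink class)."""

    def __init__(self):
        self._state = {}

    def update(self, key, value):
        # Only the running job itself is allowed to write.
        self._state[key] = value

    def queryable_view(self):
        # External clients get a live, read-only view of the same data.
        return MappingProxyType(self._state)


store = KeyedStateStore()
store.update("A", 8600)
view = store.queryable_view()

print(view["A"])
try:
    view["A"] = 0  # an external write attempt
except TypeError as err:
    print("rejected:", err)
```

The view reflects later updates made by the job, which loosely mirrors querying state while the job keeps running.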
The Savepoint Processor API, described in FLIP‑43, treats savepoints as offline databases, allowing analysis, debugging, auditing, and creation of initial state for new applications. It supports operations such as changing parallelism, large schema migrations, and fixing corrupt state.
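A schema migration of this kind can be sketched offline as "read a savepoint, transform one state, write a new savepoint". The code below is a conceptual model in plain Python and does not use the actual Flink API; the nested‑dict layout, the uid `score-aggregator`, the state name `group_score` and the default value for the new field are all assumptions made for illustration.

```python
import copy


def transform_state(savepoint, uid, state_name, fn):
    """Return a new savepoint with fn applied to every value of one state.

    The input savepoint is left untouched, mirroring the idea that an
    offline tool writes a new savepoint instead of editing the old one.
    """
    new_savepoint = copy.deepcopy(savepoint)
    table = new_savepoint[uid][state_name]
    for key, value in table.items():
        table[key] = fn(value)
    return new_savepoint


# Hypothetical old savepoint: operator uid -> state name -> key -> value.
old = {"score-aggregator": {"group_score": {"A": 8600, "B": 3900, "C": 2000}}}

# Schema migration: a bare int becomes a record with an extra field.
new = transform_state(
    old, "score-aggregator", "group_score",
    lambda score: {"score": score, "times": 0},
)

print(new["score-aggregator"]["group_score"]["A"])
```

Because the old savepoint is never modified, a failed migration can simply be discarded and retried.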
A savepoint contains the state of multiple operators. It can be viewed as a database in which each operator, identified by its UID, acts as a namespace and each state registered under that operator acts as a table. Different state backends correspond to different storage engines.
Database    Savepoint
Namespace   Uid
Table       State
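The mapping above can be made concrete with a toy model. This is a sketch in plain Python, not the Savepoint Processor API: the savepoint is represented as a nested dictionary, and the uids, state names and values are invented for illustration.

```python
# Hypothetical savepoint contents: uid (namespace) -> state (table) -> rows.
savepoint = {
    "source-uid": {"offsets": {0: 120, 1: 98}},
    "agg-uid": {"group_score": {"A": 8600}, "group_times": {"A": 2}},
}


def namespaces(sp):
    """List operator UIDs, the analogue of database namespaces."""
    return sorted(sp)


def tables(sp, uid):
    """List state names under one UID, the analogue of tables."""
    return sorted(sp[uid])


def select(sp, uid, state):
    """Return (key, value) rows of one state, like SELECT * FROM uid.state."""
    return sorted(sp[uid][state].items())


print(namespaces(savepoint))
print(tables(savepoint, "agg-uid"))
print(select(savepoint, "source-uid", "offsets"))
```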
Example tables illustrate how keyed state can aggregate scores and times per user group, while operator state can hold total scores and times.
user_id  user_name  user_group  score
1001     Paul       A           5,000
1002     Charlotte  A           3,600
1003     Kate       C           2,000
1004     Robert     B           3,900
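The aggregation over these rows can be worked through by hand or with a few lines of code. The sketch below is plain Python rather than a Flink job, and it assumes that "times" means the number of records seen, which the text does not state explicitly. The per‑group dictionary plays the role of keyed state and the single running total plays the role of operator state.

```python
from collections import defaultdict

# Rows from the example table: (user_id, user_name, user_group, score).
rows = [
    (1001, "Paul", "A", 5000),
    (1002, "Charlotte", "A", 3600),
    (1003, "Kate", "C", 2000),
    (1004, "Robert", "B", 3900),
]

# Keyed state: one entry per user_group, updated as each record arrives.
keyed_state = defaultdict(lambda: {"score": 0, "times": 0})
# Operator state: a single running total, not partitioned by key.
operator_state = {"score": 0, "times": 0}

for _user_id, _name, group, score in rows:
    keyed_state[group]["score"] += score
    keyed_state[group]["times"] += 1
    operator_state["score"] += score
    operator_state["times"] += 1

for group in sorted(keyed_state):
    print(group, keyed_state[group])
print("total", operator_state)
```

With the four rows above, group A accumulates a score of 8,600 over 2 records, and the overall total is 14,500 over 4 records.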
In conclusion, treating Flink state as a database is a trend that complements rather than replaces traditional databases, with online access provided by QueryableState and offline access/modification provided by the Savepoint Processor API.
Big Data Technology Architecture
Exploring Open Source Big Data and AI Technologies