From Real-Time Data Analytics to Real-Time AI: Flink Forward Asia 2025 Highlights
The Flink Forward Asia 2025 conference in Singapore showcased Apache Flink's evolution with new AI‑driven projects such as Flink Agents, the integration of AI Functions in Flink 2.1, the disaggregated state management architecture of Flink 2.0, and complementary lakehouse technologies like Paimon and Fluss, underscoring the platform's role as the real‑time backbone for modern AI applications.
From Real‑Time Data Analytics to Real‑Time AI, Flink Ecosystem Embraces AI
At Flink Forward Asia 2025, organized by Alibaba Cloud in Singapore, Wang Feng (Apache Flink community founder, Apache Paimon PMC member) presented the talk “From Real‑Time Data Analytics to Real‑Time AI”. He highlighted the rise of Agentic AI and the need for system‑triggered AI agents in scenarios such as online transactions, website clicks, vehicle status changes, and IoT events, all of which demand large‑scale, stable real‑time processing.
Flink Agents is a new sub‑project designed for system‑triggered AI agents. Built on Flink’s streaming engine, it inherits Flink’s distributed, real‑time processing capabilities and adds abstractions for LLMs, memory, tools, prompts, dynamic execution plans, looping, shared state, and observability. The project is a joint contribution from Alibaba Cloud, Confluent, Ververica, and LinkedIn, and it targets an MVP release around September.
Apache Flink 2.1 officially integrates AI Functions: models can be registered as metadata objects and invoked through the built‑in ML_PREDICT function in Flink SQL, so a single pipeline can cover real‑time data cleaning, analysis, and AI inference end to end.
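The idea of registering a model by name and then applying it to a stream of rows can be illustrated with a small in‑memory analogy. The sketch below is plain Python, not Flink's API: MODEL_CATALOG, register_model, ml_predict, and toy_sentiment are hypothetical names invented for this illustration, and the "model" is a keyword rule standing in for a real inference call.

```python
from typing import Callable, Dict, Iterable, Iterator

# Hypothetical in-memory "catalog": model name -> callable. Not a Flink API.
MODEL_CATALOG: Dict[str, Callable[[str], str]] = {}


def register_model(name: str, fn: Callable[[str], str]) -> None:
    """Register a model under a name, loosely analogous to a metadata object."""
    MODEL_CATALOG[name] = fn


def ml_predict(
    rows: Iterable[dict], model: str, feature: str, output: str = "prediction"
) -> Iterator[dict]:
    """Lazily apply the named model to one column of each row and append the result."""
    fn = MODEL_CATALOG[model]
    for row in rows:
        yield {**row, output: fn(row[feature])}


def toy_sentiment(text: str) -> str:
    """Stand-in for a real model call (for example, a remote LLM endpoint)."""
    return "negative" if "refund" in text.lower() else "positive"


register_model("toy_sentiment", toy_sentiment)

events = [
    {"order_id": 1, "comment": "  Great service "},
    {"order_id": 2, "comment": "I want a REFUND"},
]

# Cleaning (trim whitespace) and inference happen in one pass over the rows.
cleaned = ({**e, "comment": e["comment"].strip()} for e in events)
results = list(ml_predict(cleaned, model="toy_sentiment", feature="comment"))

for row in results:
    print(row)
```

Because the lookup is by name, the pipeline code stays the same when the registered model changes, which is roughly the benefit of treating models as catalog metadata rather than hard‑coding them into each job.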
Flink 2.0: Disaggregated State Management and Cloud‑Native Architecture
Flink 2.0 introduces a disaggregated state management architecture that separates state storage from compute and keeps shared state in inexpensive object storage. This design improves resource scheduling, scalability, and fault tolerance, addressing two long‑standing problems: snapshot overhead and the high cost of coupling state to compute. The research paper “Disaggregated State Management in Apache Flink® 2.0” has been accepted by VLDB 2025.
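A toy model can make the separation of state and compute more concrete. The following Python sketch is an illustration under simplifying assumptions, not Flink's implementation: RemoteStore stands in for shared object storage, DisaggregatedState is a hypothetical keyed‑state wrapper, and every write goes synchronously to the remote store (a real engine would batch and make such I/O asynchronous).

```python
from collections import OrderedDict
from typing import Dict, Optional


class RemoteStore:
    """Stand-in for shared object storage; here it is just a dict (hypothetical)."""

    def __init__(self) -> None:
        self.objects: Dict[str, int] = {}
        self.reads = 0

    def get(self, key: str) -> Optional[int]:
        self.reads += 1
        return self.objects.get(key)

    def put(self, key: str, value: int) -> None:
        self.objects[key] = value


class DisaggregatedState:
    """Keyed state whose source of truth is remote; local memory is only an LRU cache."""

    def __init__(self, store: RemoteStore, cache_size: int = 2) -> None:
        self.store = store
        self.cache_size = cache_size
        self.cache: "OrderedDict[str, int]" = OrderedDict()

    def get(self, key: str) -> Optional[int]:
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key]
        value = self.store.get(key)  # cache miss: fetch from remote storage
        if value is not None:
            self._remember(key, value)
        return value

    def put(self, key: str, value: int) -> None:
        self.store.put(key, value)  # write-through (a simplification)
        self._remember(key, value)

    def _remember(self, key: str, value: int) -> None:
        self.cache[key] = value
        self.cache.move_to_end(key)
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used key


store = RemoteStore()
worker_a = DisaggregatedState(store)
for user in ["u1", "u2", "u1", "u3"]:
    worker_a.put(user, (worker_a.get(user) or 0) + 1)

# A replacement worker starts with an empty cache and reads state on demand,
# without first copying a full local snapshot.
worker_b = DisaggregatedState(store)
recovered = worker_b.get("u1")
print(recovered, store.objects)
```

In this sketch the local cache can be smaller than the total state and can be discarded at any time, which is the property that lets compute be rescheduled or rescaled independently of where the state lives.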
Paimon: Multimodal Unified Lake Storage for the AI Era
Apache Paimon, a streaming‑batch unified storage system, integrates with Flink to form a streaming lakehouse. With Iceberg V3’s Deletion Vectors, Paimon can sync Iceberg data in real time while maintaining minute‑level query latency. It also supports the Lance file format for efficient large‑blob storage, catering to audio‑video and other unstructured data. Paimon processes hundreds of petabytes internally at Alibaba and is used by companies such as Vivo, Xiaomi, ByteDance, and Shopee.
Alibaba Cloud offers Paimon as a fully managed service in its DLF product, reporting a storage cost reduction of over 30% and a query performance improvement of more than 2×. The Paimon Catalog has entered public beta in Singapore and Jakarta.
Fluss: Streaming Table Storage System for Real‑Time Analytics and AI
Fluss, an open‑source streaming table storage system from Alibaba, combines columnar storage with streaming updates and integrates deeply with Flink and lakehouse formats like Paimon and Iceberg. It reduces real‑time data warehouse construction costs and boosts development efficiency through unified batch and streaming capabilities, columnar storage, and partition pruning. Since its open‑source debut in December 2024, Fluss has attracted contributors from ByteDance, Ant Financial, Xiaomi, eBay, Tencent, and others, and was donated to the Apache Software Foundation in June 2025.
Industry Perspective
Forrester Vice President Mike Gualtieri noted that Apache Flink serves as the real‑time central nervous system for enterprises building AI‑enabled applications, enabling event‑driven architectures and real‑time AI agents.
Alibaba Cloud Big Data AI Platform
The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.
