Why Incremental Computing Is Replacing Lambda Architecture in Modern Big Data Platforms

This interview with Yunqi Technology CTO Guan Tao explains how the traditional Lambda architecture’s triple‑system complexity drives costs and operational pain, and why the company’s General Incremental Computing (GIC) approach offers a unified, cost‑effective Kappa‑style solution for real‑time, batch, and interactive analytics.

DataFunSummit

Jul 20, 2025

Many enterprises build costly "standard" big‑data platforms using Spark, Flink, and various query components, only to end up duplicating development across three systems and spending most of their time on operations. The traditional Lambda architecture—separate batch, stream, and query layers—leads to soaring costs, uncontrolled complexity, and data‑consistency challenges.

Yunqi Technology CTO Guan Tao argues that the root problem is a "patchwork" mindset: each new requirement is met by bolting on another system. He introduces the company’s General Incremental Computing (GIC) concept, which replaces multiple systems with a single unified engine that handles all scenarios.

Q: What motivated Yunqi to challenge the Lambda architecture and adopt incremental computing?

Guan explains that technology evolution follows three stages: single‑point breakthroughs, combination phases (e.g., Lambda), and finally integration. When core technologies mature, systems naturally evolve toward integration.

Q: How does incremental computing differ from traditional batch and stream processing?

Incremental computing continuously processes data changes instead of periodically recomputing full datasets, blurring the line between batch and stream, simplifying the tech stack, and offering elastic resource tuning for optimal cost‑performance.
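
To make the contrast concrete, here is a minimal Python sketch of the general idea, not Yunqi's engine. It maintains per-key totals two ways: by rescanning the full dataset, and by folding only the latest changes into the previous result. The function names and the `(op, key, amount)` change format are assumptions made for illustration.

```python
from collections import defaultdict


def full_recompute(rows):
    """Batch style: scan every row and rebuild the per-key totals."""
    totals = defaultdict(int)
    for key, amount in rows:
        totals[key] += amount
    return dict(totals)


def apply_changes(totals, changes):
    """Incremental style: fold only the new changes into existing totals.

    Each change is (op, key, amount); op is "+" for an inserted row and
    "-" for a deleted row. A key whose rows are all deleted stays in the
    result with a total of 0 in this simplified sketch.
    """
    updated = dict(totals)
    for op, key, amount in changes:
        sign = 1 if op == "+" else -1
        updated[key] = updated.get(key, 0) + sign * amount
    return updated


# Initial load, computed once.
base_rows = [("a", 10), ("b", 5), ("a", 3)]
totals = full_recompute(base_rows)

# Later, three changes arrive: two inserts and one delete.
changes = [("+", "b", 7), ("-", "a", 3), ("+", "c", 1)]
incremental = apply_changes(totals, changes)

# The same answer is reachable by rescanning the dataset as it now stands.
current_rows = [("a", 10), ("b", 5), ("b", 7), ("c", 1)]
assert incremental == full_recompute(current_rows)
```

The incremental path touches three change records, while the recompute path touches every row again. That gap widens as the dataset grows and the change set stays small.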

Q: What concrete business value does incremental computing deliver?

Three key benefits are highlighted:

Real‑time processing with dynamic resource adjustment (e.g., 30‑minute delivery in retail).

Architecture simplification and cost reduction—single‑engine Kappa replaces three systems, cutting architecture, resource, and storage costs each to roughly one‑third.

Unified development—one codebase and modeling approach serve both real‑time and offline needs, reducing development effort to about one‑third.

Q: Can you give a vivid case study?

In an IoT scenario for Chang’an Auto, moving from Spark+Flink+Doris to incremental computing reduced overall platform cost by 75‑80%, cut latency to minutes, and lowered operational complexity by 60%.

Q: Why have mature products not emerged earlier?

Two fundamental conflicts hindered earlier attempts: the clash between proactive (batch) and reactive (stream) computation paradigms, and the trade‑off between throughput and latency, creating a “data triangle” that Lambda architecture cannot resolve.
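
The throughput-versus-latency trade-off can be illustrated with a toy cost model. This is a sketch under stated assumptions rather than anything from the interview: each batch pays a fixed overhead plus a per-record cost, and the numbers `0.5` and `0.001` are invented for illustration.

```python
def batch_metrics(batch_size, fixed_overhead_s=0.5, per_record_s=0.001):
    """Toy model: return (batch processing time in seconds, records per second).

    Assumes every batch pays a fixed overhead (scheduling, startup) plus a
    constant cost per record. Both constants are hypothetical.
    """
    batch_time = fixed_overhead_s + batch_size * per_record_s
    throughput = batch_size / batch_time
    return batch_time, throughput


for n in (1, 100, 10_000):
    latency, throughput = batch_metrics(n)
    print(f"batch={n:>6}  time={latency:.3f}s  throughput={throughput:,.0f} rec/s")
```

Under this model, larger batches amortize the fixed overhead and raise throughput, but every record waits longer for its batch to finish. Small batches do the opposite, which is why systems tuned for one end tend to perform poorly at the other.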

Q: How does Yunqi Lakehouse achieve high performance with a single SQL for both stream and batch?

Three design choices are cited:

An incremental computing framework that unifies semantics across scenarios.

A proprietary C++ native engine optimized with hardware‑level vectorization, which the company says surpasses industry benchmarks.

A Shared‑Everything architecture that decouples storage, scheduling, compute, write, and compression, allowing orthogonal scaling for latency‑critical streaming and high‑throughput batch workloads.

Q: What is the migration path from Spark/Flink to incremental computing?

Yunqi’s Lakehouse supports Spark‑compatible syntax and the MySQL protocol, allowing a “smooth migration” in which only CREATE TABLE statements need to be changed to CREATE DYNAMIC TABLE, preserving existing logic and data models.
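
As a rough illustration of how small that textual change is, the sketch below rewrites the leading keyword of a DDL statement in Python. The helper `to_dynamic_table` and the sample table and column names are hypothetical. Whether the real product requires additional clauses (for example, refresh settings) is not covered by the interview, so treat this as a string-level sketch only.

```python
import re

# Matches a leading "CREATE TABLE" (any case, flexible whitespace).
# The \b keeps it from matching keywords such as CREATE TABLESPACE.
_CREATE_TABLE = re.compile(r"^(\s*)CREATE\s+TABLE\b", re.IGNORECASE)


def to_dynamic_table(ddl: str) -> str:
    """Rewrite a leading CREATE TABLE into CREATE DYNAMIC TABLE.

    Statements that do not start with CREATE TABLE are returned unchanged.
    Hypothetical helper for illustration; it does not validate the SQL.
    """
    return _CREATE_TABLE.sub(r"\1CREATE DYNAMIC TABLE", ddl, count=1)


# Hypothetical existing statement from a batch pipeline.
batch_ddl = (
    "CREATE TABLE daily_sales AS "
    "SELECT store_id, SUM(amount) AS total FROM orders GROUP BY store_id"
)
print(to_dynamic_table(batch_ddl))
```

The SELECT body, which carries the business logic and data model, is left untouched. That is the property the migration path relies on.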

Q: How does incremental computing relate to AI workloads?

It provides a unified engine for AI data preprocessing, avoids architecture fragmentation, and meets real‑time AI requirements such as feature updates and dynamic model tuning.

Q: What is the future outlook for compute architectures?

Lambda will rapidly evolve into Kappa, with incremental computing becoming mainstream.

Data infrastructure will focus on AI‑ready pipelines, handling massive heterogeneous data for model training and inference.

Multiple divergent paths will persist, requiring continuous monitoring of emerging trends.

Overall, as real‑time processing becomes a necessity and AI moves to production, incremental computing serves as the bridge, delivering lower cost, higher timeliness, and simpler architectures.
