Tag

Streaming Lakehouse


DataFunTalk
Apr 7, 2023 · Big Data

Introducing Apache Paimon: An Open‑Source Streaming Lakehouse Storage Engine

Apache Paimon is an open‑source streaming data lake storage system that combines LSM‑based real‑time updates, open file formats, and deep integration with Flink, Spark, and Trino to deliver high‑throughput ingestion, low‑latency queries, and unified batch‑stream processing for modern big‑data workloads.

Apache Paimon · Flink · LSM
7 min read
NetEase Cloud Music Tech Team
Oct 26, 2022 · Big Data

Arctic: NetEase's Streaming Lakehouse Service and Hive-Based Stream-Batch Integration Practice

Arctic, NetEase’s streaming lakehouse built on Apache Iceberg, unifies streaming and batch workloads with millisecond‑level latency, Hive compatibility, and built‑in message‑queue support, delivering CDC, upserts and OLAP without a Lambda architecture, as demonstrated by real‑time processing of 2 PB of Hive data for Cloud Music.

Apache Iceberg · Arctic · Hive Compatibility
15 min read