Tagged articles
5 articles
Page 1 of 1
Big Data Technology & Architecture
Big Data Technology & Architecture
Dec 8, 2023 · Big Data

Comprehensive Guide to Apache Paimon and Advanced Flink Integration

This article provides an in‑depth overview of Apache Paimon as a streaming lakehouse, explains its core features, file layout, consistency guarantees, and offers detailed guidance on integrating and tuning Paimon with Apache Flink for both write and read performance, multi‑writer concurrency, table management, and bucket rescaling.

Apache PaimonBig DataData Lake
0 likes · 23 min read
Comprehensive Guide to Apache Paimon and Advanced Flink Integration
Alibaba Cloud Big Data AI Platform
Alibaba Cloud Big Data AI Platform
Nov 10, 2023 · Big Data

How Open‑Source Big Data 3.0 Is Redefining Real‑Time, Serverless, and AI‑Driven Analytics

The talk outlines Alibaba Cloud's open‑source big data platform evolution to version 3.0, highlighting the streaming lakehouse architecture, full serverless transformation, and AI‑enhanced operations that together enable real‑time analytics, higher performance, and smarter data management.

Apache FlinkPaimonstreaming lakehouse
0 likes · 15 min read
How Open‑Source Big Data 3.0 Is Redefining Real‑Time, Serverless, and AI‑Driven Analytics
DataFunTalk
DataFunTalk
Apr 7, 2023 · Big Data

Introducing Apache Paimon: An Open‑Source Streaming Lakehouse Storage Engine

Apache Paimon is an open‑source streaming data lake storage system that combines LSM‑based real‑time updates, open file formats, and deep integration with Flink, Spark, and Trino to deliver high‑throughput ingestion, low‑latency queries, and unified batch‑stream processing for modern big‑data workloads.

Apache PaimonBig DataFlink
0 likes · 7 min read
Introducing Apache Paimon: An Open‑Source Streaming Lakehouse Storage Engine
Big Data Technology & Architecture
Big Data Technology & Architecture
Mar 30, 2023 · Big Data

Apache Paimon (Incubating): A Streaming Lakehouse Storage Project Overview

Apache Paimon, newly incubated by the Apache Software Foundation, combines Flink's real‑time streaming capabilities with open lakehouse storage formats, offering high‑throughput, low‑latency data ingestion, partial‑update merges, and seamless integration with engines like Flink, Spark, and Trino for unified batch and streaming analytics.

Apache PaimonBig DataData Lake
0 likes · 7 min read
Apache Paimon (Incubating): A Streaming Lakehouse Storage Project Overview
NetEase Cloud Music Tech Team
NetEase Cloud Music Tech Team
Oct 26, 2022 · Big Data

Arctic: NetEase's Streaming Lakehouse Service and Hive-Based Stream-Batch Integration Practice

Arctic, NetEase’s streaming lakehouse built on Apache Iceberg, unifies streaming and batch workloads with millisecond‑level latency, Hive compatibility, and built‑in message‑queue support, delivering CDC, upserts and OLAP without a Lambda architecture, as demonstrated by real‑time processing of 2 PB of Hive data for Cloud Music.

Apache IcebergArcticBig Data Architecture
0 likes · 15 min read
Arctic: NetEase's Streaming Lakehouse Service and Hive-Based Stream-Batch Integration Practice