Tagged articles

table format

16 articles

Jun 16, 2026 · Big Data

Deep Dive: Multimodal Data Lake Formats – Paimon vs. Hudi vs. Iceberg

This article compares three open table‑format projects (Paimon, Hudi, and Iceberg), examines how each addresses multimodal data lake challenges such as massive volume, sparse access patterns, and combined scalar‑vector retrieval, and provides feature breakdowns and selection guidance.

BLOB · Hudi · Iceberg

0 likes · 11 min read

Baidu Geek Talk

Jun 30, 2025 · Big Data

How Baidu’s Turing 3.0 Leverages Apache Iceberg to Boost Data Lake Performance

This article explains how Baidu’s next‑generation data platform Turing 3.0 integrates Apache Iceberg to solve the inefficiencies of the legacy MEG stack, detailing ecosystem components, migration strategies from Hive, table‑level optimizations, and future roadmap for high‑frequency, low‑latency analytics.

Apache Iceberg · Data Lake · Hive Migration

0 likes · 17 min read

Big Data Technology & Architecture

Sep 26, 2024 · Big Data

Key Features of Apache Paimon 0.9.0 Release

The Apache Paimon 0.9.0 release introduces production‑ready Branch support, native Iceberg compatibility, a caching catalog for faster OLAP queries, improved Bucketed Append tables with reduced small‑file issues, and full DELETE/UPDATE/MERGE‑INTO capabilities for Append tables, making the system more usable and efficient.

Apache Paimon · Big Data · Bucketed Append

0 likes · 5 min read

DataFunSummit

Mar 10, 2023 · Big Data

Interview on Data Lake and Lakehouse: Current Applications, Challenges, and Evolution

This interview with NetEase’s data‑lake technology manager explores the distinction between data lakes and lakehouses, the evolution of table‑format technologies such as Iceberg, Hudi and Delta Lake, their maturity across key capabilities, and the practical adoption challenges faced by enterprises.

Data Lake · Delta Lake · Hudi

0 likes · 14 min read

DataFunTalk

Feb 20, 2023 · Big Data

Understanding Data Lakes and Their Application at iQIYI: Concepts, Scenarios, and Iceberg Implementation

This article explains the definition of data lakes (public‑cloud and non‑public‑cloud), outlines their key characteristics, presents three typical business scenarios—real‑time event analysis, change‑data analysis, and stream‑batch integration—summarizes required product features, evaluates open‑source lake formats, and details iQIYI's adoption of Apache Iceberg across multiple services to achieve low‑latency, large‑scale, cost‑effective analytics.

Big Data · Data Lake · Iceberg

0 likes · 23 min read

DataFunSummit

Jan 10, 2023 · Big Data

Exploring Iceberg in Huawei Terminal Cloud: Architecture, Features, and Future Plans

This article reviews Iceberg's adoption in Huawei Terminal Cloud, covering its architecture, key features such as Git‑style data management, real‑time processing, and acceleration layers, and future development directions, along with a Q&A session addressing performance and implementation details.

Big Data · Data Lake · Flink

0 likes · 15 min read

Big Data Technology & Architecture

Jul 7, 2022 · Big Data

Deep Dive into Apache Iceberg Core Features and Flink Integration

This article explains Apache Iceberg’s architecture, core capabilities such as time‑travel, fast scans, delete handling, and schema evolution, and provides a step‑by‑step guide for configuring Flink to use Iceberg with Hive and Hadoop catalogs, including DDL commands and streaming queries.

Apache Iceberg · Big Data · Data Lake

0 likes · 22 min read

Big Data Technology & Architecture

Jun 16, 2021 · Big Data

Practical Experience and Optimizations of Apache Iceberg in Tencent’s Big Data Ecosystem

This article reviews the advantages of Apache Iceberg for data lake storage, details Tencent’s custom optimizations and integration with Flink and Spark, and shares multiple real‑world implementations that demonstrate how Iceberg improves data consistency, reduces small‑file overhead, and enables near‑real‑time analytics in large‑scale big‑data environments.

Apache Iceberg · Data Lake · Flink

0 likes · 18 min read

Big Data Technology Architecture

Jun 10, 2021 · Big Data

Understanding Apache Iceberg: Design, Architecture, and Its Application at NetEase Cloud Music

This article explains Apache Iceberg’s table‑format design, compares it with Hive’s limitations, details its snapshot‑based architecture and metadata handling, and describes how NetEase Cloud Music leveraged Iceberg to dramatically improve large‑scale log processing performance and stability.

Apache Iceberg · Spark · metadata management

0 likes · 12 min read

dbaplus Community

Jun 5, 2021 · Big Data

How Flink + Iceberg Transform Data Lakes for Real‑Time Streaming

This article explains the concept of data lakes, outlines a four‑layer open‑source architecture, presents several classic Flink‑Iceberg use cases, details why Iceberg was chosen, and describes the design of Flink’s streaming sink and upcoming community roadmap.

Apache Flink · Apache Iceberg · Big Data

0 likes · 14 min read

DataFunTalk

Apr 26, 2021 · Big Data

Detailed Design and Practical Application of Apache Iceberg at NetEase Cloud Music

This article explains the motivations behind Apache Iceberg, its design principles such as snapshot and MVCC, compares it with Hive, and describes how NetEase Cloud Music adopted Iceberg to improve metadata handling, query performance, and operational stability for massive daily log data.

Apache Iceberg · Big Data · Data Lake

0 likes · 13 min read

Big Data Technology Architecture

Apr 5, 2021 · Big Data

Understanding Apache Iceberg: Table Format Architecture, Comparison with Hive Metastore, and Business Benefits

This article introduces Apache Iceberg as an open table format for massive analytic datasets, explains its underlying concepts such as schema, partitioning, statistics, and read/write APIs, compares it with Hive Metastore, outlines its ACID commit process, highlights the performance and operational advantages for big‑data workloads, and previews upcoming community features.

ACID · Apache Iceberg · Metadata

0 likes · 19 min read

Big Data Technology & Architecture

Feb 2, 2021 · Big Data

An Introduction to Apache Iceberg: Features, Spark & Flink Integration, and Real‑World Use Cases

This article provides a comprehensive overview of Apache Iceberg, covering its origins, key features, practical Spark and Flink code examples, notable deployments at Alibaba and Tencent, and its future role as a universal table format for big‑data analytics.

Apache Iceberg · Data Lake · Flink

0 likes · 9 min read

DataFunTalk

Dec 3, 2020 · Big Data

Streaming Data Lake Ingestion with Apache Flink and Apache Iceberg

This article explains how Apache Flink integrates with data lake architectures, especially using Apache Iceberg as a table format, to enable real‑time streaming ingestion, CDC processing, near‑real‑time lambda architectures, and future enhancements like automatic file merging and row‑level deletes.

Apache Iceberg · Data Lake · Flink

0 likes · 13 min read

Big Data Technology Architecture

Nov 27, 2020 · Big Data

Integrating Apache Flink with Data Lakes Using Apache Iceberg: Architecture, Use Cases, and Future Roadmap

This article explains how Apache Flink combines with Apache Iceberg to build unified stream‑batch data lake solutions, covering data lake fundamentals, architectural layers, classic business scenarios, reasons for choosing Iceberg, streaming ingestion design, and upcoming community enhancements.

Apache Flink · Apache Iceberg · table format

0 likes · 13 min read

DataFunTalk

Oct 9, 2020 · Big Data

NetEase’s Data Lake Iceberg: Challenges, Core Principles, and Practical Implementation

This article examines the pain points of traditional data warehouse platforms, explains the core concepts and advantages of the Iceberg data lake table format, compares it with Metastore, reviews the current Iceberg community ecosystem, and details NetEase’s practical integration with Hive, Impala, and Flink to improve ETL efficiency and support unified batch‑stream processing.

Data Lake · ETL · Flink

0 likes · 13 min read
