Tagged articles
15 articles
Baidu Geek Talk
Jun 30, 2025 · Big Data

How Baidu’s Turing 3.0 Leverages Apache Iceberg to Boost Data Lake Performance

This article explains how Baidu’s next‑generation data platform Turing 3.0 integrates Apache Iceberg to solve the inefficiencies of the legacy MEG stack, detailing ecosystem components, migration strategies from Hive, table‑level optimizations, and the future roadmap for high‑frequency, low‑latency analytics.

Apache Iceberg · Data Lake · Hive Migration
0 likes · 17 min read
Big Data Technology & Architecture
Sep 26, 2024 · Big Data

Key Features of Apache Paimon 0.9.0 Release

The Apache Paimon 0.9.0 release introduces production‑ready Branch support, native Iceberg compatibility, a caching catalog for faster OLAP queries, improved Bucketed Append tables with reduced small‑file issues, and full DELETE/UPDATE/MERGE‑INTO capabilities for Append tables, making the system more usable and efficient.

Apache Paimon · Big Data · Branch
0 likes · 5 min read
DataFunTalk
Feb 20, 2023 · Big Data

Understanding Data Lakes and Their Application at iQIYI: Concepts, Scenarios, and Iceberg Implementation

This article explains the definition of data lakes (public‑cloud and non‑public‑cloud), outlines their key characteristics, presents three typical business scenarios—real‑time event analysis, change‑data analysis, and stream‑batch integration—summarizes required product features, evaluates open‑source lake formats, and details iQIYI's adoption of Apache Iceberg across multiple services to achieve low‑latency, large‑scale, cost‑effective analytics.

Big Data · Data Lake · Iceberg
0 likes · 23 min read
DataFunSummit
Jan 10, 2023 · Big Data

Exploring Iceberg in Huawei Terminal Cloud: Architecture, Features, and Future Plans

This article gives an overview of Iceberg's adoption in Huawei Terminal Cloud, covering its architecture, key features such as Git‑style data management, real‑time processing, and acceleration layers, and future development directions, along with a Q&A session addressing performance and implementation details.

Big Data · Data Lake · Flink
0 likes · 15 min read
Big Data Technology & Architecture
Jun 16, 2021 · Big Data

Practical Experience and Optimizations of Apache Iceberg in Tencent’s Big Data Ecosystem

This article reviews the advantages of Apache Iceberg for data lake storage, details Tencent’s custom optimizations and integration with Flink and Spark, and shares multiple real‑world implementations that demonstrate how Iceberg improves data consistency, reduces small‑file overhead, and enables near‑real‑time analytics in large‑scale big‑data environments.

Apache Iceberg · Data Lake · Flink
0 likes · 18 min read
Big Data Technology Architecture
Jun 10, 2021 · Big Data

Understanding Apache Iceberg: Design, Architecture, and Its Application at NetEase Cloud Music

This article explains Apache Iceberg’s table‑format design, compares it with Hive’s limitations, details its snapshot‑based architecture and metadata handling, and describes how NetEase Cloud Music leveraged Iceberg to dramatically improve large‑scale log processing performance and stability.

Apache Iceberg · Spark · Table Format
0 likes · 12 min read
dbaplus Community
Jun 5, 2021 · Big Data

How Flink + Iceberg Transform Data Lakes for Real‑Time Streaming

This article explains the concept of data lakes, outlines a four‑layer open‑source architecture, presents several classic Flink‑Iceberg use cases, details why Iceberg was chosen, and describes the design of Flink’s streaming sink and upcoming community roadmap.

Apache Flink · Apache Iceberg · Big Data
0 likes · 14 min read
Big Data Technology Architecture
Apr 5, 2021 · Big Data

Understanding Apache Iceberg: Table Format Architecture, Comparison with Hive Metastore, and Business Benefits

This article introduces Apache Iceberg as an open table format for massive analytic datasets, explains its underlying concepts such as schema, partitioning, statistics, and read/write APIs, compares it with Hive Metastore, outlines its ACID commit process, highlights the performance and operational advantages for big‑data workloads, and previews upcoming community features.

ACID · Apache Iceberg · Parquet
0 likes · 19 min read
DataFunTalk
Dec 3, 2020 · Big Data

Streaming Data Lake Ingestion with Apache Flink and Apache Iceberg

This article explains how Apache Flink integrates with data lake architectures, especially using Apache Iceberg as a table format, to enable real‑time streaming ingestion, CDC processing, near‑real‑time lambda architectures, and future enhancements like automatic file merging and row‑level deletes.

Apache Iceberg · Data Lake · Flink
0 likes · 13 min read
Big Data Technology Architecture
Nov 27, 2020 · Big Data

Integrating Apache Flink with Data Lakes Using Apache Iceberg: Architecture, Use Cases, and Future Roadmap

This article explains how Apache Flink combines with Apache Iceberg to build unified stream‑batch data lake solutions, covering data lake fundamentals, architectural layers, classic business scenarios, reasons for choosing Iceberg, streaming ingestion design, and upcoming community enhancements.

Apache Flink · Apache Iceberg · Table Format
0 likes · 13 min read
DataFunTalk
Oct 9, 2020 · Big Data

NetEase’s Data Lake Iceberg: Challenges, Core Principles, and Practical Implementation

This article examines the pain points of traditional data warehouse platforms, explains the core concepts and advantages of the Iceberg data lake table format, compares it with Hive Metastore, reviews the current Iceberg community ecosystem, and details NetEase’s practical integration with Hive, Impala, and Flink to improve ETL efficiency and support unified batch‑stream processing.

Data Lake · ETL · Flink
0 likes · 13 min read