Tagged articles
DataFunSummit
Jun 10, 2023 · Big Data

Performance Optimization of Iceberg Real‑time Data Warehouse and Arctic Enhancements

This article covers Iceberg merge‑on‑read (MOR) principles, Arctic‑based performance optimizations, benchmark evaluations using CH‑benchmark, and future roadmap items, and shows how file‑type strategies, self‑optimizing mechanisms, and task balancing improve query efficiency in a real‑time data lake.

Arctic · Data Lake · Iceberg
14 min read
ITPUB
Jan 26, 2023 · Big Data

How NetEase’s Arctic Unifies Streaming and Batch with Iceberg for Real‑Time Lakehouse

This article explains the challenges of a Lambda‑architecture data pipeline, introduces NetEase’s Arctic lakehouse built on Apache Iceberg, details its table‑store design, optimization cycles, consistency mechanisms, real‑time features, practical use cases, and future roadmap, highlighting its advantages over similar solutions.

Arctic · Data Integration · Flink
14 min read
DataFunTalk
Dec 8, 2022 · Big Data

Arctic: NetEase’s Real-Time Lakehouse System Built on Apache Iceberg

This article introduces NetEase’s Arctic, a real‑time lakehouse system built on Apache Iceberg that unifies streaming and batch processing. It explains the challenges of the Lambda architecture, details Arctic’s features such as change/base stores, the hidden queue, and transaction handling, and shares internal use cases and the future roadmap.

Apache Iceberg · Arctic · Data Lake
12 min read
NetEase Cloud Music Tech Team
Oct 26, 2022 · Big Data

Arctic: NetEase's Streaming Lakehouse Service and Hive-Based Stream-Batch Integration Practice

Arctic, NetEase’s streaming lakehouse built on Apache Iceberg, unifies streaming and batch workloads with millisecond‑level latency, Hive compatibility, and built‑in message‑queue support, delivering CDC, upserts and OLAP without a Lambda architecture, as demonstrated by real‑time processing of 2 PB of Hive data for Cloud Music.

Apache Iceberg · Arctic · Big Data Architecture
15 min read
DataFunTalk
Feb 12, 2022 · Big Data

NetEase Internal Data Lake Project Arctic: Architecture, Requirements, and Future Roadmap

This article introduces NetEase's internally incubated data lake project Arctic, explains the concept of data lakes, outlines NetEase's specific requirements for a unified streaming‑batch platform, details Arctic's core architecture, storage strategy, data‑merge mechanisms, current achievements, and future development plans.

Apache Iceberg · Arctic · Big Data
10 min read
DataFunTalk
May 16, 2021 · Big Data

Efficient Data Update/Delete and Real‑time Processing in the Arctic Lakehouse System

This article explains the evolution from traditional data warehouses to modern lakehouse architectures, introduces the Arctic system’s dynamic hash tree for fast update/delete, describes file splitting with sequence/offset ordering, and compares copy‑on‑write versus merge‑on‑read techniques for achieving low‑latency analytics.

Arctic · Big Data · Copy-on-Write
12 min read