Apache Hudi Overview: Design Principles, Table Architecture, and Read/Write Processes

This article provides a comprehensive overview of Apache Hudi, covering its storage reliance on HDFS, core design principles, table architecture, timeline management, file and index structures, as well as detailed read and write workflows for both Copy‑On‑Write and Merge‑On‑Read table types.

Big Data Technology & Architecture

Feb 8, 2022

Introduction

Apache Hudi stores its tables on HDFS or another Hadoop‑compatible file system, including cloud object stores, and scales to massive data sets. It provides two primitives, record‑level update/delete and change streams, which together unify streaming and batch storage.

Design Principles

Streaming read/write : Hudi borrows database design principles. An index maps record keys to file locations, and per‑commit metadata is tracked so that changes can be consumed as incremental streams.

Self‑management : Hudi balances write freshness and query performance with three query types (real‑time snapshot, incremental stream, and pure columnar view) and features like automatic parallelism optimization and rollback.

Everything is a log : Hudi uses an append‑only, log‑structured storage model suitable for cloud environments.

Key‑value data model : Tables are modeled as key‑value pairs, where each record has a unique key and optional partition path.

Hudi Table Design

Three main components of a Hudi table:

Ordered timeline metadata (similar to a transaction log).

Layered data files.

Indexes (multiple implementations).

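Together these components provide the two key capabilities of a Hudi table: upserts, and multi‑version concurrency control (MVCC), which lets readers see a consistent snapshot while writers add new versions.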

Timeline

The timeline is an ordered log of every action performed on the table. Each action is represented by a HoodieInstant that records the action type, the instant time, and the state. Action types include COMMIT, CLEAN, DELTA_COMMIT, COMPACTION, ROLLBACK, and SAVEPOINT; each instant moves through the states REQUESTED, INFLIGHT, and COMPLETED.
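The timeline model can be illustrated with a minimal Python sketch. It is not Hudi's implementation: the `Instant` class, the enum values, and the `yyyyMMddHHmmss` timestamp format are assumptions made for illustration. It only shows the idea that readers consider completed instants, ordered by instant time.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    COMMIT = "commit"
    DELTA_COMMIT = "deltacommit"
    CLEAN = "clean"
    COMPACTION = "compaction"
    ROLLBACK = "rollback"
    SAVEPOINT = "savepoint"


class State(Enum):
    REQUESTED = "requested"
    INFLIGHT = "inflight"
    COMPLETED = "completed"


@dataclass(frozen=True)
class Instant:
    """Illustrative stand-in for a HoodieInstant (hypothetical class)."""
    time: str       # sortable timestamp string; format is an assumption
    action: Action
    state: State


def completed_instants(timeline):
    """Return only completed instants, ordered by instant time."""
    done = [i for i in timeline if i.state is State.COMPLETED]
    return sorted(done, key=lambda i: i.time)


timeline = [
    Instant("20220208101500", Action.COMMIT, State.COMPLETED),
    Instant("20220208103000", Action.DELTA_COMMIT, State.INFLIGHT),
    Instant("20220208100000", Action.CLEAN, State.COMPLETED),
]
visible = completed_instants(timeline)
```

In this sketch the in-flight delta commit stays invisible to readers until its state becomes COMPLETED.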

Data Files

Hudi organizes a table as a directory hierarchy on the DFS. Each partition contains file groups, each identified by a FileID. A file group holds one or more file slices, and each file slice consists of a base Parquet file plus optional log files. Base files store a Bloom filter of their record keys; log files hold Avro‑encoded blocks.
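The partition, file group, and file slice hierarchy can be sketched as plain data classes. The class names, field names, and file names below are hypothetical and chosen for illustration; they do not mirror Hudi's Java classes or its on-disk naming scheme.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FileSlice:
    base_instant: str                 # instant time at which the base file was written
    base_file: str                    # stands in for the base Parquet file
    log_files: List[str] = field(default_factory=list)  # optional log files


@dataclass
class FileGroup:
    file_id: str
    slices: Dict[str, FileSlice] = field(default_factory=dict)  # keyed by base instant

    def add_slice(self, file_slice: FileSlice) -> None:
        self.slices[file_slice.base_instant] = file_slice

    def latest_slice(self) -> Optional[FileSlice]:
        """The slice with the greatest base instant, or None if the group is empty."""
        if not self.slices:
            return None
        return self.slices[max(self.slices)]


group = FileGroup("fg-1")
group.add_slice(FileSlice("001", "fg-1_001.parquet"))
group.add_slice(FileSlice("002", "fg-1_002.parquet", ["fg-1_002.log.1"]))
latest = group.latest_slice()
```

Keeping older slices in the group is what allows a reader to keep using a previous version while a writer adds a new one.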

Index Design

Hudi provides three index implementations (HBaseIndex, HoodieBloomIndex, InMemoryHashIndex) to map record keys to file IDs, supporting global and non‑global indexes.
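The difference between a global and a non‑global index can be sketched with an in‑memory dictionary. This is a simplified illustration, not the InMemoryHashIndex from Hudi: the `InMemoryIndex` class and its methods are hypothetical, and the sketch assumes that a global index treats record keys as unique across the whole table while a non‑global index treats them as unique only within a partition.

```python
from typing import Dict, Optional, Tuple


class InMemoryIndex:
    """Maps record keys to file IDs (hypothetical illustration)."""

    def __init__(self, is_global: bool):
        self.is_global = is_global
        self._locations: Dict[Tuple[str, ...], str] = {}

    def _slot(self, record_key: str, partition: str) -> Tuple[str, ...]:
        # A global index ignores the partition; a non-global one includes it.
        return (record_key,) if self.is_global else (partition, record_key)

    def put(self, record_key: str, partition: str, file_id: str) -> None:
        self._locations[self._slot(record_key, partition)] = file_id

    def lookup(self, record_key: str, partition: str) -> Optional[str]:
        """Return the file ID holding the key, or None if the key is new."""
        return self._locations.get(self._slot(record_key, partition))


local_index = InMemoryIndex(is_global=False)
local_index.put("u1", "2022-02-08", "fg-1")

global_index = InMemoryIndex(is_global=True)
global_index.put("u1", "2022-02-08", "fg-1")
```

A lookup that returns None marks the record as an insert; any other result marks it as an update to the returned file group.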

Table Types

Copy‑On‑Write (COW)

Every write produces new base Parquet files. An update rewrites the entire base file of the affected file slice as a new version, while inserts are packed into existing small files until they reach a configured size.
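A rough Python sketch of these two COW behaviours follows. It is a toy model under stated assumptions: file versions are lists of dictionaries rather than Parquet files, and the size limit is counted in records, whereas the limit described above is a configured file size. The function names are hypothetical.

```python
from typing import Dict, List

Record = Dict[str, object]


def cow_update(versions: Dict[str, List[Record]], instant: str,
               updates: List[Record]) -> None:
    """Write a new file version at `instant`; earlier versions stay untouched."""
    latest = versions[max(versions)] if versions else []
    merged = {r["key"]: r for r in latest}
    for r in updates:
        merged[r["key"]] = r          # an update replaces the record with the same key
    versions[instant] = list(merged.values())


def pack_inserts(file_counts: Dict[str, int], num_inserts: int,
                 max_records_per_file: int) -> Dict[str, int]:
    """Assign inserts to existing small files first, then to new files."""
    plan: Dict[str, int] = {}
    remaining = num_inserts
    for file_id, count in sorted(file_counts.items()):
        room = max_records_per_file - count
        if remaining > 0 and room > 0:
            take = min(room, remaining)
            plan[file_id] = take
            remaining -= take
    new_index = 0
    while remaining > 0:
        take = min(max_records_per_file, remaining)
        plan[f"new-{new_index}"] = take   # hypothetical ID for a new file group
        remaining -= take
        new_index += 1
    return plan


versions = {"001": [{"key": "a", "v": 1}, {"key": "b", "v": 2}]}
cow_update(versions, "002", [{"key": "b", "v": 20}])
plan = pack_inserts({"fg-1": 80, "fg-2": 100}, 150, 100)
```

The sketch makes the cost model visible: a single updated record forces a rewrite of every record in the file, which is why COW favours read performance over write latency.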

Merge‑On‑Read (MOR)

Writes first append to log files; background compaction merges logs with base files. MOR supports snapshot, incremental, and read‑optimized query modes.
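The MOR write and read paths can be sketched as follows. This is a simplified model, not Hudi code: a dictionary stands in for the base Parquet file, a list stands in for the log files, and the `MorFileGroup` class is hypothetical. In particular, this sketch compacts in place, whereas a real compaction would be expected to produce a new file slice.

```python
class MorFileGroup:
    """Toy model of a Merge-On-Read file group (hypothetical illustration)."""

    def __init__(self, base):
        self.base = dict(base)   # key -> value; stands in for the base file
        self.log = []            # appended (key, value) pairs; stands in for log files

    def append(self, records):
        """Writes only append to the log; the base file is not rewritten."""
        self.log.extend(records)

    def read_optimized(self):
        """Read-optimized query: base file only, so recent updates may be missing."""
        return dict(self.base)

    def snapshot(self):
        """Snapshot query: merge the log over the base file at read time."""
        merged = dict(self.base)
        for key, value in self.log:
            merged[key] = value
        return merged

    def compact(self):
        """Compaction: fold the log into the base and start with an empty log."""
        self.base = self.snapshot()
        self.log = []


fg = MorFileGroup({"a": 1, "b": 2})
fg.append([("b", 20), ("c", 3)])
before_compaction = fg.read_optimized()
merged_view = fg.snapshot()
fg.compact()
after_compaction = fg.read_optimized()
```

The trade-off is the mirror image of COW: writes are cheap appends, and the merge cost moves to snapshot reads until compaction runs.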

Read and Write Processes

Read

Snapshot read : Reads the latest file slice of each file group (COW reads Parquet, MOR reads Parquet + log).

Incremental read : Consumes commits between a start and end instant.

Streaming read : Hudi's Flink integration provides a continuous, real‑time incremental subscription to new commits.
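The incremental read described above can be sketched as a filter over completed commits. The function name and the data layout are hypothetical, and the boundary convention used here (start exclusive, end inclusive) is an assumption that should be checked against the documentation of the Hudi version in use.

```python
def incremental_read(commits, start, end=None):
    """Return records from commits with start < instant <= end.

    `commits` is a list of (instant_time, records) pairs for completed commits.
    If `end` is None, everything after `start` is returned.
    """
    out = []
    for instant, records in sorted(commits, key=lambda c: c[0]):
        if instant <= start:
            continue          # already consumed by an earlier read
        if end is not None and instant > end:
            break             # past the requested window
        out.extend(records)
    return out


commits = [
    ("003", ["r3"]),
    ("001", ["r1"]),
    ("002", ["r2a", "r2b"]),
]
since_001 = incremental_read(commits, "001")
window = incremental_read(commits, "001", "002")
```

A consumer that remembers the last instant it processed and passes it as `start` on the next call gets each change exactly once in this model, which is the basis of a streaming subscription.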

Write

Supported operations: UPSERT (default, with indexing), INSERT (no index, higher throughput), BULK_INSERT (sorted, for large initial loads).

UPSERT flow for COW : De‑duplicate the incoming records, use the index to separate updates from inserts, rewrite the base files that contain updated keys, and place inserts in small files or new file groups.

UPSERT flow for MOR : Indexing works as in COW, but updates are appended to log files. Inserts go to base files or log files, depending on whether the index can locate records in log files.

INSERT flows for both COW and MOR skip indexing and write directly to base files or log files.
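As a usage illustration, the write operation and table type are typically selected through options passed to the Spark datasource. The helper below only builds the option dictionary, so it runs without Spark. The helper itself and its default field names (`id`, `dt`, `ts`) are hypothetical; the option keys are the Hudi Spark datasource settings as I recall them for the 0.x line and should be verified against the configuration reference for your version.

```python
ALLOWED_OPERATIONS = {"upsert", "insert", "bulk_insert"}
ALLOWED_TABLE_TYPES = {"COPY_ON_WRITE", "MERGE_ON_READ"}


def hudi_write_options(table_name, operation="upsert", table_type="COPY_ON_WRITE",
                       record_key="id", partition_path="dt", precombine="ts"):
    """Build an options dict for a Hudi write (hypothetical helper)."""
    if operation not in ALLOWED_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    if table_type not in ALLOWED_TABLE_TYPES:
        raise ValueError(f"unsupported table type: {table_type}")
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.operation": operation,
        "hoodie.datasource.write.table.type": table_type,
        # Field that uniquely identifies a record.
        "hoodie.datasource.write.recordkey.field": record_key,
        # Field that determines the partition path.
        "hoodie.datasource.write.partitionpath.field": partition_path,
        # Field used to pick the winner when de-duplicating records with the same key.
        "hoodie.datasource.write.precombine.field": precombine,
    }


upsert_opts = hudi_write_options("trips")
bulk_opts = hudi_write_options("trips", operation="bulk_insert",
                               table_type="MERGE_ON_READ")

# With a SparkSession and the Hudi bundle on the classpath, the dict would be used as:
#   df.write.format("hudi").options(**upsert_opts).mode("append").save(base_path)
```

A reasonable pattern, given the operations described above, is to use `bulk_insert` for the initial load and switch to `upsert` for ongoing changes.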

Conclusion

The article summarizes personal research on Hudi up to version 0.11, noting that details may evolve with the community.


Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
