
Understanding Namenode Metadata Persistence: FsImage, EditLog, and SecondaryNamenode

This article explains how Hadoop's Namenode persists metadata using FsImage and EditLog, describes the checkpoint process during startup, and details the role of SecondaryNamenode in merging these files for efficient recovery.

Big Data Technology & Architecture

The Namenode stores the filesystem's metadata, not the file data itself.

So how is its metadata persisted?

FsImage

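The filesystem image file, called FsImage, holds the entire filesystem namespace: the mapping of files to blocks as well as filesystem property information.

When a DataNode starts, it registers with the Namenode and reports the blocks it holds. These block locations are kept only in the Namenode's memory and are rebuilt from the DataNodes' block reports; the FsImage persists which blocks make up each file, not which DataNodes store them.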

EditLog

Every change to the filesystem metadata, such as creating or deleting a file, is appended to the EditLog as a transaction.

Changing a file's replication factor is recorded in the EditLog in the same way.
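The append-only idea can be sketched in a few lines of Python. This is only an illustration: the `EditLog` class, the JSON-lines record layout, and the operation names are hypothetical, and the real EditLog uses Hadoop's own binary format.

```python
import json
import os
import tempfile


class EditLog:
    """Toy append-only journal; not the real HDFS on-disk format."""

    def __init__(self, path):
        self.path = path
        self.next_txid = 1

    def append(self, op, **args):
        # Each change gets an increasing transaction id and is flushed
        # to disk before the caller treats it as durable.
        record = {"txid": self.next_txid, "op": op, **args}
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.next_txid += 1
        return record["txid"]

    def read_all(self):
        # Return every logged record, oldest first.
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


def demo_edit_log():
    # Log three hypothetical namespace changes and read them back.
    with tempfile.TemporaryDirectory() as d:
        log = EditLog(os.path.join(d, "edits"))
        log.append("create", path="/data/a.txt", replication=3)
        log.append("set_replication", path="/data/a.txt", replication=2)
        log.append("delete", path="/data/a.txt")
        return log.read_all()
```

Nothing is ever rewritten in place: the log only grows, which is why it eventually has to be folded into the FsImage.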

Both the EditLog and the FsImage are stored as files in the Namenode's local filesystem.

The Namenode keeps an image of the entire filesystem namespace, together with the block-mapping information, in memory.
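These two in-memory structures can be pictured as follows. This is a sketch with hypothetical names and data: a namespace mapping each path to its block IDs, plus a block-location map filled in from what each DataNode reports.

```python
from collections import defaultdict


def build_block_locations(block_reports):
    """Map each block id to the set of DataNodes that reported holding it."""
    locations = defaultdict(set)
    for datanode, block_ids in block_reports.items():
        for block_id in block_ids:
            locations[block_id].add(datanode)
    return dict(locations)


def locate_file(namespace, locations, path):
    """Resolve a file to (block id, sorted DataNodes) pairs, in block order."""
    return [(b, sorted(locations.get(b, ()))) for b in namespace[path]]


# Hypothetical namespace: file path -> ordered block ids.
namespace = {"/data/a.txt": ["blk_1", "blk_2"]}

# Hypothetical block reports: DataNode -> block ids it stores.
block_reports = {
    "dn1": ["blk_1", "blk_2"],
    "dn2": ["blk_1"],
    "dn3": ["blk_2"],
}

locations = build_block_locations(block_reports)
```

A read request for `/data/a.txt` would then be answered with `locate_file(namespace, locations, "/data/a.txt")`, without touching the disk.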

The metadata is designed to be compact, so a Namenode with 4 GB of memory is enough to hold a very large number of files and directories.
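To get a feel for that claim, here is a back-of-the-envelope estimate. The figure of roughly 150 bytes of heap per namespace object (file, directory, or block) is a commonly quoted rule of thumb and an assumption here, not a number from this article; real usage varies with the Hadoop version and path lengths.

```python
BYTES_PER_OBJECT = 150  # assumed rule of thumb, not a measured value


def max_namespace_objects(heap_bytes, bytes_per_object=BYTES_PER_OBJECT):
    """Rough upper bound on files + directories + blocks that fit in the heap."""
    return heap_bytes // bytes_per_object


four_gb = 4 * 1024**3
estimate = max_namespace_objects(four_gb)  # 28633115 objects
```

Under this assumption 4 GB covers on the order of tens of millions of objects. The estimate ignores everything else the JVM needs, so treat it as an upper bound.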

What happens during Namenode startup?

1. Read the FsImage and EditLog files from disk.

2. Apply every operation recorded in the EditLog to the in-memory FsImage, then write the result to disk as a new FsImage file. This operation is called a checkpoint.

3. Create a new, empty EditLog file, since the old operations are now contained in the new FsImage.
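The three steps above can be sketched as a replay loop. This is a minimal illustration, assuming a dictionary as the image and a list of dictionaries as the log; the operation names and record fields are hypothetical and not Hadoop's actual format.

```python
import copy


def apply_edit(image, edit):
    """Apply one logged operation to an in-memory namespace image."""
    op, path = edit["op"], edit["path"]
    if op == "create":
        image[path] = {"replication": edit["replication"], "blocks": []}
    elif op == "set_replication":
        image[path]["replication"] = edit["replication"]
    elif op == "delete":
        image.pop(path, None)
    else:
        raise ValueError(f"unknown op: {op}")


def checkpoint(fsimage, editlog):
    """Replay the edit log onto a copy of the image.

    Returns the new image and a fresh, empty edit log.
    """
    new_image = copy.deepcopy(fsimage)
    # Replay in transaction order so later edits win.
    for edit in sorted(editlog, key=lambda e: e["txid"]):
        apply_edit(new_image, edit)
    return new_image, []


old_image = {"/a": {"replication": 3, "blocks": ["blk_1"]}}
edits = [
    {"txid": 1, "op": "create", "path": "/b", "replication": 3},
    {"txid": 2, "op": "set_replication", "path": "/a", "replication": 2},
    {"txid": 3, "op": "delete", "path": "/b"},
]
new_image, new_editlog = checkpoint(old_image, edits)
```

The cost of the replay grows with the number of logged edits, which is why the size of the EditLog determines how long a startup takes.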

[Figure: Namenode startup checkpoint]

[Figure: Interaction between Namenode, FsImage, and EditLog during runtime]

SecondaryNamenode

Many people think the SecondaryNamenode is simply a backup for the Namenode that allows quick recovery when the Namenode fails.

In fact, its main role is to periodically merge the FsImage and EditLog files.

Consider a scenario where the Namenode crashes. To recover, it must read the FsImage and replay the whole EditLog on top of it. If the EditLog has grown extremely large, this takes a very long time, and the filesystem stays unavailable until it finishes.

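Therefore, the SecondaryNamenode periodically fetches the FsImage and EditLog from the Namenode, merges them into a new FsImage, and sends it back to replace the old FsImage, while the Namenode starts a new EditLog. This keeps the EditLog short, so a restart has little to replay. Note that the copy held by the SecondaryNamenode is only as recent as the last checkpoint, so it is not a live backup of the Namenode.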
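How often the merge runs is configurable. In Hadoop 2 and later the relevant properties are `dfs.namenode.checkpoint.period` (seconds between checkpoints) and `dfs.namenode.checkpoint.txns` (number of uncheckpointed transactions that forces one). The defaults used below, 3600 seconds and 1,000,000 transactions, are those of recent `hdfs-default.xml` files and should be verified for your version. The function itself is a hypothetical sketch of the decision, not Hadoop's code.

```python
CHECKPOINT_PERIOD_SECS = 3600  # assumed default of dfs.namenode.checkpoint.period
CHECKPOINT_TXNS = 1_000_000    # assumed default of dfs.namenode.checkpoint.txns


def should_checkpoint(secs_since_last, uncheckpointed_txns,
                      period=CHECKPOINT_PERIOD_SECS, txns=CHECKPOINT_TXNS):
    """Checkpoint when either the time limit or the transaction limit is hit."""
    return secs_since_last >= period or uncheckpointed_txns >= txns
```

With these values a quiet cluster is checkpointed about once an hour, while a busy one is checkpointed as soon as a million edits have accumulated.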


Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
