Understanding Namenode Metadata Persistence: FsImage, EditLog, and SecondaryNamenode
This article explains how Hadoop's Namenode persists its metadata using the FsImage and EditLog, describes the checkpoint performed at startup, and details how the SecondaryNamenode merges these files so that recovery stays fast.
We all know that the Namenode is used to store metadata, not the actual data.
So how is its metadata persisted?
FsImage
The filesystem image file, called FsImage, holds the filesystem namespace: the mapping of files to blocks, along with filesystem properties.
When a DataNode starts, it registers with the Namenode and reports the blocks it holds. The Namenode keeps these block locations in memory only; they are not written to the FsImage and are rebuilt from DataNode block reports after every restart.
EditLog
Every change to the filesystem, such as adding or deleting files, is written to the EditLog.
Changing a file's replication factor is likewise recorded in the EditLog.
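The append-only behaviour of the EditLog can be sketched in a few lines. This is only a toy model: the `EditLog` class, the JSON-lines layout, and the operation names (`create`, `set_replication`, `delete`) are all hypothetical and do not match the real HDFS on-disk format. The point it illustrates is that every namespace change becomes one numbered record appended to a file.

```python
import json
import os
import tempfile


class EditLog:
    """Toy append-only journal; NOT the real HDFS edit log format."""

    def __init__(self, path):
        self.path = path
        self.next_txid = 1

    def append(self, op, **args):
        """Append one operation as a numbered record and return its id."""
        record = {"txid": self.next_txid, "op": op, **args}
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the record to disk before returning
        self.next_txid += 1
        return record["txid"]

    def read_all(self):
        """Return every record in the order it was written."""
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


workdir = tempfile.mkdtemp()
log = EditLog(os.path.join(workdir, "edits"))
log.append("create", path="/data/a.txt", replication=3)
log.append("set_replication", path="/data/a.txt", replication=2)
log.append("delete", path="/data/a.txt")
records = log.read_all()
```

Because records are only ever appended, the log grows with every change until something folds it back into the FsImage.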
Both the EditLog and the FsImage are stored as files in the Namenode's local filesystem.
The Namenode keeps in memory the entire filesystem image and block‑mapping information.
This metadata is designed to be compact, so a Namenode with 4 GB of RAM can support a very large number of files and directories.
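To get a rough feel for what 4 GB can hold, here is a back-of-the-envelope calculation. The figure of about 150 bytes per namespace object (file, directory, or block) is an assumption: it is a commonly quoted rule of thumb, not a number from this article, and real usage depends on the Hadoop version, path lengths, and JVM overhead. The helper name `max_namespace_objects` is hypothetical.

```python
# ASSUMPTION: ~150 bytes of heap per namespace object (file, directory, block).
# This is a rough rule of thumb, not a measured or guaranteed figure.
BYTES_PER_OBJECT = 150


def max_namespace_objects(heap_bytes, bytes_per_object=BYTES_PER_OBJECT):
    """Upper bound on the number of namespace objects that fit in the heap."""
    return heap_bytes // bytes_per_object


heap = 4 * 1024 ** 3                 # 4 GB of RAM, as in the text
objects = max_namespace_objects(heap)

# If every file occupied exactly one block, each file would cost two objects
# (one for the file, one for its block).
files_one_block_each = objects // 2

print(objects, files_one_block_each)
```

The whole heap is never available for metadata alone, so treat the result as an optimistic upper bound that only shows the order of magnitude.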
What happens during Namenode startup?
1. Read FsImage and EditLog files from disk.
2. Apply all operations recorded in the EditLog to the in-memory copy of the FsImage, then write the result to disk as a new FsImage file. This operation is called a checkpoint.
3. Create a new empty EditLog file.
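The three startup steps above can be sketched as follows. This is a minimal model under stated assumptions: the FsImage is a JSON dictionary, the EditLog is a JSON-lines file, and the function names `apply_edit` and `startup_checkpoint` are hypothetical. The real Namenode uses its own binary formats and far more operation types.

```python
import json
import os
import tempfile


def apply_edit(namespace, edit):
    """Apply one logged operation to the in-memory namespace (path -> attrs)."""
    op = edit["op"]
    if op == "create":
        namespace[edit["path"]] = {"replication": edit["replication"]}
    elif op == "set_replication":
        namespace[edit["path"]]["replication"] = edit["replication"]
    elif op == "delete":
        namespace.pop(edit["path"], None)
    else:
        raise ValueError(f"unknown op: {op}")


def startup_checkpoint(fsimage_path, edits_path):
    # Step 1: read the FsImage and the EditLog from disk.
    with open(fsimage_path, encoding="utf-8") as f:
        namespace = json.load(f)
    with open(edits_path, encoding="utf-8") as f:
        edits = [json.loads(line) for line in f if line.strip()]

    # Step 2: replay every edit, then write the merged result as a new FsImage.
    for edit in edits:
        apply_edit(namespace, edit)
    tmp_path = fsimage_path + ".ckpt"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(namespace, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, fsimage_path)  # swap in the new image in one step

    # Step 3: start over with a new, empty EditLog.
    open(edits_path, "w", encoding="utf-8").close()
    return namespace


# Usage: build a tiny FsImage and EditLog, then run the checkpoint.
workdir = tempfile.mkdtemp()
fsimage = os.path.join(workdir, "fsimage")
edits = os.path.join(workdir, "edits")
with open(fsimage, "w", encoding="utf-8") as f:
    json.dump({"/old.txt": {"replication": 3}}, f)
with open(edits, "w", encoding="utf-8") as f:
    f.write(json.dumps({"op": "create", "path": "/new.txt", "replication": 3}) + "\n")
    f.write(json.dumps({"op": "set_replication", "path": "/new.txt", "replication": 2}) + "\n")
    f.write(json.dumps({"op": "delete", "path": "/old.txt"}) + "\n")

namespace = startup_checkpoint(fsimage, edits)
```

Writing the new image to a temporary file and renaming it afterwards is a design choice of this sketch: a crash in the middle of the checkpoint leaves the previous FsImage untouched.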
[Figure: Namenode startup checkpoint]
[Figure: Interaction between Namenode, FsImage, and EditLog during runtime]
SecondaryNamenode
Many people think the role of SecondaryNamenode is simply a backup for the Namenode, allowing quick recovery when the Namenode fails.
In fact, its main job is to periodically merge the FsImage and EditLog files; it is not a hot standby that takes over when the Namenode fails.
Consider a scenario where the Namenode crashes. To recover, it must read the FsImage and replay the EditLog on top of it. If the EditLog has grown very large, this replay can take a long time, and the filesystem is unavailable until it finishes.
To avoid this, the SecondaryNamenode periodically fetches the FsImage and EditLog from the Namenode, merges them into a new FsImage, and sends it back to replace the old one, while the Namenode continues with a new EditLog. The EditLog therefore stays short and a restart is fast. Note that the copy held by the SecondaryNamenode is only as recent as the last checkpoint, so any edits made after that point are not in it.
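The periodic merge can be sketched as a trigger plus a merge step. The two thresholds correspond to the HDFS settings `dfs.namenode.checkpoint.period` and `dfs.namenode.checkpoint.txns`; the values used here are, to the best of my knowledge, the defaults in recent Hadoop releases, so check `hdfs-default.xml` for your version. Everything else (`ToyNamenode`, `should_checkpoint`, `secondary_checkpoint`, and the set-based FsImage) is a hypothetical simplification, not the real implementation.

```python
# Believed defaults; verify against hdfs-default.xml for your Hadoop version.
CHECKPOINT_PERIOD_SECONDS = 3600   # dfs.namenode.checkpoint.period
CHECKPOINT_TXNS = 1_000_000        # dfs.namenode.checkpoint.txns


class ToyNamenode:
    """Hypothetical stand-in for the Namenode's persisted state."""

    def __init__(self):
        self.fsimage = set()   # paths present at the last checkpoint
        self.edits = []        # (op, path) tuples logged since then

    def log(self, op, path):
        self.edits.append((op, path))

    def roll_edits(self):
        """Hand over the current edits and start a new, empty EditLog."""
        finished, self.edits = self.edits, []
        return finished


def should_checkpoint(seconds_since_last, pending_txns,
                      period=CHECKPOINT_PERIOD_SECONDS, txns=CHECKPOINT_TXNS):
    """Checkpoint when either the time or the transaction threshold is hit."""
    return seconds_since_last >= period or pending_txns >= txns


def secondary_checkpoint(namenode):
    """Merge FsImage and EditLog on a copy, then install the result."""
    edits = namenode.roll_edits()
    merged = set(namenode.fsimage)   # work on a copy, off to the side
    for op, path in edits:
        if op == "create":
            merged.add(path)
        elif op == "delete":
            merged.discard(path)
    namenode.fsimage = merged        # replaces the old FsImage
    return len(edits)


nn = ToyNamenode()
nn.log("create", "/a")
nn.log("create", "/b")
nn.log("delete", "/a")

merged_count = 0
if should_checkpoint(seconds_since_last=4000, pending_txns=len(nn.edits)):
    merged_count = secondary_checkpoint(nn)
```

After the call, the Namenode's EditLog is empty again, which is what keeps the replay at the next startup short.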
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
