Tagged articles

NameNode

20 articles · Page 1 of 1

Dec 19, 2025 · Big Data

Why Did Our HDFS Standby NameNode Crash? A Deep Dive into Block Recovery Bugs

A recent HDFS outage crashed the Standby and Observer NameNodes after heavy client load triggered block recovery failures. The root cause is a bug in commitBlockSynchronization that produces mismatched block IDs and edit‑log inconsistencies; applying HDFS‑17861 fixes it.

BlockRecovery · Crash · HDFS

0 likes · 15 min read

IT Services Circle

Feb 9, 2025 · Big Data

Understanding HDFS: Architecture, Data Blocks, Fault Tolerance, and High Availability

This article explains how HDFS, the Hadoop Distributed File System, splits large files into blocks, replicates them for fault tolerance, organizes the cluster into NameNode and DataNode components, and provides high‑availability and scalability mechanisms such as standby NameNode and federation, enabling reliable big‑data storage and access.

Big Data · DataNode · Distributed File System

0 likes · 11 min read

Bilibili Tech

Apr 26, 2024 · Big Data

Fine-Grained Lock Optimization for HDFS NameNode to Improve Metadata Read/Write Performance

To overcome the NameNode write bottleneck caused by a single global read/write lock in Bilibili’s massive HDFS deployment, the team introduced hierarchical fine‑grained locking, splitting the lock into Namespace, BlockPool, and per‑INode levels. This yielded up to threefold write throughput gains and a 90% drop in RPC queue time, and shifted the performance bottleneck from lock contention to log synchronization.

Big Data · HDFS · Metadata

0 likes · 15 min read

Data Thinking Notes

Nov 29, 2022 · Big Data

Understanding HDFS High Availability: Roles, Metadata Persistence, and Failover

This article explains the core concepts of HDFS High Availability, detailing primary and standby NameNode roles, failover mechanisms, shared storage systems, metadata persistence via EditLog and FsImage, and the processes for merging and synchronizing data across active and standby nodes.

EditLog · FsImage · HDFS

0 likes · 8 min read

Programmer DD

Apr 14, 2021 · Big Data

Understanding HDFS Architecture: Key Components, Protocols, and Limitations

This article explains HDFS’s master‑slave architecture, detailing the roles of NameNode and DataNode, namespace management, communication protocols, client functions, common configuration parameters, maintenance commands, and the inherent limitations of a single‑NameNode design.

Big Data · Configuration · DataNode

0 likes · 5 min read

Big Data Technology & Architecture

Dec 27, 2020 · Big Data

Understanding and Solving the Small File Problem in Big Data Systems

This article examines the pervasive small‑file issue in big‑data environments, explains its impact on storage and processing performance, and presents a comprehensive set of solutions—including file merging, Hadoop archives, SequenceFiles, HBase, CombineFileInputFormat, and Spark/Flink strategies—to mitigate metadata overhead and improve I/O efficiency.

Flink · Hadoop · NameNode

0 likes · 41 min read

Big Data Technology & Architecture

Jul 10, 2020 · Big Data

Understanding Namenode Metadata Persistence: FsImage, EditLog, and SecondaryNamenode

This article explains how Hadoop's NameNode persists metadata using FsImage and EditLog, describes the checkpoint process during startup, and details the role of the SecondaryNameNode in merging these files for efficient recovery.

EditLog · FsImage · Hadoop

0 likes · 4 min read

Sohu Tech Products

Mar 4, 2020 · Big Data

Introduction to HDFS: Architecture, Components, and Operations

This article provides a comprehensive overview of HDFS, covering its role as a distributed file system, the concepts of blocks, NameNode and DataNode responsibilities, replication, edit logs, snapshots, high‑availability mechanisms, and practical considerations for managing large‑scale data storage.

DataNode · Distributed File System · HDFS

0 likes · 11 min read

DataFunTalk

Jan 2, 2020 · Big Data

ByteDance’s HDFS Architecture and Evolution: Design, Challenges, and Optimizations

This article presents an in‑depth overview of ByteDance’s large‑scale HDFS deployment, describing its unique access layer, metadata and data layers, the evolution through multiple growth stages, and the key architectural improvements such as NNProxy, DanceNN, lock redesign, startup acceleration, and slow‑node mitigation techniques.

Big Data · ByteDance · Federation

0 likes · 18 min read

dbaplus Community

Oct 28, 2019 · Big Data

Quickly Analyze Hadoop NameNode RPC with ELK and Grafana

This guide shows how to reduce excessive NameNode RPC calls caused by frequent HDFS directory listings and demonstrates a complete ELK pipeline—Filebeat, Kafka/Logstash, Elasticsearch, and Kibana—plus Grafana dashboards for real‑time monitoring of Hadoop RPC operations.

ELK · Grafana · Hadoop

0 likes · 9 min read

Beike Product & Technology

Jun 28, 2019 · Big Data

Hadoop NameNode Performance Bottlenecks and Solutions: Federation, ViewFS, FastCopy, Balance & Mover

This article analyzes the performance and stability bottlenecks of a Hadoop 2.7.3 NameNode caused by memory limits, RPC QPS, and long restart times, and presents a comprehensive solution stack—including HDFS federation, ViewFS, FastCopy, and tuned Balance/Mover tools—to improve scalability and reduce downtime.

Balance · FastCopy · Federation

0 likes · 11 min read

Architects' Tech Alliance

Mar 18, 2019 · Big Data

Understanding HDFS Architecture, NameNode HA, and Read/Write Processes

This article explains the concepts and architecture of HDFS, the high‑availability mechanisms of NameNode including quorum‑based shared storage, the detailed read and write workflows of the distributed file system, and discusses its typical use cases and limitations.

Big Data · HA · HDFS

0 likes · 16 min read

Meituan Technology Team

Mar 17, 2017 · Big Data

Optimizing Hadoop NameNode Restart in HA with QJM

A series of JIRA patches and configuration tweaks cut the Hadoop HA NameNode restart time in a 540 MB metadata cluster from roughly 4000 seconds to about 2000 seconds, reducing total downtime to around 35 minutes and greatly improving cluster availability. The changes include shrinking the fsLock scope, increasing checkpoint transaction thresholds, off‑loading quota calculations, simplifying BlockReport handling, and processing mis‑replicated blocks asynchronously.

HA · HDFS · Hadoop

0 likes · 18 min read

Meituan Technology Team

Dec 9, 2016 · Big Data

Memory Usage Analysis of HDFS NameNode Core Data Structures

The article quantitatively breaks down HDFS NameNode memory consumption, showing that the Namespace tree and BlocksMap together dominate heap usage (≈53 GB in large clusters). It provides detailed per‑object size estimates for NetworkTopology, INode and block structures, proposes a simple formula for predicting total heap requirements, and offers tuning recommendations.

Big Data · HDFS · Memory Management

0 likes · 13 min read

Meituan Technology Team

Aug 26, 2016 · Big Data

Memory Architecture and Analysis of Hadoop HDFS NameNode

The article dissects Hadoop 2.4.1’s HDFS NameNode memory architecture, detailing how the Namespace, BlockManager, NetworkTopology, and LeaseManager consume the heap, exposing scaling problems when metadata reaches hundreds of millions of inodes and blocks, and recommending file merging, block‑size tuning, federation, or external KV stores to mitigate heap pressure.

Big Data · HDFS · Memory Management

0 likes · 17 min read

Qunar Tech Salon

May 13, 2016 · Big Data

Overview and Architecture of Hadoop Distributed File System (HDFS)

This article provides a comprehensive overview of Hadoop Distributed File System (HDFS), detailing its design goals, architecture components such as NameNode, DataNode and SecondaryNameNode, data block handling, replication strategies, communication protocols, and the read, write, and delete processes.

Big Data · Data Replication · Distributed File System

0 likes · 18 min read

ITPUB

Mar 19, 2016 · Big Data

Inside HDFS: How NameNode and DataNode Manage Big Data Writes and Reads

This article explains the fundamentals of distributed file systems, focusing on Hadoop’s HDFS architecture, the separation of metadata and data via NameNode and DataNode, and detailed step‑by‑step write and read processes, including replication, fault recovery, and block splitting across nodes.

Big Data · DataNode · Distributed File System

0 likes · 8 min read

Art of Distributed System Architecture Design

Nov 20, 2015 · Big Data

Design and Implementation of Alibaba Cloud's Cross‑Data‑Center Hadoop Cluster

In 2013 Alibaba Cloud faced full rack capacity in a single IDC, prompting the development of a multi‑NameNode, cross‑data‑center Hadoop solution that overcomes NameNode scalability, inter‑site bandwidth limits, data placement, job scheduling, massive data migration, and user transparency challenges.

Cloud · Cross‑Data‑Center · Distributed storage

0 likes · 14 min read

Art of Distributed System Architecture Design

Apr 24, 2015 · Big Data

Design Principles and Architecture of HDFS (Hadoop Distributed File System)

This article explains HDFS's design goals, master/slave architecture, namespace management, block replication strategies, fault tolerance mechanisms, metadata persistence, communication protocols, robustness features, data organization, access methods, and space reclamation, providing a comprehensive overview of Hadoop's distributed storage system.

DataNode · HDFS · NameNode

0 likes · 20 min read

MaGe Linux Operations

Nov 6, 2014 · Big Data

Understanding Hadoop 2.0 High Availability: NFS vs QJM Explained

This article explains Hadoop 2.0's High Availability architecture, comparing the NFS and Quorum Journal Manager methods, detailing their principles, failover mechanisms, and practical tips for reliable NameNode redundancy in big‑data deployments.

Big Data · Hadoop · High Availability

0 likes · 7 min read
