Big Data 7 min read

How HDFS Achieves Low Cost, High Reliability, and Fault Tolerance

This article explains how HDFS, inspired by Google’s GFS, provides a low‑cost, highly reliable, fault‑tolerant, and high‑performance distributed file system for big‑data workloads by using replication, standby NameNodes, block storage, rack awareness, and compute‑close‑to‑data strategies.

MaGe Linux Operations

Nov 7, 2016

How HDFS Achieves Low Cost, High Reliability, and Fault Tolerance

Introduction

Derived from Google’s GFS (Google File System) paper, HDFS (Hadoop Distributed File System) is a clone of GFS that enables multiple machines to share storage space, allowing users to access files over the network as if they were on a local disk.

Key Features

HDFS offers a low‑cost, high‑reliability, high‑fault‑tolerance, and high‑performance distributed file system.

Low Cost : Achieved by scaling horizontally with many inexpensive machines rather than purchasing expensive servers.

High Reliability :

Primary‑Standby NameNode architecture ensures that if the primary NameNode fails, the standby automatically takes over.

Replication mechanism stores multiple copies of each block on different machines; if a node fails, its data is replicated to other nodes.

Load balancing via the balancer tool moves data to reduce hotspot pressure.

Rack awareness places replicas across different racks to improve fault tolerance.

High Fault Tolerance : MapReduce tasks are monitored by JobTracker; if a TaskTracker fails, the job is reassigned to another node.

High Performance : Parallel processing of large tasks across many machines dramatically speeds up computation compared to single‑node serial execution.

Architecture

An HDFS cluster consists of a central NameNode and multiple DataNodes. The NameNode manages metadata and coordinates the cluster; DataNodes store the actual data blocks.

Data Storage Model

Files are split into fixed‑size blocks (configurable during setup). Large files are divided into multiple blocks, which are stored on different DataNodes based on load and rack awareness. Metadata and actual data are stored separately; the NameNode holds metadata while DataNodes hold the block data.

Write‑Once, Read‑Many Model

HDFS files are write‑once; after a file is closed it cannot be modified, only appended. This model ensures data consistency and simplifies concurrency control, providing high throughput for read‑heavy workloads.

Moving Compute vs. Moving Data

In distributed systems, bringing computation close to the data (e.g., MapReduce, Spark) is more efficient than transferring large datasets across the network. DataNodes perform computation on local blocks and return results to the NameNode for aggregation, reducing network traffic and improving performance.

Conclusion

HDFS, as the core storage component of Hadoop, delivers a robust, cost‑effective solution for big‑data storage, leveraging replication, standby NameNodes, block distribution, and compute‑close‑to‑data principles to ensure reliability and performance.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Big Data fault tolerance Data Replication Distributed File System HDFS

Written by

MaGe Linux Operations

Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.