
Why Did Multiple HDFS DataNodes Crash? Memory, GC, and Block Overload Explained

This article analyzes a midnight HDFS DataNode failure caused by excessive GC and OOM due to Spark batch jobs, examines how an unexpected surge in block count overloaded default memory settings, and presents concrete remediation steps and optimization recommendations to stabilize the cluster.

Data Thinking Notes

Problem Description

In the early hours, a DataNode (hadoop24) in an HDFS cluster went down, and four other DataNodes (hadoop1, hadoop5, hadoop21, hadoop22) appeared to be deadlocked: their processes were still running, but they had stopped sending heartbeats to the NameNode, which marked them offline.

[Figure: HDFS DataNode status in the WebUI]

Investigation Process

2.1 Direct Causes

(1) Log inspection of the failed DataNode showed frequent GC and OOM events just before the crash.

[Figure: GC and OOM logs]

The initial hypothesis was that a Spark batch job running at night consumed excessive memory, affecting the DataNode. After restarting the DataNode around 10 am when resources were less constrained, the service recovered.

(2) About an hour after restart, the DataNode crashed again, this time without any Spark job running, indicating the issue stemmed from the DataNode’s own memory limits rather than resource sharing.

[Figure: DataNode crash with no Spark job running]

Further inspection of the HDFS Overview showed that the total block count had surged past 5.7 million (up from about 2 million the previous day), with a single DataNode holding more than 900,000 blocks. This far exceeded what the default 1 GB DataNode heap could manage, leading to the crashes.

[Figure: Block count overload in the HDFS Overview]

Resolution: Increase NameNode and DataNode memory to 3 GB and reduce Spark Worker memory allocation on the same nodes, then restart the cluster.
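A sketch of what this change could look like in hadoop-env.sh (variable names follow Hadoop 2.x, where the default WebUI port is 50070; Hadoop 3 renames them HDFS_NAMENODE_OPTS / HDFS_DATANODE_OPTS; the Spark Worker value is illustrative, not from the article):

```shell
# hadoop-env.sh — raise NameNode and DataNode heaps to 3 GB
export HADOOP_NAMENODE_OPTS="-Xms3G -Xmx3G $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Xms3G -Xmx3G $HADOOP_DATANODE_OPTS"

# spark-env.sh — shrink Spark Worker memory on the co-located nodes
export SPARK_WORKER_MEMORY=4g
```

Both HDFS and Spark need a restart for the new heap sizes to take effect.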

[Figure: Cluster status after the memory adjustment]

2.2 Cause of Block Count Explosion

Comparing the original table with the cleaned-data table for the same date, the block count jumped from 68 to 4,736. The Spark cleaning job read eight source tables and wrote each result into the same output table, creating many small files.

[Figure: Spark job generating many small files]

Solution: Union the eight RDDs, then call repartition(32) before writing to HDFS, which normalized the block count per partition.
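Why the union-plus-repartition fix collapses the file count can be shown with a small stand-alone simulation (plain Python, no Spark; the per-table partition counts are hypothetical):

```python
# Each cleaned source table is modeled as a list of output partitions;
# when written separately, every partition becomes at least one HDFS
# file, and every file occupies at least one block.
tables = [list(range(n)) for n in (200, 150, 300, 250, 180, 220, 160, 140)]

# Before the fix: each table's partitions are written out independently.
files_before = sum(len(t) for t in tables)

# After the fix: union all eight tables into one dataset, then
# repartition into 32 partitions, so the job writes exactly 32 files
# regardless of how many partitions the inputs had.
unioned = [row for t in tables for row in t]
files_after = 32  # repartition(32) fixes the output partition count

print(files_before, files_after)  # 1600 files shrink to 32
```

The same arithmetic explains the block-count drop in the article: fewer, larger output files mean fewer blocks for the NameNode and DataNodes to track.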

[Figure: Repartitioned Spark output]

After this change, the total block count returned to normal levels.

[Figure: Normalized block count]

Optimization Recommendations

(1) Add monitoring metrics for HDFS block count.
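One lightweight way to implement such a metric is to poll the NameNode's JMX endpoint, whose FSNamesystem bean exposes a BlocksTotal counter. A minimal sketch (the hadoop1 hostname and the 5,000,000 alert threshold are assumptions for illustration):

```python
import json
import urllib.request

def blocks_total(jmx_payload: dict) -> int:
    """Extract BlocksTotal from a parsed NameNode /jmx response."""
    for bean in jmx_payload.get("beans", []):
        if bean.get("name") == "Hadoop:service=NameNode,name=FSNamesystem":
            return int(bean["BlocksTotal"])
    raise KeyError("FSNamesystem bean not found in /jmx response")

def fetch_blocks_total(namenode: str = "http://hadoop1:50070") -> int:
    """Poll the NameNode JMX endpoint for the cluster block count.

    50070 is the Hadoop 2.x default WebUI port; Hadoop 3 uses 9870.
    """
    url = namenode + "/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem"
    with urllib.request.urlopen(url) as resp:
        return blocks_total(json.load(resp))

# Example alerting rule (threshold chosen for illustration):
# if fetch_blocks_total() > 5_000_000: raise an alert to the on-call channel
```

Running this on a schedule would have surfaced the overnight jump from 2 million to 5.7 million blocks before the DataNodes started failing.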

(2) Align NameNode JVM parameters with the expected number of filesystem objects (files + blocks):

File object count 10,000,000:

-Xms6G -Xmx6G -XX:NewSize=512M -XX:MaxNewSize=512M

File object count 20,000,000:

-Xms12G -Xmx12G -XX:NewSize=1G -XX:MaxNewSize=1G

File object count 50,000,000:

-Xms32G -Xmx32G -XX:NewSize=3G -XX:MaxNewSize=3G

File object count 100,000,000:

-Xms64G -Xmx64G -XX:NewSize=6G -XX:MaxNewSize=6G

(3) Recommended JVM settings for a DataNode based on its average block count:

2,000,000 blocks:

-Xms6G -Xmx6G -XX:NewSize=512M -XX:MaxNewSize=512M

5,000,000 blocks:

-Xms12G -Xmx12G -XX:NewSize=1G -XX:MaxNewSize=1G
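The two sizing tables above can be encoded as a small lookup helper (a sketch only; the tier boundaries and JVM options are taken verbatim from the tables, and counts between tiers round up to the next documented tier):

```python
# (object_count_ceiling, JVM options) pairs from the tables above.
NAMENODE_TIERS = [
    (10_000_000, "-Xms6G -Xmx6G -XX:NewSize=512M -XX:MaxNewSize=512M"),
    (20_000_000, "-Xms12G -Xmx12G -XX:NewSize=1G -XX:MaxNewSize=1G"),
    (50_000_000, "-Xms32G -Xmx32G -XX:NewSize=3G -XX:MaxNewSize=3G"),
    (100_000_000, "-Xms64G -Xmx64G -XX:NewSize=6G -XX:MaxNewSize=6G"),
]

DATANODE_TIERS = [
    (2_000_000, "-Xms6G -Xmx6G -XX:NewSize=512M -XX:MaxNewSize=512M"),
    (5_000_000, "-Xms12G -Xmx12G -XX:NewSize=1G -XX:MaxNewSize=1G"),
]

def jvm_opts(count: int, tiers: list) -> str:
    """Return the JVM options of the smallest tier that covers `count`."""
    for ceiling, opts in tiers:
        if count <= ceiling:
            return opts
    raise ValueError(f"{count} exceeds the largest documented tier")
```

For example, a NameNode tracking 15 million objects falls into the 20-million tier, and a DataNode holding 900,000 blocks (as in this incident) already calls for the 6 GB tier rather than the 1 GB default.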

(4) When multiple tables write to the same result table in Spark, monitor the resulting block count and apply repartition before persisting if necessary.

Tags: Big Data, memory management, garbage collection, HDFS, Spark, DataNode, block overload
Written by Data Thinking Notes

Sharing insights on data architecture, governance, and middle platforms, exploring AI in data, and linking data with business scenarios.