
Hadoop HDFS Storage Optimization, Erasure Coding, Heterogeneous Storage, and Cluster Tuning Guide

This article provides a comprehensive guide to optimizing Hadoop HDFS storage through erasure coding and heterogeneous storage policies, explains fault‑tolerance techniques such as safe mode and slow‑disk monitoring, and shares practical MapReduce performance tuning and enterprise‑level configuration examples for large‑scale clusters.

Big Data Technology & Architecture

5 HDFS – Storage Optimization

Erasure coding replaces the default three-replica redundancy with mathematically computed parity data, cutting raw storage use by up to about 50%. The built-in policies (RS-3-2-1024k, RS-6-3-1024k, RS-10-4-1024k, XOR-2-1-1024k) differ in how many data units and parity units make up each block group.

5.1 Erasure Coding

5.1.1 Principle

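HDFS normally stores three replicas of each block, so every byte of user data costs three bytes of raw disk (200% overhead). Hadoop 3.x introduces erasure coding, which stores parity units instead of full copies. With the default RS-6-3-1024k policy, 6 data units plus 3 parity units cost 1.5 bytes of disk per byte of data (50% overhead) and still survive the loss of any 3 units, which roughly halves storage consumption. The trade-off is extra CPU and network work for encoding and reconstruction.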
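The arithmetic behind the "halve storage" claim can be checked with a short calculation. This is a minimal sketch, not part of Hadoop: the function name `storage_cost` is made up for illustration, and three-way replication is modeled as 1 data unit plus 2 extra copies.

```python
def storage_cost(data_units, parity_units):
    """Return (raw bytes stored per byte of user data, lost units tolerated)."""
    total_units = data_units + parity_units
    return total_units / data_units, parity_units


# (data units, parity units) for each scheme; replication is 1 original + 2 copies.
schemes = {
    "3x replication": (1, 2),
    "RS-3-2": (3, 2),
    "RS-6-3": (6, 3),
    "RS-10-4": (10, 4),
    "XOR-2-1": (2, 1),
}

for name, (data, parity) in schemes.items():
    cost, tolerated = storage_cost(data, parity)
    print(f"{name:15s} stores {cost:.2f}x the data, survives {tolerated} lost unit(s)")
```

Comparing the `3x replication` row with the `RS-6-3` row shows the drop from 3.0x to 1.5x raw storage while tolerating one more lost unit.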

5.1.2 Commands

[Tom@hadoop102 hadoop-3.1.3]$ hdfs ec
Usage: bin/hdfs ec [COMMAND]
          [-listPolicies]
          [-addPolicies -policyFile <file>]
          [-getPolicy -path <path>]
          [-removePolicy -policy <policy>]
          [-setPolicy -path <path> [-policy <policy>] [-replicate]]
          [-unsetPolicy -path <path>]
          [-listCodecs]
          [-enablePolicy -policy <policy>]
          [-disablePolicy -policy <policy>]
          [-help <command-name>]

List supported policies:

[Tom@hadoop102 hadoop-3.1.3]$ hdfs ec -listPolicies
Erasure Coding Policies:
ErasureCodingPolicy=[Name=RS-10-4-1024k, ...], State=DISABLED
ErasureCodingPolicy=[Name=RS-3-2-1024k, ...], State=DISABLED
ErasureCodingPolicy=[Name=RS-6-3-1024k, ...], State=ENABLED
ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, ...], State=DISABLED
ErasureCodingPolicy=[Name=XOR-2-1-1024k, ...], State=DISABLED

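Example: enable RS-3-2-1024k and apply it to the /input directory. This policy writes 3 data units and 2 parity units per block group, so the cluster should have at least 5 DataNodes.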

[Tom@hadoop102 hadoop-3.1.3]$ hdfs ec -enablePolicy -policy RS-3-2-1024k
Erasure coding policy RS-3-2-1024k is enabled
[Tom@hadoop102 hadoop-3.1.3]$ hdfs dfs -mkdir /input
[Tom@hadoop102 hadoop-3.1.3]$ hdfs ec -setPolicy -path /input -policy RS-3-2-1024k
Set RS-3-2-1024k erasure coding policy on /input
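When scripting around these commands it can help to turn the -listPolicies output into structured data. The sketch below parses the sample output shown earlier; the helper name `parse_policies` and the returned dictionary keys are illustrative choices, and the regular expressions assume the `Name=..., State=...` layout seen in that listing.

```python
import re

# Sample copied from the -listPolicies output above (details elided with "...").
SAMPLE = """\
Erasure Coding Policies:
ErasureCodingPolicy=[Name=RS-10-4-1024k, ...], State=DISABLED
ErasureCodingPolicy=[Name=RS-3-2-1024k, ...], State=DISABLED
ErasureCodingPolicy=[Name=RS-6-3-1024k, ...], State=ENABLED
ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, ...], State=DISABLED
ErasureCodingPolicy=[Name=XOR-2-1-1024k, ...], State=DISABLED
"""

LINE = re.compile(r"Name=(?P<name>[\w-]+),.*State=(?P<state>\w+)")
# Policy names follow <codec>-<data units>-<parity units>-<cell size>.
NAME = re.compile(r"^(?P<codec>.+)-(?P<data>\d+)-(?P<parity>\d+)-(?P<cell>\d+k)$")


def parse_policies(text):
    """Return one dict per policy line found in the listing."""
    policies = []
    for line in text.splitlines():
        line_match = LINE.search(line)
        if not line_match:
            continue  # skip the header line and anything unexpected
        name_match = NAME.match(line_match["name"])
        if not name_match:
            continue  # name does not follow the expected pattern
        data = int(name_match["data"])
        parity = int(name_match["parity"])
        policies.append({
            "name": line_match["name"],
            "codec": name_match["codec"],
            "data_units": data,
            "parity_units": parity,
            "cell_size": name_match["cell"],
            "min_datanodes": data + parity,  # one DataNode per unit in a group
            "enabled": line_match["state"] == "ENABLED",
        })
    return policies


for policy in parse_policies(SAMPLE):
    print(policy["name"], policy["min_datanodes"], policy["enabled"])
```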

5.2 Heterogeneous Storage (Hot‑Cold Data Separation)

Different storage types (RAM_DISK, SSD, DISK, ARCHIVE) can be assigned to blocks to balance performance and cost.

5.2.1 Storage Policy Commands

[Tom@hadoop102 ~]$ hdfs storagepolicies -listPolicies
Block Storage Policies:
    BlockStoragePolicy{PROVIDED:1, storageTypes=[PROVIDED, DISK]}
    BlockStoragePolicy{COLD:2, storageTypes=[ARCHIVE]}
    BlockStoragePolicy{WARM:5, storageTypes=[DISK, ARCHIVE]}
    BlockStoragePolicy{HOT:7, storageTypes=[DISK]}
    BlockStoragePolicy{ONE_SSD:10, storageTypes=[SSD, DISK]}
    BlockStoragePolicy{ALL_SSD:12, storageTypes=[SSD]}
    BlockStoragePolicy{LAZY_PERSIST:15, storageTypes=[RAM_DISK, DISK]}

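Policies are set, read, and cleared per path with hdfs storagepolicies -setStoragePolicy -path <path> -policy <policy>, -getStoragePolicy -path <path>, and -unsetStoragePolicy -path <path>. Block locations can then be checked with hdfs fsck <path> -files -blocks -locations, and per-DataNode capacity with hdfs dfsadmin -report (the older hadoop dfsadmin form is deprecated).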
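The storageTypes lists in the policy table describe where each replica goes: replicas are assigned the listed types in order, and the last type is reused for any remaining replicas. The sketch below models only that rule as I understand it; the function name `replica_storage_types` is illustrative, the PROVIDED policy is left out, and fallback behavior when a storage type is unavailable is not modeled.

```python
# Storage types per policy, copied from the -listPolicies output above.
POLICIES = {
    "COLD": ["ARCHIVE"],
    "WARM": ["DISK", "ARCHIVE"],
    "HOT": ["DISK"],
    "ONE_SSD": ["SSD", "DISK"],
    "ALL_SSD": ["SSD"],
    "LAZY_PERSIST": ["RAM_DISK", "DISK"],
}


def replica_storage_types(policy, replication):
    """Return the storage type chosen for each replica, in order."""
    types = POLICIES[policy]
    # Replica i takes the i-th listed type; extra replicas reuse the last one.
    return [types[min(i, len(types) - 1)] for i in range(replication)]


for name in POLICIES:
    print(f"{name:13s} replication=2 -> {replica_storage_types(name, 2)}")
```

With replication factor 2, WARM places one replica on DISK and one on ARCHIVE, while ONE_SSD places one on SSD and one on DISK.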

5.2.2 Test Environment

A five‑node cluster is prepared with replication factor 2 and storage‑type directories defined in each node’s hdfs-site.xml (e.g., SSD, RAM_DISK, ARCHIVE).

<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>
<property>
    <name>dfs.storage.policy.enabled</name>
    <value>true</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>[SSD]file:///opt/module/hadoop-3.1.3/hdfsdata/ssd,[RAM_DISK]file:///opt/module/hadoop-3.1.3/hdfsdata/ram_disk</value>
</property>

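After starting the cluster, each policy (HOT, WARM, COLD, ONE_SSD, ALL_SSD, LAZY_PERSIST) is applied to /hdfsdata in turn and verified with hdfs fsck /hdfsdata -files -blocks -locations. Changing a policy does not move blocks that are already written, so run hdfs mover -p /hdfsdata to migrate them before checking the locations.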
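Because each node declares its storage types inside the dfs.datanode.data.dir value, a quick way to review a node's layout is to split that value into (type, path) pairs. This is a small sketch with an illustrative helper name, `parse_data_dirs`; it relies on the fact that an entry without a `[TYPE]` prefix is treated as DISK.

```python
import re

TAGGED = re.compile(r"^\[(\w+)\](.+)$")


def parse_data_dirs(value, default_type="DISK"):
    """Split a dfs.datanode.data.dir value into (storage type, path) pairs."""
    entries = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas
        match = TAGGED.match(item)
        if match:
            entries.append((match.group(1).upper(), match.group(2)))
        else:
            entries.append((default_type, item))  # untagged directories are DISK
    return entries


# Value taken from the hdfs-site.xml fragment above.
value = ("[SSD]file:///opt/module/hadoop-3.1.3/hdfsdata/ssd,"
         "[RAM_DISK]file:///opt/module/hadoop-3.1.3/hdfsdata/ram_disk")
for storage_type, path in parse_data_dirs(value):
    print(storage_type, path)
```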

6 HDFS – Fault Diagnosis

6.1 Safe Mode

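In safe mode the NameNode serves read requests but rejects any change to the namespace, so the filesystem is effectively read-only. The NameNode enters safe mode automatically at startup while it loads metadata and waits for block reports. Commands: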

bin/hdfs dfsadmin -safemode get   # view status
bin/hdfs dfsadmin -safemode enter # enter safe mode
bin/hdfs dfsadmin -safemode leave # exit safe mode
bin/hdfs dfsadmin -safemode wait  # wait until safe mode ends
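A script that must not write while the NameNode is in safe mode can check the status first. The sketch below wraps the -safemode get command listed above; the function names are illustrative, and the parser assumes the command prints a line containing "Safe mode is ON" or "Safe mode is OFF".

```python
import subprocess


def parse_safemode(output):
    """Return True if the status text reports safe mode ON, False if OFF."""
    if "Safe mode is ON" in output:
        return True
    if "Safe mode is OFF" in output:
        return False
    raise ValueError(f"unrecognized safemode output: {output!r}")


def safemode_is_on(hdfs_bin="hdfs"):
    """Run 'hdfs dfsadmin -safemode get' and interpret the result."""
    result = subprocess.run(
        [hdfs_bin, "dfsadmin", "-safemode", "get"],
        capture_output=True, text=True, check=True,
    )
    return parse_safemode(result.stdout)


# Parsing can be exercised without a cluster:
print(parse_safemode("Safe mode is OFF\n"))
```

For blocking until safe mode ends, the -safemode wait command shown above is usually simpler than polling from a script.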

6.2 Slow Disk Monitoring

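Slow disks show up as high write latency. Two checks help locate them: DataNodes normally send a heartbeat to the NameNode every 3 s, so intervals that are consistently longer suggest a problem on that node; and fio read/write benchmarks on the suspect disk confirm whether its throughput is abnormal.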
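The heartbeat check amounts to looking for gaps longer than 3 s between consecutive heartbeats. The sketch below assumes you already have heartbeat timestamps in seconds for one DataNode; how they are collected (logs, metrics) is outside this example, and the sample values are invented.

```python
HEARTBEAT_INTERVAL_S = 3.0  # normal heartbeat interval mentioned above


def late_heartbeats(timestamps, limit=HEARTBEAT_INTERVAL_S):
    """Return (previous, current, gap) for every gap longer than limit."""
    late = []
    for previous, current in zip(timestamps, timestamps[1:]):
        gap = current - previous
        if gap > limit:
            late.append((previous, current, gap))
    return late


# Hypothetical heartbeat times in seconds for one DataNode.
sample = [0, 3, 6, 14, 17]
for previous, current, gap in late_heartbeats(sample):
    print(f"gap of {gap}s between t={previous} and t={current}")
```

A single late heartbeat is not conclusive; repeated gaps on the same node are the signal worth following up with fio.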

6.3 Small‑File Archiving

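The NameNode keeps metadata for every file in memory (roughly 150 bytes per file), regardless of file size, so a large number of small files exhausts NameNode memory. A Hadoop Archive (HAR) packs many small files into one archive that the NameNode treats as a single entity, while the files inside remain individually accessible through the har:// scheme. The first command below runs a MapReduce job (YARN must be running) that archives /input into /output/input.har; the second lists the archive's contents.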

[Tom@hadoop102 hadoop-3.1.3]$ hadoop archive -archiveName input.har -p /input /output
[Tom@hadoop102 hadoop-3.1.3]$ hadoop fs -ls har:///output/input.har
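The memory pressure from small files is easy to estimate from the roughly 150 bytes of metadata per file quoted above. This is a back-of-the-envelope sketch: the function names are illustrative, and it ignores block objects and other NameNode overhead.

```python
BYTES_PER_FILE = 150  # approximate per-file metadata cost quoted above


def namenode_metadata_bytes(file_count, bytes_per_file=BYTES_PER_FILE):
    """Estimate NameNode memory consumed by file metadata alone."""
    return file_count * bytes_per_file


def files_that_fit(memory_bytes, bytes_per_file=BYTES_PER_FILE):
    """Estimate how many files a given amount of memory can track."""
    return memory_bytes // bytes_per_file


small_files = 100_000_000
print(f"{small_files:,} files need about "
      f"{namenode_metadata_bytes(small_files) / 1e9:.0f} GB of NameNode memory")
```

The cost is the same whether each file holds 1 KB or 128 MB, which is why packing small files into an archive relieves the NameNode.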

7 MapReduce Production Experience

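Common causes of slow jobs are hardware limits (CPU, memory, disk, network), data skew, and excessive small files. Common remedies include adding a combiner to shrink shuffle traffic when the reduce logic allows it, using map-side joins to avoid shuffling skewed keys, and increasing the number of reducers.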
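The effect of a combiner can be seen in a toy word count. The code below is plain Python that imitates the map, combine, and reduce steps; it does not use the Hadoop API, and the two input splits are invented for illustration.

```python
from collections import Counter


def map_words(split):
    """Map step: emit (word, 1) for every word in one input split."""
    return [(word, 1) for word in split.split()]


def combine(pairs):
    """Combiner: pre-aggregate counts on the map side, per split."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return list(totals.items())


def reduce_counts(pairs):
    """Reduce step: sum counts for each word across all splits."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


splits = ["a b a a", "b b a c"]  # hypothetical input splits
shuffled_plain = [pair for s in splits for pair in map_words(s)]
shuffled_combined = [pair for s in splits for pair in combine(map_words(s))]

print(len(shuffled_plain), "records shuffled without a combiner")
print(len(shuffled_combined), "records shuffled with a combiner")
print(reduce_counts(shuffled_plain) == reduce_counts(shuffled_combined))
```

The final counts are identical either way; only the number of records crossing the network changes. This works because summing is associative and commutative, which is the condition for a combiner to be safe.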

8 Hadoop Comprehensive Tuning

8.1 Small‑File Optimization

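Solutions: merge small files at ingestion, archive them with HAR, read them with CombineTextInputFormat so that many files share one split, and enable uber mode so that all tasks of a small job run inside the ApplicationMaster's JVM instead of starting a container per task. The mapred-site.xml properties below enable uber mode and cap it at 9 map tasks: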

<property>
    <name>mapreduce.job.ubertask.enable</name>
    <value>true</value>
</property>
<property>
    <name>mapreduce.job.ubertask.maxmaps</name>
    <value>9</value>
</property>
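Uber mode only applies to jobs that are small enough. The sketch below captures the size limits in simplified form: the 9-map limit comes from the configuration above, while the 1-reduce limit and the input-size limit of one 128 MB block are the defaults as I understand them. The real check also considers memory and CPU settings, which are not modeled here.

```python
DEFAULT_BLOCK_BYTES = 128 * 1024 * 1024  # assumed default HDFS block size


def qualifies_for_uber(num_maps, num_reduces, input_bytes,
                       max_maps=9, max_reduces=1,
                       max_bytes=DEFAULT_BLOCK_BYTES):
    """Return True if the job is within the (simplified) uber-mode limits."""
    return (num_maps <= max_maps
            and num_reduces <= max_reduces
            and input_bytes <= max_bytes)


print(qualifies_for_uber(num_maps=3, num_reduces=1, input_bytes=10 * 1024 * 1024))
print(qualifies_for_uber(num_maps=10, num_reduces=1, input_bytes=10 * 1024 * 1024))
```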

8.2 MapReduce Performance Testing

Generate random data with the randomwriter example, sort it with the sort example, and verify that the output is correctly sorted with the testmapredsort program from the hadoop-mapreduce-client-jobclient tests jar.

8.3 Enterprise Scenario

For a 1 GB word-count job on a three-node cluster (4 GB RAM and 4 CPU cores per node), configuration adjustments include:

Increase the NameNode handler count (dfs.namenode.handler.count=21)
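The value 21 matches a commonly cited rule of thumb that sets the handler count to 20 times the natural logarithm of the cluster size, rounded down. That this is how the figure was derived is an assumption; the source gives only the result.

```python
import math


def namenode_handler_count(cluster_size):
    """Rule-of-thumb value for dfs.namenode.handler.count: 20 * ln(nodes)."""
    return int(20 * math.log(cluster_size))


print(namenode_handler_count(3))
```

For the three-node cluster here, 20 × ln(3) ≈ 21.97, which truncates to 21.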

Set HDFS replication to 2 and enable storage policies

Adjust MapReduce memory, CPU, and spill parameters in mapred-site.xml

Configure YARN resources (memory-mb, cpu-vcores, min/max allocation) in yarn-site.xml

After applying the settings and restarting the cluster, the WordCount job runs efficiently, as shown on the YARN UI.
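To size these settings it helps to estimate how many tasks the job creates and how many containers each node can host. The sketch below assumes the 1 GB input is a single splittable file, a 128 MB split size, and 1 GB / 1 vcore containers with all 4 GB and 4 cores given to YARN. The container sizes are illustrative assumptions, not values from the source.

```python
import math

MB = 1024 * 1024


def map_task_count(input_bytes, split_bytes=128 * MB):
    """Number of map tasks for one splittable file of the given size."""
    return math.ceil(input_bytes / split_bytes)


def containers_per_node(node_mem_mb, node_vcores,
                        container_mem_mb, container_vcores):
    """Containers that fit on a node, limited by memory or by vcores."""
    return min(node_mem_mb // container_mem_mb,
               node_vcores // container_vcores)


maps = map_task_count(1024 * MB)
per_node = containers_per_node(4096, 4, 1024, 1)  # assumed container size
print(f"{maps} map tasks, {per_node} containers per node, "
      f"{3 * per_node} containers across 3 nodes")
```

Under these assumptions the map tasks, one reducer, and the ApplicationMaster fit in the cluster within a single wave.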

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
