Step-by-Step Guide to Installing and Configuring Hadoop 2.9.2 Cluster on Three Nodes
This article provides a detailed, step-by-step tutorial for installing Hadoop 2.9.2, configuring environment variables, editing XML configuration files, formatting the NameNode, starting HDFS and YARN services, testing the cluster, and setting up the MapReduce history server on a three‑node Linux environment.
Prepare three Linux servers (bigdata11, bigdata12, bigdata13) with JDK 1.8 installed and the Hadoop 2.9.2 package available. Assign bigdata11 as the NameNode, all three nodes as DataNodes, and bigdata13 as the SecondaryNameNode (matching the configuration below).
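The nodes must be able to resolve one another by hostname, and the start scripts need passwordless SSH from the master. A typical /etc/hosts layout (the IP addresses below are placeholders for illustration; substitute your own):

```text
192.168.1.11  bigdata11
192.168.1.12  bigdata12
192.168.1.13  bigdata13
```

Generate an SSH key on bigdata11 (ssh-keygen) and distribute it to all three nodes with ssh-copy-id so that start-dfs.sh and start-yarn.sh can reach them without a password.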
Extract the Hadoop package to the desired directory:
tar -zxvf hadoop-2.9.2.tar.gz -C ../training/

Set environment variables by editing ~/.bash_profile and adding:
export HADOOP_HOME=/root/training/hadoop-2.9.2
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin

Reload the profile (source ~/.bash_profile), then navigate to hadoop-2.9.2/etc/hadoop and edit the following files:
hadoop-env.sh : set JAVA_HOME .
core-site.xml : add the default filesystem property: <property> <name>fs.defaultFS</name> <value>hdfs://bigdata11:9000</value> </property>
hdfs-site.xml : configure the temporary directory, secondary NameNode address, and replication factor: <property> <name>hadoop.tmp.dir</name> <value>/root/training/hadoop-2.9.2/tmp</value> </property> <property> <name>dfs.namenode.secondary.http-address</name> <value>bigdata13:50090</value> </property> <property> <name>dfs.replication</name> <value>3</value> </property>
slaves : list the hostnames or IPs of the DataNode servers.
mapred-env.sh and yarn-env.sh : set JAVA_HOME .
mapred-site.xml : define the framework as YARN: <property> <name>mapreduce.framework.name</name> <value>yarn</value> </property>
yarn-site.xml : set the ResourceManager hostname and auxiliary services: <property> <name>yarn.resourcemanager.hostname</name> <value>bigdata13</value> </property> <property> <name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value> </property>
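Given the layout above, where all three servers act as DataNodes, the slaves file would simply list:

```text
bigdata11
bigdata12
bigdata13
```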
Ensure the Hadoop installation directory is owned by the root user and group:
chown -R root:root /path/to/hadoop-2.9.2

Copy the configured Hadoop installation to the other two nodes (bigdata12, bigdata13).
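One common way to copy the installation is scp in a loop. The sketch below only prints the commands (a dry run, assuming the /root/training path used above) so you can review them before running:

```shell
# Print (dry-run) the scp commands that would sync the configured
# Hadoop directory from bigdata11 to the other two nodes.
sync_commands() {
  for host in bigdata12 bigdata13; do
    echo "scp -r /root/training/hadoop-2.9.2 root@${host}:/root/training/"
  done
}

sync_commands
```

To actually execute the copies (passwordless SSH required), pipe the output to a shell: sync_commands | sh.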
Format the NameNode on the primary server:
hadoop namenode -format

To start the cluster one daemon at a time, run the following commands from the sbin directory:
Start HDFS on each node:
hadoop-daemon.sh start namenode     # on bigdata11
hadoop-daemon.sh start datanode     # on each node
Start YARN services:
yarn-daemon.sh start resourcemanager   # on bigdata13
yarn-daemon.sh start nodemanager       # on bigdata11 and bigdata12
Web UI addresses: HDFS – http://bigdata11:50070 (NameNode), YARN – http://bigdata13:8088 (ResourceManager).
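To check reachability from a workstation, a small hypothetical helper can build the UI URLs (hostnames and ports taken from the configuration above):

```shell
# Build the web UI URL for a given host and port.
ui_url() {
  echo "http://$1:$2"
}

ui_url bigdata11 50070   # HDFS NameNode UI
ui_url bigdata13 8088    # YARN ResourceManager UI
```

With the daemons running, curl -s -o /dev/null -w '%{http_code}' "$(ui_url bigdata11 50070)" should print 200.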
For full cluster startup, run start-dfs.sh on the NameNode host (bigdata11) and start-yarn.sh on the ResourceManager host (bigdata13):
start-dfs.sh
start-yarn.sh

Test HDFS functionality by creating a directory, uploading a file, and retrieving it:
hdfs dfs -mkdir -p /test/input
hdfs dfs -put /root/test.txt /test/input
hdfs dfs -get /test/input/test.txt /root/

Run the MapReduce word-count example:
hdfs dfs -mkdir /wcinput
hdfs dfs -put /root/wc.txt /wcinput
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar wordcount /wcinput /wcoutput
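As an aside, the wordcount job computes the same per-word counts that this purely local shell pipeline sketches (illustrative only, not part of the cluster setup):

```shell
# Count word occurrences in a file, printing "word<TAB>count" lines,
# mimicking the output format of the MapReduce wordcount example.
local_wordcount() {
  tr -s '[:space:]' '\n' < "$1" | sort | uniq -c | awk '{print $2 "\t" $1}'
}

printf 'hadoop yarn hadoop\n' > /tmp/wc_demo.txt
local_wordcount /tmp/wc_demo.txt   # prints: hadoop 2, then yarn 1
```

The cluster job writes its equivalent output to HDFS under /wcoutput, which the next command inspects.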
hdfs dfs -cat /wcoutput/part-r-00000

Configure the MapReduce History Server by adding to mapred-site.xml :
<property>
<name>mapreduce.jobhistory.address</name>
<value>bigdata11:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>bigdata11:19888</value>
</property>

Start the history server on the configured node (bigdata11):
mr-jobhistory-daemon.sh start historyserver

Enable log aggregation by adding to yarn-site.xml (restart the YARN daemons and the history server for the change to take effect):
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>

With these steps completed, the Hadoop 2.9.2 cluster is fully installed, configured, and ready to run workloads.
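As a final sanity check, running jps on each node should show roughly this daemon layout, derived from the configuration and the daemon-by-daemon startup above (start-yarn.sh would additionally start a NodeManager on bigdata13, since it is listed in slaves):

```text
bigdata11: NameNode, DataNode, NodeManager, JobHistoryServer
bigdata12: DataNode, NodeManager
bigdata13: DataNode, ResourceManager, SecondaryNameNode
```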