
Step-by-Step Guide to Install and Configure a Hadoop 2.8.2 Cluster

This tutorial provides a complete walkthrough for downloading Hadoop 2.8.2, setting up a three‑node master‑slave cluster, configuring core, HDFS, MapReduce and YARN settings, creating required directories, distributing the installation, starting the services, verifying the cluster status, and finally shutting it down.


This guide shows how to download the Hadoop 2.8.2 binary package from http://mirror.bit.edu.cn/apache/hadoop/common, extract it, and prepare a three‑node cluster consisting of a master (172.16.11.97) and two slaves (172.16.11.98 and 172.16.11.99).

On the master node, the package is obtained with:

wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.8.2/hadoop-2.8.2.tar.gz
tar zxvf hadoop-2.8.2.tar.gz

Configuration files under hadoop-2.8.2/etc/hadoop are edited:

cd hadoop-2.8.2/etc/hadoop
vim hadoop-env.sh   # set JAVA_HOME
export JAVA_HOME=/usr/local/src/jdk1.8.0_152

vim yarn-env.sh   # set JAVA_HOME similarly

vim slaves
slave1
slave2
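
The slaves file lists hostnames rather than IP addresses, so every node must be able to resolve them. A minimal sketch of the required /etc/hosts entries on each machine (the hostname-to-IP pairing follows the addresses given above; "master" as the master's hostname is an assumption):

```
172.16.11.97 master
172.16.11.98 slave1
172.16.11.99 slave2
```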

Four key XML files are modified (shown here as code snippets). In core-site.xml:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://172.16.11.97:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/local/src/hadoop-2.8.2/tmp</value>
  </property>
</configuration>

In hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/src/hadoop-2.8.2/dfs/name</value>
  </property>
  ...
</configuration>

In mapred-site.xml (in 2.8.2 this file does not exist by default and is created by copying mapred-site.xml.template):

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

In yarn-site.xml:

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  ...
</configuration>
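
The HDFS snippet above (dfs.namenode.name.dir) is elided with "..."; a fuller version consistent with this layout might look like the following sketch. The dfs.datanode.data.dir value and the replication factor of 2 are assumptions, matching the dfs/data directory created below and the two slave nodes:

```xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/src/hadoop-2.8.2/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/src/hadoop-2.8.2/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
```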

Required directories are created:

mkdir /usr/local/src/hadoop-2.8.2/tmp
mkdir -p /usr/local/src/hadoop-2.8.2/dfs/name
mkdir -p /usr/local/src/hadoop-2.8.2/dfs/data
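
Since mkdir -p accepts several paths and creates missing parents, the three commands can be collapsed into one; a minimal sketch (using /tmp as a stand-in prefix so it runs on any machine):

```shell
# Stand-in prefix; on the cluster this would be /usr/local/src/hadoop-2.8.2
BASE=/tmp/hadoop-2.8.2

# One mkdir -p call creates all three directories, parents included
mkdir -p "$BASE/tmp" "$BASE/dfs/name" "$BASE/dfs/data"

# Confirm the layout
ls "$BASE/dfs"
```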

Environment variables are added to ~/.bashrc on all nodes:

HADOOP_HOME=/usr/local/src/hadoop-2.8.2
export PATH=$PATH:$HADOOP_HOME/bin
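
A slightly fuller pair of ~/.bashrc lines also exports HADOOP_HOME itself and puts the sbin start/stop scripts on the path (a sketch; adding sbin is a convenience beyond the two lines above):

```shell
# Export HADOOP_HOME so child processes (including Hadoop scripts) see it
export HADOOP_HOME=/usr/local/src/hadoop-2.8.2
# bin holds the hadoop/hdfs/yarn commands; sbin holds the daemon scripts
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```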

The Hadoop directory is copied to the slave machines:

scp -r /usr/local/src/hadoop-2.8.2 root@slave1:/usr/local/src/hadoop-2.8.2
scp -r /usr/local/src/hadoop-2.8.2 root@slave2:/usr/local/src/hadoop-2.8.2
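
With more slaves, the two scp commands generalize to a loop; this dry-run sketch prints the command for each host instead of running it (remove the echo wrapper to actually copy, which assumes root SSH access to each slave):

```shell
# Build the scp command for each slave and record the plan
for host in slave1 slave2; do
  echo "scp -r /usr/local/src/hadoop-2.8.2 root@$host:/usr/local/src/"
done > /tmp/scp-plan.txt

# Show what would be executed
cat /tmp/scp-plan.txt
```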

The NameNode is formatted and the whole cluster is started from the Hadoop installation directory (in 2.x, hdfs namenode -format replaces the deprecated hadoop namenode -format; start-all.sh still works but simply delegates to start-dfs.sh and start-yarn.sh):

cd /usr/local/src/hadoop-2.8.2
hdfs namenode -format
./sbin/start-all.sh

Cluster status can be checked with jps on each node (the master should show the NameNode, SecondaryNameNode, and ResourceManager processes; each slave should show DataNode and NodeManager), and the YARN web UI is reachable at http://master:8088. Common Hadoop commands are the same as in version 1.0, and the cluster can be stopped with:

./sbin/stop-all.sh
Written by

Practical DevOps Architecture

Hands‑on DevOps operations using Docker, K8s, Jenkins, and Ansible—empowering ops professionals to grow together through sharing, discussion, knowledge consolidation, and continuous improvement.
