Step-by-Step Guide to Building a Hadoop Cluster on CentOS 6.5
This article provides a comprehensive, hands‑on tutorial for setting up a Hadoop 2.6.4 cluster on a CentOS 6.5 development server, covering SSH password‑less login, user/group creation, DNS configuration, JDK installation, environment variables, Hadoop installation, HDFS and YARN configuration, and troubleshooting native library warnings.
SSH Password‑less Configuration
Generate an SSH key pair on the local machine (if one does not already exist) and copy the public key to the server with ssh-copy-id user@server, substituting your own account and host. After you enter the password once, subsequent SSH and SCP commands run without a password prompt.
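The key setup described above can be sketched as a small shell helper. This is a minimal sketch under assumptions rather than the article's exact commands: the function name ensure_ssh_key, the 4096-bit RSA key type, the empty passphrase, and the user@server placeholder are illustrative choices.

```shell
#!/usr/bin/env bash
# Sketch: create a key pair only when none exists yet.
# ensure_ssh_key is a hypothetical helper name, not part of OpenSSH.

ensure_ssh_key() {
    local key_file="${1:-$HOME/.ssh/id_rsa}"
    if [ -f "$key_file" ]; then
        # Never overwrite an existing private key.
        echo "existing key: $key_file"
    else
        mkdir -p "$(dirname "$key_file")"
        chmod 700 "$(dirname "$key_file")"
        # -N "" sets an empty passphrase (an assumption; use one if policy requires it).
        ssh-keygen -t rsa -b 4096 -N "" -f "$key_file"
    fi
}

# Typical manual sequence (user@server is a placeholder):
#   ensure_ssh_key
#   ssh-copy-id -i ~/.ssh/id_rsa.pub user@server
#   ssh user@server hostname    # should no longer prompt for a password
```

Checking for the key first keeps the helper safe to re-run, since ssh-keygen would otherwise ask whether to overwrite the existing file.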
New User and Group Creation
Create a dedicated group and user for Hadoop administration:
groupadd dps-hadoop
useradd -d /home/dps-hadoop -g dps-hadoop dps-hadoop
Add the user to the sudoers file with the line dps-hadoop ALL=(ALL) ALL to allow privileged commands.
Local DNS Configuration
Modify /etc/resolv.conf for temporary DNS settings and edit /etc/sysconfig/network-scripts/ifcfg-eth0 (replace eth0 with the actual interface) to set a permanent DNS server, e.g., DNS1=172.20.2.24.
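The permanent DNS change can be sketched as an idempotent edit of the interface file. The helper name set_dns1 is hypothetical, and the sketch assumes the file uses the usual KEY=value layout of CentOS 6 ifcfg files.

```shell
#!/usr/bin/env bash
# Sketch: set DNS1 in an ifcfg-style file, replacing an existing entry
# or appending one. set_dns1 is a hypothetical helper name.

set_dns1() {
    local cfg="$1" dns="$2" tmp
    if grep -q '^DNS1=' "$cfg"; then
        tmp=$(mktemp)
        sed "s/^DNS1=.*/DNS1=$dns/" "$cfg" > "$tmp"
        # Write back through the original file to keep its owner and mode.
        cat "$tmp" > "$cfg"
        rm -f "$tmp"
    else
        printf 'DNS1=%s\n' "$dns" >> "$cfg"
    fi
}

# Usage as root (eth0 is the interface name used in the article):
#   set_dns1 /etc/sysconfig/network-scripts/ifcfg-eth0 172.20.2.24
#   service network restart
```

Running it twice leaves a single DNS1 line, so it is safe to include in a provisioning script.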
JDK Installation
Download the Oracle JDK RPM (e.g., jdk-8u77-linux-x64.rpm) via scp or wget, rename if necessary, and install with rpm -i jdk-8u77-linux-x64.rpm.
Configure JAVA_HOME
Edit the Hadoop user’s ~/.bashrc and add:
export JAVA_HOME="/usr/java/jdk1.8.0_77"
Reload the file (source ~/.bashrc) or start a new session to apply.
Install Hadoop 2.6.4
Download the tarball:
wget http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz
Extract it to the home directory and set HADOOP_HOME in ~/.bashrc:
export HADOOP_HOME="/home/dps-hadoop/hadoop-2.6.4"
Repeat these steps on all slave nodes.
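Because the same environment lines have to be added on every node, it can help to append them idempotently. The following is a minimal sketch; append_once is a hypothetical helper, and the PATH line is an added convenience that the article does not include.

```shell
#!/usr/bin/env bash
# Sketch: add a line to a file only if that exact line is not already there.
# append_once is a hypothetical helper name.

append_once() {
    local line="$1" file="$2"
    # -x matches the whole line, -F treats the text literally.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage on each node, as the Hadoop user:
#   tar -xzf hadoop-2.6.4.tar.gz -C "$HOME"
#   append_once 'export JAVA_HOME="/usr/java/jdk1.8.0_77"' ~/.bashrc
#   append_once 'export HADOOP_HOME="/home/dps-hadoop/hadoop-2.6.4"' ~/.bashrc
#   # Assumption: putting the Hadoop scripts on PATH is optional.
#   append_once 'export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"' ~/.bashrc
#   source ~/.bashrc
```

Re-running the script on a node that is already configured leaves ~/.bashrc unchanged.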
HDFS Configuration
Configure the NameNode in core-site.xml first. The fs.default.name key used here is still accepted by Hadoop 2.x but is deprecated in favor of fs.defaultFS:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/dps-hadoop/tmpdata</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://master:54000/</value>
</property>
</configuration>
In hdfs-site.xml, set the NameNode metadata directory and the replication factor:
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/dps-hadoop/namedata</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
Configure each DataNode similarly: use the same core-site.xml so it can locate the NameNode, and set that node's storage directory in hdfs-site.xml through dfs.datanode.data.dir.
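A DataNode's hdfs-site.xml might be generated as follows. This is a sketch under assumptions: write_datanode_hdfs_site is a hypothetical helper, and the /home/dps-hadoop/datadata directory is an illustrative path that does not appear in the article. The property name dfs.datanode.data.dir is the standard Hadoop 2.x key for DataNode block storage.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal hdfs-site.xml for a DataNode.
# write_datanode_hdfs_site is a hypothetical helper name.

write_datanode_hdfs_site() {
    local out_file="$1" data_dir="$2"
    cat > "$out_file" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>${data_dir}</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
EOF
}

# Usage on a slave node (the data directory is an assumed example path):
#   mkdir -p /home/dps-hadoop/datadata
#   write_datanode_hdfs_site "$HADOOP_HOME/etc/hadoop/hdfs-site.xml" /home/dps-hadoop/datadata
```

The replication value is kept at 2 to match the NameNode configuration shown above.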
Start HDFS
Format the NameNode and start the HDFS daemons:
bin/hdfs namenode -format
sbin/start-dfs.sh
Access the HDFS Web UI at http://172.20.2.14:50070/ to verify the cluster status.
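Besides the Web UI, the daemons can be checked from the command line with the JDK's jps tool. The sketch below wraps that check; check_daemons is a hypothetical helper, and which daemon names to expect on which host depends on your layout.

```shell
#!/usr/bin/env bash
# Sketch: report whether the named Java daemons appear in `jps` output.
# check_daemons is a hypothetical helper name; jps ships with the JDK.

check_daemons() {
    local running name status=0
    running=$(jps)
    for name in "$@"; do
        # jps prints "<pid> <main class>"; compare the second column exactly
        # so that NameNode does not match SecondaryNameNode.
        if printf '%s\n' "$running" | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "running: $name"
        else
            echo "missing: $name"
            status=1
        fi
    done
    return "$status"
}

# Usage:
#   check_daemons NameNode SecondaryNameNode   # on the master
#   check_daemons DataNode                     # on each slave
#   bin/hdfs dfsadmin -report                  # lists live DataNodes and capacity
```

The function returns a non-zero status when any daemon is missing, which makes it usable in scripts.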
Native Library Warning and Troubleshooting
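If Hadoop logs a warning that it cannot load the native-hadoop library (it then falls back to built-in Java classes), check three things: java.library.path points at the directory that holds the native library, the library matches the system architecture (32-bit or 64-bit), and the GLIBC version the library was built against is available on the host. If the cause is still unclear, raise the log level to DEBUG in log4j.properties to see the underlying load error.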
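These checks can be run from the shell. The sketch below is illustrative only: version_ge is a hypothetical helper that compares major.minor version strings, and the required version passed to it should be read from your own library rather than taken from this example.

```shell
#!/usr/bin/env bash
# Sketch: compare glibc versions when diagnosing the native library warning.
# version_ge is a hypothetical helper; it expects "major.minor" strings.

version_ge() {
    # Succeeds when $1 >= $2.
    local a_major="${1%%.*}" a_minor="${1#*.}"
    local b_major="${2%%.*}" b_minor="${2#*.}"
    [ "$a_major" -gt "$b_major" ] ||
        { [ "$a_major" -eq "$b_major" ] && [ "$a_minor" -ge "$b_minor" ]; }
}

# Diagnostic commands to run from $HADOOP_HOME:
#   bin/hadoop checknative -a                       # which native libraries load
#   file lib/native/libhadoop.so.1.0.0              # 32-bit or 64-bit build
#   objdump -T lib/native/libhadoop.so.1.0.0 | grep -o 'GLIBC_[0-9.]*' | sort -u
#   installed=$(ldd --version | head -n 1 | awk '{ print $NF }')
#   version_ge "$installed" 2.14 || echo "glibc $installed is too old"   # 2.14 is an example value
#   HADOOP_ROOT_LOGGER=DEBUG,console bin/hadoop fs -ls /                 # one-off debug logging
```

Setting HADOOP_ROOT_LOGGER for a single command is a lighter alternative to editing log4j.properties when you only need the debug output once.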
YARN Configuration
Set the ResourceManager hostname in yarn-site.xml:
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
</configuration>
Start YARN with sbin/start-yarn.sh. The MapReduce JobHistoryServer is a separate daemon; start or stop it as needed with sbin/mr-jobhistory-daemon.sh start historyserver (or stop historyserver).
Cluster Management Web UI
Default ports for monitoring:
HDFS: http://master:50070/
ResourceManager: http://master:8088/
JobHistory: http://master:19888/
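A quick way to confirm that all three interfaces respond is to request each port with curl. This is a minimal sketch: check_ui is a hypothetical helper, and it assumes the hostname master resolves from the machine where it runs.

```shell
#!/usr/bin/env bash
# Sketch: print the HTTP status returned by each Hadoop web UI port.
# check_ui is a hypothetical helper name.

check_ui() {
    local host="$1" port code
    # 50070 = HDFS NameNode, 8088 = ResourceManager, 19888 = JobHistory
    for port in 50070 8088 19888; do
        code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "http://$host:$port/") ||
            code="unreachable"
        echo "$port $code"
    done
}

# Usage:
#   check_ui master
```

A status of 200 or a redirect code indicates the daemon is serving its UI; "unreachable" usually means the daemon is not running or a firewall is blocking the port.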
MaGe Linux Operations
