
Step-by-Step Guide to Installing Hive 2.1.0 on a Hadoop 2.7.1 Cluster (Ubuntu 14.04)

This tutorial provides a comprehensive, step-by-step procedure for setting up Hive 2.1.0 on a Hadoop 2.7.1 cluster running Ubuntu 14.04, covering environment preparation, Hive installation, configuration of environment variables, MySQL metastore integration, client setup, service startup, and basic verification commands.

Big Data Technology & Architecture

Environment: Hadoop 2.7.1 on Ubuntu 14.04, Hive 2.1.0, a master node (namenode) running the Hive server, and two slave nodes (datanodes) as Hive clients. The MySQL server for the Hive metastore is located at 101.201.81.34.

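1. Install Hive on the master:

Download the binary package from Apache, extract it, and copy it to /home/cms/hive. Because the copy is made with sudo, hand ownership to the cms user first so that the following chmod can run without root:

sudo tar -zxvf apache-hive-2.1.0-bin.tar.gz
sudo cp -R apache-hive-2.1.0-bin /home/cms/hive
sudo chown -R cms /home/cms/hive
chmod -R 775 /home/cms/hive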
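
A quick sanity check after the copy can catch an incomplete extraction, or a copy that landed one level too deep because /home/cms/hive already existed. The sketch below assumes the standard layout of the Hive binary package (bin/hive, conf, lib); check_hive_layout is a hypothetical helper name, not something Hive ships.

```shell
# Report any expected entry that is missing from a Hive install directory.
# Returns 0 when the layout looks complete, 1 otherwise.
check_hive_layout() {
  local hive_dir="$1"
  local missing=0
  local path
  for path in bin/hive conf lib; do
    if [ ! -e "$hive_dir/$path" ]; then
      echo "missing: $hive_dir/$path"
      missing=1
    fi
  done
  return "$missing"
}
```

On the master this would be called as check_hive_layout /home/cms/hive; no output means all three entries were found.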

2. Set environment variables by adding the following lines to /etc/profile (and then executing source /etc/profile):

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_HOME=$HOME/hadoop-2.7.1
export HIVE_HOME=$HOME/hive
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$HIVE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$PATH
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
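
Since the exports above are built on $HOME, they resolve differently for each login user, so it can help to confirm them as the cms user after running source /etc/profile. The following is a minimal sketch; check_hive_env is a hypothetical helper, and the choice of which three variables to test is an assumption.

```shell
# Print a message for each required variable that is unset or does not
# point to an existing directory. Returns 0 only when all three are valid.
check_hive_env() {
  local status=0 name value
  for name in JAVA_HOME HADOOP_HOME HIVE_HOME; do
    # Indirect lookup: read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set"
      status=1
    elif [ ! -d "$value" ]; then
      echo "$name points to a missing directory: $value"
      status=1
    fi
  done
  return "$status"
}
```

Running check_hive_env in a fresh login shell prints nothing when the profile is correct.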

3. Configure Hive:

In $HIVE_HOME/conf, copy the template files: cp hive-env.sh.template hive-env.sh and cp hive-default.xml.template hive-site.xml.

Edit hive-env.sh to set HADOOP_HOME correctly.
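
As an illustration, the relevant lines in hive-env.sh might look like the excerpt below. The paths are assumptions derived from this guide's layout ($HOME being /home/cms); setting HIVE_CONF_DIR is optional and shown only for completeness.

```shell
# hive-env.sh (excerpt); adjust both paths to the actual install locations
HADOOP_HOME=/home/cms/hadoop-2.7.1
export HIVE_CONF_DIR=/home/cms/hive/conf
```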

Modify hive-site.xml to point to the MySQL metastore:

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://101.201.81.34:3306/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>admin</value>
  <description>password to use against metastore database</description>
</property>
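
Because hive-site.xml copied from the template is a very long file, it is easy to edit the wrong property or leave an old value in place. The sketch below prints the value Hive would read for a given property name. It assumes each <name> and <value> sits on its own line, as in the snippets above; hive_site_value is a hypothetical helper.

```shell
# Print the <value> that follows the given <name> in a Hadoop-style XML file.
# Prints nothing if the property is not present.
hive_site_value() {
  local file="$1" prop="$2"
  awk -v prop="$prop" '
    # Exact match on the full <name>...</name> element, no regex involved.
    index($0, "<name>" prop "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$file"
}
```

For example, hive_site_value "$HIVE_HOME/conf/hive-site.xml" javax.jdo.option.ConnectionURL should print the JDBC URL configured above.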

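4. Set Hive local scratch directories. A hive-site.xml copied from the template leaves these values as ${system:java.io.tmpdir} placeholders, which Hive fails to resolve at startup, so point them at a real local path: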

<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/opt/hivetmp</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>/opt/hivetmp</value>
  <description>Temporary local directory for added resources</description>
</property>

Create the directory and set permissions:

sudo mkdir -p /opt/hivetmp
sudo chown -R cms /opt/hivetmp
chmod -R 775 /opt/hivetmp
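
Other properties in a template-derived hive-site.xml may still carry ${system:...} placeholders after the two above are fixed. A minimal sketch for listing any that remain, assuming the file lives in $HIVE_HOME/conf; list_system_placeholders is a hypothetical helper:

```shell
# Print every line (with its line number) that still contains a
# ${system:...} placeholder. Returns non-zero when none are found.
list_system_placeholders() {
  grep -n -F '${system:' "$1"
}
```

Calling list_system_placeholders "$HIVE_HOME/conf/hive-site.xml" shows which values still need a concrete path such as /opt/hivetmp.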

5. Install MySQL connector:

Download mysql-connector-java-5.1.30-bin.jar from the MySQL website and place it in $HIVE_HOME/lib.

6. Set up Hive client on slave2:

Copy the configured Hive directory from the master to the slave:

scp -r /home/cms/hive slave2:/home/cms
sudo ufw disable

Adjust hive-site.xml on the slave to use the remote metastore:

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://master:9083</value>
  <description>Thrift uri for the remote metastore</description>
</property>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>hdfs://hive/warehouse</value>
</property>

Synchronize /etc/profile with the master.

7. Start services:

Initialize the metastore schema on the master:

schematool -initSchema -dbType mysql

Start the Hive metastore service:

hive --service metastore &

Verify processes with jps (you should see a RunJar process for Hive).
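
A RunJar entry in jps shows that the process exists, but not that the Thrift port is accepting connections yet. The sketch below polls the port until it opens or a timeout expires. It relies on bash's /dev/tcp redirection, so it needs bash rather than a plain sh; wait_for_port is a hypothetical helper, and 9083 is the port used in hive.metastore.uris in this guide.

```shell
# Poll until host:port accepts TCP connections or the timeout (in seconds)
# expires. Returns 0 once a connection succeeds, 1 on timeout.
wait_for_port() {
  local host="$1" port="$2" timeout="${3:-30}" waited=0
  # The subshell opens the connection and closes it again on exit.
  while ! (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

On a client node, wait_for_port master 9083 60 also doubles as a connectivity check before the first hive session is started.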

8. Test Hive:

hive
show databases;
show tables;

From the Hive prompt, list the warehouse directory in HDFS to confirm table storage:

dfs -ls /user/hive/warehouse;
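
The same check can be scripted without opening an interactive session by using hive -e. The sketch below treats the presence of the built-in default database as the success signal; hive_smoke_test and the HIVE_CMD override are hypothetical names introduced for illustration.

```shell
# Run "show databases" non-interactively and succeed only if the output
# contains a line that is exactly "default".
# HIVE_CMD can be set to substitute another command for hive.
hive_smoke_test() {
  local hive_cmd="${HIVE_CMD:-hive}"
  "$hive_cmd" -e 'show databases;' | grep -qx 'default'
}
```

A typical use would be hive_smoke_test && echo "Hive is reachable", run on the master or on a client node.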

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.