Big Data 8 min read

Sqoop Tutorial: Importing and Exporting Data between Relational Databases, HDFS, Hive, and HBase

This article provides a comprehensive guide to using Sqoop for importing data from relational databases into HDFS, Hive, and HBase, as well as exporting data back to databases, covering command syntax, options, and practical examples for big‑data workflows.

Big Data Technology & Architecture
Big Data Technology & Architecture
Big Data Technology & Architecture
Sqoop Tutorial: Importing and Exporting Data between Relational Databases, HDFS, Hive, and HBase

Sqoop is a tool that translates user‑written commands into MapReduce jobs to transfer data between relational databases and Hadoop components such as HDFS, Hive, and HBase.

Import to HDFS – examples of full‑table import, column‑specific import, and filtered import using --where or custom --query statements.

bin/sqoop import \
  --connect jdbc:mysql://hadoop102:3306/test \
  --username root \
  --password 123 \
  --table t_emp \
  --target-dir /sqoopTest \
  --delete-target-dir \
  --fields-terminated-by "\t" \
  --num-mappers 2 \
  --split-by id \
  --columns id,name,age

Import to Hive – use --hive-import and --hive-overwrite to load data directly into a Hive table, with notes on table creation and field delimiters.

bin/sqoop import \
  --connect jdbc:mysql://hadoop102:3306/test \
  --username root \
  --password 123 \
  --query 'select * from t_emp where id>3 and $CONDITIONS' \
  --target-dir /sqoopTest \
  --fields-terminated-by "\t" \
  --delete-target-dir \
  --hive-import \
  --hive-overwrite \
  --hive-table t_emp \
  --num-mappers 1 \
  --split-by id

Import to HBase – enable --hbase-create-table, specify --hbase-table, --hbase-row-key, and column family with --column-family. Compatibility issues may require manual table creation.

bin/sqoop import \
  --connect jdbc:mysql://hadoop102:3306/test \
  --username root \
  --password 123 \
  --query 'select * from t_emp where id>3 and $CONDITIONS' \
  --target-dir /sqoopTest \
  --delete-target-dir \
  --hbase-create-table \
  --hbase-table "t_emp" \
  --hbase-row-key "id" \
  --column-family "info" \
  --num-mappers 2 \
  --split-by id

Export from HDFS to a relational database – basic export command, handling empty vs. non‑empty target tables, and conflict resolution using --update-key with updateonly or allowinsert modes.

bin/sqoop export \
  --connect 'jdbc:mysql://hadoop103:3306/mydb?useUnicode=true&characterEncoding=utf-8' \
  --username root \
  --password 123456 \
  --table t_emp2 \
  --export-dir /hive/t_emp \
  --input-fields-terminated-by "\t" \
  --update-key id \
  --update-mode allowinsert

Additional tips include using --null-string and --null-non-string for proper NULL handling, compressing output with --compress, and troubleshooting steps such as checking MySQL binlogs.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Big DataHivemysqlHBaseHDFSdata importSqoop
Big Data Technology & Architecture
Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.