Essential HDFS Shell Commands for Managing Hadoop Files
This guide explains how to use the HDFS shell (preferred via hdfs dfs) to list, copy, move, delete, and snapshot files in a Hadoop cluster, detailing command syntax, URI handling, generic options, and practical examples for each operation.
After setting up a Hadoop cluster on CentOS 7, you can interact with HDFS from the command line. Two access methods exist: the HDFS shell (recommended via hdfs dfs) and the Java API. Shell commands accept URI paths of the form scheme://authority/path, where the scheme is hdfs for HDFS and file for the local filesystem. The scheme and authority are optional and default to the filesystem configured for the cluster (normally HDFS).
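As a sketch of how the scheme defaults work (assuming a NameNode at master:9000, the address used in the examples below), these invocations are equivalent or contrasting:

```shell
# Fully qualified URI and the bare-path shortcut address the same HDFS root:
hdfs dfs -ls hdfs://master:9000/
hdfs dfs -ls /                     # scheme omitted: defaults to HDFS

# An explicit file:// scheme targets the local filesystem instead:
hdfs dfs -ls file:///usr/local
```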
Command Syntax
The general syntax is: command [genericOptions] [commandOptions]
Common generic options include:
-conf <configuration file> – specify a configuration file
-D <property=value> – set a property value
-fs <file://…|hdfs://namenode:port> – override the default filesystem URL
-jt <local|resourcemanager:port> – specify a ResourceManager
-files <comma-separated list of files> – copy files to the cluster
-libjars <comma-separated list of jars> – add jars to the classpath
-archives <comma-separated list of archives> – unarchive archives on the compute nodes
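A few hedged examples of generic options in practice (the cluster address and the config-file path are placeholders for your own installation):

```shell
# Set a property for this one command only:
hdfs dfs -D dfs.replication=2 -put /usr/local/1.txt /

# Override the default filesystem URL for a single listing:
hdfs dfs -fs hdfs://master:9000 -ls /

# Point at an alternate configuration file (path is illustrative):
hdfs dfs -conf /usr/local/hadoop/etc/hadoop/core-site.xml -ls /
```

Generic options must appear before the command-specific options, since they are consumed by Hadoop's generic option parsing first.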
Common HDFS Shell Commands
Usage : hdfs dfs -usage ls – shows usage for a command.
Help : hdfs dfs -help ls – displays detailed help.
ls : hdfs dfs -ls hdfs://master:9000/ – lists files/directories; / can be used as a shortcut for the root.
put : hdfs dfs -put /usr/local/1.txt / – uploads a local file to HDFS.
cat : hdfs dfs -cat /1.txt – prints file content.
text : hdfs dfs -text /1.txt – outputs a text file (equivalent to cat for plain text).
tail : hdfs dfs -tail /1.txt – shows the last 1 KB of a file.
touchz : hdfs dfs -touchz /badao.txt – creates an empty file.
get : hdfs dfs -get /badao.txt . – downloads a file to the current local directory (or to a given local path).
copyFromLocal : same as put.
copyToLocal : same as get.
moveFromLocal : uploads and deletes the local source after success.
mv : hdfs dfs -mv /1.txt /user – moves or renames a file, similar to Linux mv.
cp : hdfs dfs -cp /badao.txt /badaocopy.txt – copies a file.
mkdir : hdfs dfs -mkdir /newdir – creates a directory.
rm : hdfs dfs -rm /badaocopy.txt – deletes a file; -r enables recursive deletion of non‑empty directories.
rmdir : hdfs dfs -rmdir /newdir – removes an empty directory.
expunge : hdfs dfs -expunge – empties the trash.
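The deletion commands above can be combined as follows (paths are the example files from this guide and assume they exist; -skipTrash permanently deletes without a trash copy):

```shell
hdfs dfs -rm /badaocopy.txt        # delete a single file (goes to trash if trash is enabled)
hdfs dfs -rm -r /newdir            # recursively delete a non-empty directory
hdfs dfs -rm -skipTrash /1.txt     # bypass the trash entirely
hdfs dfs -expunge                  # checkpoint and empty the trash
```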
chmod : hdfs dfs -chmod 777 /badao.txt – changes permissions.
count : hdfs dfs -count -q / – shows directory, file, and byte counts along with quota information.
du : hdfs dfs -du / – displays file sizes; with a directory, shows sizes of each file.
df : hdfs dfs -df / – checks filesystem disk usage.
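For the space-usage commands above, the -h flag (human-readable units) is often more convenient than raw byte counts:

```shell
hdfs dfs -du -h /     # size of each entry under /, in human-readable units
hdfs dfs -df -h /     # total capacity, used, and available space of the filesystem
hdfs dfs -count -q /  # quota, remaining quota, dir count, file count, content size, path
```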
stat : hdfs dfs -stat "%b,%g,%n,%o,%r,%u,%y" /user – prints file statistics (size in bytes, group, name, block size, replication, owner, modification time); quote the format string so the shell does not interpret it.
createSnapshot : hdfs dfs -createSnapshot /user snap1 – creates a snapshot; does not copy data blocks, but stores metadata in NameNode memory.
renameSnapshot : hdfs dfs -renameSnapshot /user snap1 snap2 – renames a snapshot.
deleteSnapshot : hdfs dfs -deleteSnapshot /user snap2 – deletes a snapshot.
Before creating a snapshot, the directory must be made snapshottable, e.g. hdfs dfsadmin -allowSnapshot /user. If the directory is not snapshottable, the command returns “Directory is not a snapshottable directory”.
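Putting the snapshot steps together, an end-to-end workflow looks roughly like this (the directory name /user follows the examples above):

```shell
# 1. An administrator marks the directory snapshottable:
hdfs dfsadmin -allowSnapshot /user

# 2. Create, inspect, rename, and delete snapshots:
hdfs dfs -createSnapshot /user snap1    # stored under /user/.snapshot/snap1
hdfs dfs -ls /user/.snapshot            # list existing snapshots
hdfs dfs -renameSnapshot /user snap1 snap2
hdfs dfs -deleteSnapshot /user snap2

# 3. Optionally revoke snapshot support (only allowed once
#    all snapshots of the directory have been deleted):
hdfs dfsadmin -disallowSnapshot /user
```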
For reference, the original tutorial can be found at the author's CSDN blog: https://blog.csdn.net/badao_liumang_qizhi