Deploying Hadoop KMS for Transparent HDFS Encryption: A Step‑by‑Step Guide
This article documents the author's hands-on deployment of Hadoop KMS to achieve transparent encryption for HDFS files on a Hadoop 2.6.1 cluster running CentOS 6.5 with JDK 1.8.0_92, covering environment setup, configuration file changes, key generation, service startup, encryption-zone creation, user permission tuning, verification procedures, and common troubleshooting tips.
KMS Overview: Hadoop KMS is a key-management service that exposes a REST API through which clients (via the KeyProvider API) obtain encryption keys. It supports simple authentication (used in this guide) as well as Kerberos.
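To illustrate the REST interface, key names can be listed with a plain HTTP GET under simple authentication. This sketch only builds the endpoint URL; the host and port are the KMS node used throughout this guide, and the curl call requires a live KMS:

```shell
# Build the KMS REST endpoint for listing key names (KMS v1 REST API).
# Host/port are the KMS node from this guide; a running KMS is needed for curl.
KMS_HOST="BJ-PRESTO-TEST-100080.lvxin.com"
KMS_URL="http://${KMS_HOST}:16000/kms/v1/keys/names?user.name=hadp"
echo "GET ${KMS_URL}"
# Against a running KMS (simple auth passes the caller as a query parameter):
#   curl "${KMS_URL}"
```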
Environment:

| Item | Value |
| --- | --- |
| Hadoop | 2.6.1 |
| JDK | 1.8.0_92 |
| OS | CentOS release 6.5 (Final) |
| Hadoop Superuser | hadp |
The server list (NameNode, KMS, and DataNode hosts) is provided in a table; the KMS service runs only on BJ-PRESTO-TEST-100080.lvxin.com.
Configuration changes involve the following files:
core-site.xml – on all NameNode and DataNode nodes, add:

```xml
<property>
  <name>hadoop.security.key.provider.path</name>
  <value>kms://http@BJ-PRESTO-TEST-100080.lvxin.com:16000/kms</value>
</property>
```
hdfs-site.xml – add the following, then restart HDFS:

```xml
<property>
  <name>dfs.encryption.key.provider.uri</name>
  <value>kms://http@BJ-PRESTO-TEST-100080.lvxin.com:16000/kms</value>
</property>
```
kms-site.xml – configure hadoop.kms.key.provider.uri and hadoop.security.keystore.java-keystore-provider.password-file, and set hadoop.kms.authentication.type to simple.
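A sketch of what those three kms-site.xml entries can look like. The keystore path is an assumption (the guide does not show its exact value; jceks://file@ is the standard scheme for a file-backed JCEKS provider), and the password file must be resolvable on the KMS classpath:

```xml
<property>
  <name>hadoop.kms.key.provider.uri</name>
  <value>jceks://file@/${user.home}/kms.keystore</value>
</property>
<property>
  <name>hadoop.security.keystore.java-keystore-provider.password-file</name>
  <value>kms.keystore.password</value>
</property>
<property>
  <name>hadoop.kms.authentication.type</name>
  <value>simple</value>
</property>
```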
kms-env.sh – set environment variables:

```shell
export KMS_HOME=${HADOOP_HOME}
export KMS_LOG=${KMS_HOME}/logs/kms
export KMS_HTTP_PORT=16000
export KMS_ADMIN_PORT=16001
```
kms-acls.xml – map keys to users, e.g.:

```xml
<property>
  <name>key.acl.user_a_key.DECRYPT_EEK</name>
  <value>user_a</value>
</property>
```

and similarly for user_b_key.
Key Generation uses keytool to create user_a_key and user_b_key with password 123456, then stores the password in kms.keystore.password:

```shell
echo 123456 > ${HADOOP_HOME}/share/hadoop/kms/tomcat/webapps/kms/WEB-INF/classes/kms.keystore.password
```

Service Startup:

```shell
[hadp@BJ-PRESTO-TEST-100080 ~]$ start-dfs.sh
[hadp@BJ-PRESTO-TEST-100080 ~]$ kms.sh start
```

Bootstrap logs confirm the KMS Java process is running on ports 16000/16001.
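The key-generation step can be sketched as follows. The keytool flags are assumptions based on standard JCEKS secret-key usage, since the guide shows only the key names and password; the keystore location must match hadoop.kms.key.provider.uri:

```shell
# Sketch of the key-generation step; keytool flags are assumed (standard JCEKS
# secret-key usage), as the guide names only the keys and the password.
STOREPASS=123456
# On the KMS host, with a JDK on the PATH:
#   keytool -genseckey -alias user_a_key -keyalg AES -keysize 128 \
#           -storetype jceks -keystore kms.keystore \
#           -storepass ${STOREPASS} -keypass ${STOREPASS}
#   keytool -genseckey -alias user_b_key ...   # same flags, different alias

# KMS reads the keystore password from a file on its classpath
# (deployed under .../kms/WEB-INF/classes/ per the guide):
PASS_FILE="kms.keystore.password"
printf '%s\n' "${STOREPASS}" > "${PASS_FILE}"
```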
Encryption Zone Creation:

```shell
[hadp@BJ-PRESTO-TEST-100080 ~]$ hadoop key create user_a_key
[hadp@BJ-PRESTO-TEST-100080 ~]$ hdfs dfs -mkdir /user_a
[hadp@BJ-PRESTO-TEST-100080 ~]$ hdfs dfs -chown user_a:test_group /user_a
[hadp@BJ-PRESTO-TEST-100080 ~]$ hdfs crypto -createZone -keyName user_a_key -path /user_a
```

Repeat for /user_b with user_b_key.
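The per-user sequence follows one pattern, so it can be scripted. This sketch (a hypothetical helper, not from the guide) only prints the commands; on a live cluster the output could be piped into sh:

```shell
# Print the zone-creation command sequence for each user.
# Users, keys, and group match the guide; the loop itself is a sketch.
CMDS=$(for u in user_a user_b; do
  printf 'hadoop key create %s_key\n' "$u"
  printf 'hdfs dfs -mkdir /%s\n' "$u"
  printf 'hdfs dfs -chown %s:test_group /%s\n' "$u" "$u"
  printf 'hdfs crypto -createZone -keyName %s_key -path /%s\n' "$u" "$u"
done)
echo "$CMDS"
```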
User Permission Configuration adds users to hadoop-policy.xml (security.client.protocol.acl) and refreshes ACLs on both NameNodes:

```shell
[hadp@BJ-PRESTO-TEST-100080 ~]$ hdfs dfsadmin -refreshServiceAcl
```

Client nodes set environment variables in ~/.bashrc to point to the Hadoop binaries.
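For reference, a security.client.protocol.acl value lists comma-separated users, then a space, then comma-separated groups. A sketch using this guide's principals (the exact value in the original deployment may differ):

```xml
<property>
  <name>security.client.protocol.acl</name>
  <value>hadp,user_a,user_b test_group</value>
</property>
```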
Verification:
Upload test.txt as user_a and user_b – succeeds.
Upload as user_c – fails with permission denied.
Read an encrypted file as its owner – succeeds; reading another user's encrypted file yields a DECRYPT_EEK ACL error.
Inspect raw file under /.reserved/raw – encrypted content appears as binary, confirming transparent encryption.
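The raw-read check works because every file in an encryption zone is also visible, undecrypted, under the /.reserved/raw prefix. This sketch only derives that path (the test file name is an assumption, and reading it needs the live cluster):

```shell
# Map a path inside an encryption zone to its raw (ciphertext) twin.
ZONE_FILE="/user_a/test.txt"              # assumed name of the verification test file
RAW_FILE="/.reserved/raw${ZONE_FILE}"
echo "${RAW_FILE}"
# On the cluster, as the HDFS superuser:
#   hdfs dfs -cat "${RAW_FILE}" | head -c 64   # binary ciphertext, not plaintext
```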
Common Issues & Solutions:
"Access denied for user … Superuser privilege required" – add the users to the supergroup on both NameNodes.
"… can't be moved from an encryption zone" – the trash directory lives outside the zone, so files must bypass it: delete with hdfs dfs -rm -r -skipTrash.
Conclusion: By deploying Hadoop KMS with simple authentication, configuring the necessary XML files, creating keys and encryption zones, and adjusting user ACLs, transparent encryption of HDFS data is achieved and verified.
JD Tech