
How We Rolled Out a Massive HDFS 2.6→3.1 Upgrade on a 10,000‑Node Cluster

This article details the end‑to‑end process of migrating a 10,000‑node offline data‑warehouse from CDH 5.14.4 (HDFS 2.6.0) to HDP 3.1.4 (HDFS 3.1.1), covering version selection, rolling‑upgrade strategy, incompatibility fixes, client handling, tool coexistence, testing, automation, and lessons learned.

vivo Internet Technology

Background

The offline data‑warehouse was built on CDH 5.14.4, which bundles Hadoop 2.6.0+cdh5.14.4+2785. The environment grew to ten HDFS clusters with ~10,000 nodes, exposing several pain points:

Frequent NameNode RPC performance degradation causing Hive/Spark job latency.

Complex patch management and service restarts increasing operational cost.

ViewFS‑based client configuration updates were time‑consuming.

HDFS 2.x lacked Erasure Coding, inflating storage costs for cold data.

Why Upgrade to HDFS 3.x

Hadoop 3.x adds Erasure Coding, support for multiple NameNodes, Router-Based Federation, Standby NameNode reads, FairCallQueue, and an intra-DataNode balancer. Together these features improve stability and performance and reduce storage cost.

Version Selection

Because the stack ran on CDH 5.14.4, the team first evaluated a CDH upgrade, but CDH 7 is a commercial release. They chose the free Hortonworks distribution instead: HDP 3.1.4.0-315 (based on Apache Hadoop 3.1.1), managed with Ambari.

Upgrade Strategy

HDFS provides two upgrade modes:

Express: stop all services, upgrade, and restart (requires downtime).

RollingUpgrade: upgrade nodes one by one without interrupting service.

Because an Express upgrade requires downtime and the NameNodes serve latency-sensitive Hive/Spark workloads, the team selected the RollingUpgrade path.

Rollback Options

Rollback: restores both the previous HDFS version and the pre-upgrade data state. This is risky because data written after the upgrade started is lost.

RollingDowngrade: rolls back only the software version and preserves the data.

The team chose RollingDowngrade to avoid data loss.

Client Upgrade Plan

Only server components (NameNode, JournalNode, DataNode) were upgraded initially. Client libraries (Hive, Spark, Flink, etc.) remained on HDFS 2.6.0 and were upgraded later after compatibility verification.

Rolling Upgrade Steps

Upgrade each JournalNode: stop, replace binaries, and restart.

Prepare the NameNode by generating a rollback fsimage file.

Restart the Standby NameNode and its ZKFC using the new Hadoop binaries.

Perform HA failover so the upgraded NameNode becomes Active.

Restart the second NameNode and its ZKFC.

Restart all DataNode instances in a rolling fashion with the new binaries.

Execute Finalize to confirm the cluster is now on the new version.
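The steps above map onto standard `hdfs dfsadmin` and `hdfs haadmin` commands. The following is a minimal sketch, not the team's actual tooling: it only builds the ordered command plan for one HA nameservice and does not execute anything. The NameNode IDs, the DataNode IPC addresses, and the `build_rolling_upgrade_plan` helper are hypothetical. Replacing binaries and restarting daemons is left as comments because it depends on the deployment tool.

```python
from typing import List

Command = List[str]


def build_rolling_upgrade_plan(
    active_nn: str, standby_nn: str, datanodes: List[str]
) -> List[Command]:
    """Return the HDFS admin commands for a rolling upgrade, in order.

    active_nn / standby_nn are HA NameNode IDs (hypothetical, e.g. "nn1").
    datanodes are "host:ipc_port" strings (hypothetical values).
    """
    plan: List[Command] = []

    # JournalNodes are stopped, upgraded, and restarted out of band,
    # one at a time, before this plan starts.

    # Create the rollback fsimage, then poll until it is ready.
    plan.append(["hdfs", "dfsadmin", "-rollingUpgrade", "prepare"])
    plan.append(["hdfs", "dfsadmin", "-rollingUpgrade", "query"])

    # The standby NameNode and its ZKFC are restarted on the new
    # binaries out of band, then the upgraded node is made active.
    plan.append(["hdfs", "haadmin", "-failover", active_nn, standby_nn])

    # The former active NameNode and its ZKFC are restarted out of band.

    # DataNodes: ask each one to shut down for upgrade, then check that
    # it is reachable again after it was restarted on the new binaries.
    for dn in datanodes:
        plan.append(["hdfs", "dfsadmin", "-shutdownDatanode", dn, "upgrade"])
        plan.append(["hdfs", "dfsadmin", "-getDatanodeInfo", dn])

    # Finalize only after the cluster has been verified on the new version.
    plan.append(["hdfs", "dfsadmin", "-rollingUpgrade", "finalize"])
    return plan


if __name__ == "__main__":
    for cmd in build_rolling_upgrade_plan("nn1", "nn2", ["dn1.example:50020"]):
        print(" ".join(cmd))
```

Keeping the plan as data makes it easy to review or dry-run before a driver script runs each command and checks its exit status.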

Coexistence of Management Tools

The original cluster used Cloudera Manager (CM) to manage HDFS, YARN, Hive, and HBase. After the upgrade, Ambari manages HDFS 3.x while CM continues to manage the other components, including the HDFS 2.x client side and ZooKeeper.

[Figure: Ambari and CM coexistence diagram]

Issues Encountered During Upgrade

Community‑Fixed Incompatibilities

HDFS‑13596: Standby NameNode failed to read EditLog after EC data structures were written.

HDFS-14396: Fsimage incompatibility when downgrading from 3.x to 2.x.

HDFS‑14831: StringTable changes broke downgrade compatibility.

Relevant URLs: https://issues.apache.org/jira/browse/HDFS-13596, https://issues.apache.org/jira/browse/HDFS-14396, https://issues.apache.org/jira/browse/HDFS-14831

JournalNode Unknown Protocol

During JournalNode upgrade the following exception appeared:

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RpcNoSuchProtocolException): Unknown protocol: org.apache.hadoop.hdfs.qjournal.protocol.InterQJournalProtocol

The issue was addressed in HDFS‑14942, which downgraded the log level to DEBUG; the error does not affect the upgrade.

URL: https://issues.apache.org/jira/browse/HDFS-14942

DatanodeProtocol.proto Incompatibility

The parameter ordering in DatanodeProtocol.proto differs between HDFS 2.6.0 and 3.1.1, which caused BlockReport failures. The root cause was HDFS-9788, which added compatibility code that 2.6.0 lacks. The team reverted this change in their build to stay compatible with HDFS 2.6.0.

URL: https://issues.apache.org/jira/browse/HDFS-9788

NameNode LayoutVersion Change

HDFS 3.1.1 adds the NameNode features TRUNCATE, APPEND_NEW_BLOCK, QUOTA_BY_STORAGE_TYPE, and ERASURE_CODING, which move the NameNode layout version from -60 to -64 and break downgrade. The team changed minCompatLV to -60 in the source so that a downgrade to 2.6.0 became possible.
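To see why the minCompatLV change matters, here is a minimal sketch of the decision, assuming the NameNode keeps writing the old layout during a rolling upgrade only when the on-disk layout version satisfies the software's minimum compatible layout version. The function names and the stock value of -61 are assumptions for illustration. The -60 and -64 values come from the text above.

```python
def effective_layout_version(
    rolling_upgrade: bool, storage_lv: int, min_compat_lv: int, current_lv: int
) -> int:
    """Pick the layout version the NameNode writes with (simplified model).

    Layout versions are negative: a newer layout has a smaller number,
    so "storage_lv <= min_compat_lv" means "storage is new enough".
    """
    if rolling_upgrade and storage_lv <= min_compat_lv:
        # Old layout is still acceptable: keep it, downgrade stays possible.
        return storage_lv
    # Otherwise the new software switches to its own layout.
    return current_lv


def downgrade_possible(storage_lv: int, min_compat_lv: int, current_lv: int) -> bool:
    """True if metadata written during the rolling upgrade stays on the old layout."""
    return effective_layout_version(True, storage_lv, min_compat_lv, current_lv) == storage_lv


if __name__ == "__main__":
    OLD_LV, NEW_LV = -60, -64          # from the article
    STOCK_MIN_COMPAT = -61             # assumed stock value, for illustration
    PATCHED_MIN_COMPAT = -60           # the team's change

    print(downgrade_possible(OLD_LV, STOCK_MIN_COMPAT, NEW_LV))
    print(downgrade_possible(OLD_LV, PATCHED_MIN_COMPAT, NEW_LV))
```

Under these assumptions the stock setting forces the new layout as soon as the upgraded NameNode starts, while the patched value keeps the 2.6.0 layout until the upgrade is finalized.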

DataNode LayoutVersion Change

HDFS-8791 changed the DataNode block directory layout from 256×256 to 32×32 subdirectories, which changes the DataNode layout version from -56 to -57 and triggers a layout upgrade on each DataNode. On large DataNodes the upgrade time became unacceptable (5 minutes). The team reverted the patch, which reduced the upgrade time to under 3 minutes for nodes holding 1-2 million blocks.
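The cost of the layout change is easier to see with the directory mapping written out. The sketch below assumes the two-level directory is derived from bits of the block ID, with 8 bits per level for the 256×256 layout and 5 bits per level for 32×32. The `block_subdir` helper and the sample block ID are hypothetical, and the real DataNode code may differ in detail.

```python
def block_subdir(block_id: int, bits_per_level: int) -> str:
    """Map a block ID to its two-level subdirectory (simplified model).

    bits_per_level=8 models the 256x256 layout,
    bits_per_level=5 models the 32x32 layout.
    """
    mask = (1 << bits_per_level) - 1
    d1 = (block_id >> 16) & mask
    d2 = (block_id >> 8) & mask
    return f"subdir{d1}/subdir{d2}"


if __name__ == "__main__":
    block_id = 0x12345678  # hypothetical block ID
    print(block_subdir(block_id, 8))  # location under the 256x256 layout
    print(block_subdir(block_id, 5))  # location under the 32x32 layout
```

Because the same block usually lands in a different directory under the two layouts, a layout upgrade has to relocate every block file on the node, so the time grows with the block count.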

[Figure: DataNode layout upgrade flow]

DataNode Trash Handling

During a rolling upgrade, deleted blocks are moved to a trash directory instead of being removed, so that a rollback can restore them. Because the clusters' disk usage was already ~80%, the team scripted periodic cleanup of the trash to avoid storage pressure.
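A periodic cleanup of this kind could look like the sketch below. It is not the team's script. It assumes the trash lives under `<data dir>/current/BP-*/trash`, and the `purge_old_trash` helper, the age threshold, and the dry-run default are all hypothetical choices.

```python
import time
from pathlib import Path
from typing import List, Optional


def purge_old_trash(
    data_dir: str,
    max_age_seconds: float,
    now: Optional[float] = None,
    dry_run: bool = True,
) -> List[Path]:
    """Delete trash files older than max_age_seconds under one data dir.

    Assumes the layout <data_dir>/current/BP-*/trash (an assumption).
    Returns the files that were, or in dry-run mode would be, deleted.
    """
    now = time.time() if now is None else now

    # Collect candidates first so deletion does not disturb the directory walk.
    candidates: List[Path] = []
    for trash_root in Path(data_dir).glob("current/BP-*/trash"):
        for path in trash_root.rglob("*"):
            if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
                candidates.append(path)

    if not dry_run:
        for path in candidates:
            path.unlink()
    return sorted(candidates)
```

Note that purging the trash trades rollback safety for disk space: blocks removed this way can no longer be restored by a rollback, so the age threshold should be chosen with that in mind.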

Other Fixes

HDFS‑13671 reverted a FoldedTreeSet change that caused RPC slowdown in NameNode, restoring the lightweight set implementation and reducing stale DataNode counts.

URL: https://issues.apache.org/jira/browse/HDFS-13671

Testing and Production Rollout

From March 2021 to January 2022 the team ran multiple test-cluster upgrades, ran full compatibility tests against Hive, Spark, Kylin, Presto, and Druid, and automated the upgrade steps with Python scripts that call the CM and Ambari APIs. Key milestones:

Mar‑Apr 2021: feature analysis, source code review, patch integration.

May‑Aug 2021: upgrade‑downgrade rehearsals and full compatibility validation.

Sep 2021: upgrade of a 100‑node log‑aggregation cluster without service impact.

Nov 2021: upgrade of seven offline-warehouse clusters (~5,000 nodes) with zero user impact.

Jan 2022: final upgrade of ten clusters (~10,000 nodes) completed successfully.

Post‑upgrade monitoring confirmed stable HDFS service across all clusters.
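The automation mentioned above drove the upgrade through the CM and Ambari APIs. As an illustration only, the sketch below builds (but does not send) an Ambari REST request that changes the state of one host component, which is the kind of call a rolling DataNode restart needs. The team's scripts are not shown in the source, so the `build_component_state_request` helper, the URL, the cluster and host names, and the credentials are hypothetical.

```python
import base64
import json
import urllib.request


def build_component_state_request(
    base_url: str,
    cluster: str,
    host: str,
    component: str,
    state: str,
    user: str,
    password: str,
    context: str,
) -> urllib.request.Request:
    """Build an Ambari PUT request that sets a host component's state.

    state is "INSTALLED" to stop the component or "STARTED" to start it.
    """
    url = (
        f"{base_url}/api/v1/clusters/{cluster}"
        f"/hosts/{host}/host_components/{component}"
    )
    body = {
        "RequestInfo": {"context": context},
        "Body": {"HostRoles": {"state": state}},
    }
    req = urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), method="PUT"
    )
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", f"Basic {token}")
    # Ambari rejects modifying requests that lack this header.
    req.add_header("X-Requested-By", "ambari")
    return req


if __name__ == "__main__":
    req = build_component_state_request(
        "http://ambari.example:8080", "warehouse1", "dn1.example",
        "DATANODE", "INSTALLED", "admin", "secret", "Stop DataNode for upgrade",
    )
    print(req.get_method(), req.full_url)
    # Sending would be: urllib.request.urlopen(req)
```

Separating request construction from sending keeps the script easy to dry-run against a host list before touching a production cluster.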

Conclusion

The year-long effort migrated ten HDFS clusters (~10,000 nodes) from CDH 5.14.4 (HDFS 2.6.0) to HDP 3.1.4 (HDFS 3.1.1), moved HDFS management from CM to Ambari, and laid the groundwork for future YARN, Hive/Spark, and HBase upgrades. The documented process serves as a reference for similar large-scale migrations.
