How Hulu Upgraded Hadoop 2.6 to 3.0: Lessons in Compatibility and Migration
This article details Hulu's upgrade from Hadoop 2.6 to 3.0, carried out five years after Hadoop 3 was first released. It covers the major feature evolutions in Hadoop 3, the original cluster architecture, the upgrade plan, compatibility challenges across HDFS, YARN, Hive, Spark and Flink, and the testing and rollout strategies that ensured a smooth migration.
Background
Hadoop 3 was released five years ago and has since evolved to version 3.3.2, introducing features such as mature HDFS erasure coding, simplified HDFS RBF client configuration, multi‑standby NameNodes, Docker support in YARN, dynamic resource‑allocation APIs, and improved federation.
Original Cluster Architecture
Before the upgrade, Hulu's Hadoop cluster ran on CDH5.7.3 with Hadoop 2.6.0, comprising thousands of servers, hundreds of petabytes of data, and core services HDFS, YARN, and Hive. Access was mediated by the Firework client, which encapsulated open‑source tools and provided dynamic configuration and version updates.
Upgrade Scope and Timeline
The upgrade covered most components—Cloudera, HDFS, YARN, Hive, HBase, Zookeeper, Sentry—moving from CDH5.7.3 to CDH6.3.3 (Hadoop 3.0.0). Testing began in Q2 2021 and production rollout occurred in July after four months of validation.
Compatibility Considerations
Four compatibility dimensions were examined:
Client‑service interface compatibility
Inter‑service component compatibility
Component‑storage state compatibility
User‑interface syntax and semantics compatibility
Key issues discovered included:
HDFS Block Access Token schema change (HDFS‑6708) requiring a patch (HDFS‑15191) on Hadoop 2.6.
Datanode directory hash restructuring (HDFS‑8791) necessitating pre‑upgrade block relocation.
Changes to HDFS chmod sticky‑bit handling (HDFS‑10689) and heap‑size environment variables.
YARN token identifier serialization shift to Protocol Buffers (YARN‑668) with a backward‑compatible patch (YARN‑8310) that still required a cache for original byte arrays.
Hive 2.1 metadata schema changes and numerous SQL syntax deprecations, prompting temporary keyword reverts.
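To make the Datanode directory restructuring item concrete, the following is a minimal sketch of how a block ID maps to its two-level subdirectory before and after HDFS‑8791. It assumes the change narrows each directory level from 8 bits of the block ID (256 x 256 directories) to 5 bits (32 x 32 directories), and that directories use the "subdir" prefix. The class name, method names, and sample block IDs are hypothetical, and this is not the Datanode's actual code.

```java
/** Sketch of the block-ID-to-directory mapping before and after HDFS-8791 (assumed masks). */
final class BlockDirLayout {

    /** Assumed old layout: 8 bits per level, i.e. 256 x 256 subdirectories. */
    static String oldDir(long blockId) {
        int d1 = (int) ((blockId >> 16) & 0xFF);
        int d2 = (int) ((blockId >> 8) & 0xFF);
        return "subdir" + d1 + "/subdir" + d2;
    }

    /** Assumed new layout: 5 bits per level, i.e. 32 x 32 subdirectories. */
    static String newDir(long blockId) {
        int d1 = (int) ((blockId >> 16) & 0x1F);
        int d2 = (int) ((blockId >> 8) & 0x1F);
        return "subdir" + d1 + "/subdir" + d2;
    }

    /** A block has to be relocated when the two layouts disagree on its directory. */
    static boolean mustMove(long blockId) {
        return !oldDir(blockId).equals(newDir(blockId));
    }

    public static void main(String[] args) {
        long[] sampleIds = {0x40C84D01L, 0x40030501L}; // hypothetical block IDs
        for (long id : sampleIds) {
            System.out.println(id + ": " + oldDir(id) + " -> " + newDir(id)
                    + (mustMove(id) ? " (move)" : " (stays)"));
        }
    }
}
```

Under these assumptions, only blocks whose old directory indices are both below 32 keep their path. Every other block lands in a different directory, which suggests why relocating blocks ahead of the upgrade was preferable to doing all of that work during Datanode restart.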
Impact on Spark and Flink
Most production Spark jobs used Hive 1.x and Hadoop 2.x. Upgrading to Hadoop 3.x and Hive 2.1 introduced library conflicts; patches HIVE‑15016, HIVE‑16081, and HIVE‑16131 were applied to Hive 1.x, and JvmPauseMonitor interface changes were fixed. Flink worked with Hive 2.x and Hadoop 3.x without issues.
Classloader and SPI Challenges
During the upgrade, Spark and Flink classloader hierarchies were analyzed. The default URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory()) call in Spark loaded FileSystem providers via the Service Provider Interface (SPI). Because Hadoop 2.6 and 3.0 provided different HttpFileSystem implementations, the merged SPI configuration caused HTTP URLs to be handled by the Hadoop 3 provider, breaking non‑HDFS HTTP accesses.
To mitigate this, a forward‑compatible Spark/Flink runtime containing full Hadoop 2 dependencies was deployed, ensuring that user jobs could run on Hadoop 3 clusters while preserving interface compatibility.
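The effect of a merged provider list can be illustrated with a toy model. The sketch below is not Hadoop's or Spark's code: the registry is a plain map, the provider names are placeholders rather than real class names, and the rule that later provider lists override earlier ones is an assumption made for illustration. It only shows why a JVM-wide handler factory backed by such a registry changes how http URLs are resolved once Hadoop 3 entries are present, and why a runtime that ships only Hadoop 2 dependencies avoids the problem.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of scheme lookup when provider lists from several jars are merged. */
final class SchemeRegistryDemo {

    // Placeholder provider lists (scheme -> implementation label); hypothetical contents.
    static final Map<String, String> HADOOP2 =
            Map.of("hdfs", "hadoop2-hdfs", "http", "hadoop2-http");
    static final Map<String, String> HADOOP3 =
            Map.of("hdfs", "hadoop3-hdfs", "http", "hadoop3-http");

    /** Merges provider lists in order; in this model, later lists override earlier ones. */
    static Map<String, String> merge(List<Map<String, String>> providerLists) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> providers : providerLists) {
            merged.putAll(providers);
        }
        return merged;
    }

    /** Models a JVM-wide handler factory: claims any registered scheme, else defers to the JVM. */
    static String handlerFor(String scheme, Map<String, String> registry) {
        return registry.getOrDefault(scheme, "jvm-default");
    }

    public static void main(String[] args) {
        Map<String, String> mixed = merge(List.of(HADOOP2, HADOOP3));
        Map<String, String> hadoop2Only = merge(List.of(HADOOP2));

        System.out.println("mixed classpath, http  -> " + handlerFor("http", mixed));
        System.out.println("hadoop2 runtime, http  -> " + handlerFor("http", hadoop2Only));
        System.out.println("hadoop2 runtime, ftp   -> " + handlerFor("ftp", hadoop2Only));
    }
}
```

In this model the user job's code is unchanged in both cases; only the set of providers visible to the classloader differs, which is the property the Hadoop 2 based runtime relies on.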
Upgrade Procedure
The upgrade proceeded in three phases:
Upgrade Cloudera, Sentry, Zookeeper (no downtime).
Stop Hive services, upgrade Hive metadata, then restart; YARN was also stopped and rebuilt.
Perform a rolling upgrade of HDFS (JournalNode → NameNode → Datanode), taking roughly two hours per namespace and three weeks for full Datanode rollout.
Outcome and Future Work
The migration was largely successful, providing deeper insight into the big‑data stack. Remaining gaps to the latest Hadoop 3.3 include performance tuning, container‑based isolation, and further dependency management. Future directions involve tighter cloud integration and continued Spark version upgrades.
Hulu Beijing
