What’s New in Hadoop 3.0? Key Features and Improvements Explained
Hadoop 3.0, built on JDK 1.8, adds erasure coding in HDFS, support for multiple NameNodes, native task optimizations in MapReduce, cgroup-based memory and disk I/O isolation in YARN, and container resizing. An alpha is slated for summer, with a GA release expected in November or December.
1. Hadoop 3.0 Overview
Hadoop 2.0 was built on JDK 1.7, which stopped receiving public updates in April 2015. This prompted the community to base a new major release, Hadoop 3.0, on JDK 1.8.
The Hadoop 3.0 alpha is expected to be released in summer, with the GA version slated for November or December.
Hadoop 3.0 introduces several important features and optimizations: erasure coding in HDFS, support for multiple NameNodes, native task optimization in MapReduce, cgroup-based memory and disk I/O isolation in YARN, and YARN container resizing.
2. New Features in Hadoop 3.0
2.1 Hadoop Common
A streamlined core: deprecated APIs and implementations are removed, and default components are replaced with more efficient ones (e.g., FileOutputCommitter defaults to v2, webhdfs replaces hftp, and org.apache.hadoop.record is removed).
Classpath isolation to prevent jar conflicts between Hadoop, HBase, Spark, etc.
Shell script refactoring, fixing many bugs and adding dynamic command support.
2.2 Hadoop HDFS
Support for erasure coding, which cuts storage overhead roughly in half compared with 3x replication without compromising reliability.
Multiple NameNode support, allowing one active and multiple standby NameNodes in a cluster (multi‑ResourceManager already supported in Hadoop 2.0).
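The "half the storage" claim for erasure coding can be checked with a little arithmetic. The sketch below assumes a Reed-Solomon layout of 6 data units plus 3 parity units, which is a common choice but not one named in this article, and compares it with 3x replication. It also includes a toy single-parity XOR example to show the general idea of rebuilding a lost block from the survivors. Real HDFS erasure coding uses Reed-Solomon codes rather than plain XOR, so the second half is an illustration of the principle only.

```python
from fractions import Fraction
from functools import reduce


def storage_overhead(data_units: int, parity_units: int) -> Fraction:
    """Raw units stored per unit of user data."""
    return Fraction(data_units + parity_units, data_units)


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (toy stand-in for a real code)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# 3x replication: one original plus two extra copies; survives the loss of 2 copies.
replication = storage_overhead(1, 2)
# Assumed RS(6,3) layout: 6 data + 3 parity units; survives the loss of any 3 units.
rs_6_3 = storage_overhead(6, 3)

print(f"3x replication stores {replication}x the data")
print(f"RS(6,3) stores {rs_6_3}x the data")
print(f"ratio: {rs_6_3 / replication}")

# Toy recovery: one XOR parity block can rebuild any single lost data block.
data = [b"hdfs", b"yarn", b"mapr"]
parity = xor_blocks(data)
rebuilt = xor_blocks([data[0], data[2], parity])  # pretend data[1] was lost
print(rebuilt)
```

Under these assumptions the erasure-coded layout stores 1.5x the user data instead of 3x, which is where the roughly 50% saving comes from.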
2.3 Hadoop MapReduce
Native task optimization: a C/C++ implementation of the map output collector (spill, sort, IFile) improves the performance of shuffle-intensive jobs by about 30%.
Automatic inference of memory parameters, simplifying configuration of mapreduce.{map,reduce}.memory.mb and mapreduce.{map,reduce}.java.opts.
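To make the memory inference concrete, here is a minimal sketch of how one of the two settings could be derived from the other. It assumes a fixed heap-to-container ratio of 0.8, which mirrors my understanding of the mapreduce.job.heap.memory-mb.ratio setting; the function names and the rounding are illustrative choices, not Hadoop's actual code.

```python
import math
import re
from fractions import Fraction

# Assumption: heap is 80% of the container; the rest is left for non-heap JVM memory.
HEAP_RATIO = Fraction(4, 5)

_XMX = re.compile(r"-Xmx(\d+)([kKmMgG]?)")


def parse_xmx_mb(java_opts):
    """Return the -Xmx heap size in MB, or None if java_opts does not set one."""
    match = _XMX.search(java_opts or "")
    if match is None:
        return None
    value, unit = int(match.group(1)), match.group(2).lower()
    bytes_per_unit = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}[unit]
    return value * bytes_per_unit // 1024**2


def infer_memory(memory_mb=None, java_opts=None, ratio=HEAP_RATIO):
    """Fill in whichever of (container MB, heap MB) was left unset.

    If both are set they are returned unchanged; if neither is set,
    (None, None) is returned and a real cluster would fall back to defaults.
    """
    heap_mb = parse_xmx_mb(java_opts)
    if memory_mb is not None and heap_mb is None:
        heap_mb = math.ceil(memory_mb * ratio)
    elif memory_mb is None and heap_mb is not None:
        memory_mb = math.ceil(heap_mb / ratio)
    return memory_mb, heap_mb


print(infer_memory(memory_mb=2048))                   # only memory.mb given
print(infer_memory(java_opts="-Xmx1024m"))            # only java.opts given
print(infer_memory(java_opts="-Xmx2g -XX:+UseG1GC"))  # heap given in gigabytes
```

The practical effect is that an operator only has to set one of the two values per task type, and the other follows from it instead of drifting out of sync.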
2.4 Hadoop YARN
cgroup‑based memory and disk I/O isolation.
Curator‑based ResourceManager leader election.
Container resizing support.
Next‑generation TimelineServer.
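Container resizing means a running container's allocation can grow or shrink in place, rather than being released and requested again. The toy model below illustrates only the bookkeeping involved: a resize succeeds when the node still has room for the new size. The Node class and its methods are hypothetical names for illustration and are not part of the YARN API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a node's memory accounting."""
    capacity_mb: int
    containers: dict = field(default_factory=dict)  # container id -> allocated MB

    def used_mb(self) -> int:
        return sum(self.containers.values())

    def launch(self, container_id: str, mb: int) -> bool:
        """Start a container if the node has enough free memory."""
        if self.used_mb() + mb > self.capacity_mb:
            return False
        self.containers[container_id] = mb
        return True

    def resize(self, container_id: str, new_mb: int) -> bool:
        """Change a running container's allocation in place if it still fits."""
        current = self.containers[container_id]
        if self.used_mb() - current + new_mb > self.capacity_mb:
            return False
        self.containers[container_id] = new_mb
        return True


node = Node(capacity_mb=8192)
node.launch("c1", 4096)
node.launch("c2", 2048)

grew = node.resize("c1", 6144)      # fits exactly: 6144 + 2048 = 8192
rejected = node.resize("c2", 4096)  # would need 10240 MB, more than the node has
shrank = node.resize("c1", 1024)    # shrinking frees memory for others
retried = node.resize("c2", 4096)   # now there is room
print(grew, rejected, shrank, retried, node.containers)
```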
3. Hadoop 3.0 Summary
Hadoop 3.0’s alpha is expected this summer, with GA in November or December, bringing major enhancements such as erasure‑coded HDFS, multi‑NameNode, native MapReduce tasks, and improved YARN isolation and container management.
Hulu Beijing