Weekly Hadoop Knowledge Points: Compression Formats, MapReduce Join, Hive Setup, and YARN Capacity Scheduler
This weekly bulletin summarizes four Hadoop knowledge points: compression formats, MapReduce join techniques, Hive installation, and the YARN Capacity Scheduler. It also shares personal updates about a PhD graduation and the upcoming May Day holiday, and closes with a request for likes and shares.
Today is the Grain Rain solar term. This weekly newsletter presents four Hadoop‑related knowledge points.
01. Compression formats supported by Hadoop – an overview of common formats such as Gzip, LZO, Snappy, and Bzip2, their advantages, disadvantages, and typical use cases.
02. MapReduce Join – explanation of map‑side and reduce‑side joins, with two code snippets that can be used directly in projects and are frequently asked about in interviews.
03. Setting up Hive on Hadoop – a step‑by‑step guide to installing Hive, providing a foundation for future Hive learning.
04. YARN Capacity Scheduler – description of the scheduler’s features and common configuration options.
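The two join styles named in item 02 can be illustrated with a small sketch. This is plain Python that imitates the map, shuffle, and reduce steps in memory; it is not the article's own snippet and does not use the Hadoop API. The sample tables (`users`, `orders`) and the function names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (user_id, name) and (user_id, order_amount).
users = [(1, "alice"), (2, "bob"), (3, "carol")]
orders = [(1, 30), (1, 12), (3, 7), (4, 99)]


def reduce_side_join(users, orders):
    """Inner join done in the reduce step, after grouping by key."""
    # Map step: tag each record with its source table and emit it
    # under the join key. The dict stands in for the shuffle.
    shuffled = defaultdict(list)
    for user_id, name in users:
        shuffled[user_id].append(("U", name))
    for user_id, amount in orders:
        shuffled[user_id].append(("O", amount))

    # Reduce step: for each key, pair every user record with every
    # order record that arrived under the same key.
    joined = []
    for user_id in sorted(shuffled):
        names = [v for tag, v in shuffled[user_id] if tag == "U"]
        amounts = [v for tag, v in shuffled[user_id] if tag == "O"]
        for name in names:
            for amount in amounts:
                joined.append((user_id, name, amount))
    return joined


def map_side_join(users, orders):
    """Inner join done in the map step, with the small table in memory."""
    # Assumption: the users table is small enough to hold in memory on
    # every mapper, so no shuffle of the large table is needed.
    lookup = dict(users)
    return [
        (user_id, lookup[user_id], amount)
        for user_id, amount in orders
        if user_id in lookup
    ]


if __name__ == "__main__":
    print(reduce_side_join(users, orders))
    print(map_side_join(users, orders))
```

Both functions produce the same joined rows here. The trade-off the sketch is meant to show is that the reduce-side version moves every record through the grouping step, while the map-side version only works when one table fits in memory.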
No article was posted on Friday because the author stayed up late chatting with his brother, who has just earned his PhD and was selected for the 2019 National Postdoctoral Innovation Talent Support Program (only 400 places nationwide).
The May Day holiday begins next week; a third of 2019 is already gone, and time flies.
Likes and shares are the greatest support~
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
