Tagged articles
191 articles
Raymond Ops
Jan 30, 2026 · Big Data

Build an Enterprise‑Grade HDFS HA and YARN Scheduler from Scratch

This guide walks you through designing and deploying a highly available HDFS architecture with dual NameNodes, ZooKeeper‑based failover, and a tuned YARN resource scheduler, covering detailed configuration files, failover testing, performance tuning, monitoring, automated health checks, capacity planning, and best‑practice checklists for production‑grade big‑data platforms.

Big Data, HA, HDFS
0 likes · 28 min read
MaGe Linux Operations
Sep 8, 2025 · Big Data

Build Enterprise‑Grade HDFS HA and Optimize YARN Scheduling from Scratch

This comprehensive guide walks you through constructing a fault‑tolerant HDFS high‑availability architecture, configuring dual NameNodes with ZooKeeper and JournalNode clusters, fine‑tuning YARN resource schedulers, implementing monitoring, automated failover testing, and performance optimization, all backed by real‑world production experiences and code examples.

Big Data Operations, HDFS, YARN
0 likes · 24 min read
Mike Chen's Internet Architecture
Aug 29, 2025 · Fundamentals

Understanding Distributed Storage: HDFS, CephFS, GlusterFS, and FastDFS Compared

This article compares four major distributed storage solutions—HDFS, CephFS, GlusterFS, and FastDFS—detailing their architectures, strengths, weaknesses, and ideal use cases for big‑data processing, cloud-native environments, and high‑concurrency file services, and how they fit into modern infrastructure strategies.

Big Data, CephFS, FastDFS
0 likes · 5 min read
Big Data Tech Team
Jun 8, 2025 · Big Data

Master Hadoop: A Step-by-Step Learning Roadmap for Big Data Professionals

This guide outlines a comprehensive Hadoop learning roadmap, covering essential prerequisites, core concepts such as HDFS, MapReduce, and YARN, hands‑on projects, advanced ecosystem tools like Hive, Pig, HBase and Spark, plus curated resources and community channels for aspiring big‑data engineers.

HDFS, Hadoop, MapReduce
0 likes · 7 min read
360 Zhihui Cloud Developer
May 9, 2025 · Big Data

Mastering Multi‑AZ Replication in HDFS with AZ Mover

This article introduces AZ Mover, a lightweight HDFS client‑side tool that intelligently scans, schedules, and migrates block replicas across multiple availability zones, detailing its design goals, core workflow, command‑line options, concurrency controls, and future enhancements for robust big‑data disaster recovery.

AZ Mover, Data Governance, HDFS
0 likes · 9 min read
IT Services Circle
Feb 9, 2025 · Big Data

Understanding HDFS: Architecture, Data Blocks, Fault Tolerance, and High Availability

This article explains how HDFS, the Hadoop Distributed File System, splits large files into blocks, replicates them for fault tolerance, organizes the cluster into NameNode and DataNode components, and provides high‑availability and scalability mechanisms such as standby NameNode and federation, enabling reliable big‑data storage and access.

Big Data, DataNode, Distributed File System
0 likes · 11 min read
JD Retail Technology
Oct 29, 2024 · Big Data

JD Unified Storage Practice: Cross‑Region and Tiered Storage on HDFS

This article details JD's large‑scale HDFS unified storage implementation, covering cross‑region storage challenges, topology design, asynchronous block replication, flow‑control mechanisms, tiered storage strategies, automatic hot‑cold data migration, and the resulting performance and cost improvements for big‑data workloads.

Big Data, Cross-Region Storage, Data Management
0 likes · 20 min read
DataFunSummit
Oct 4, 2024 · Big Data

JD Retail HDFS Unified Storage: Cross‑Region and Tiered Storage Practices

This article presents JD Retail's large‑scale HDFS deployment, detailing its unified storage architecture, cross‑region data replication challenges and solutions, tiered storage strategies for hot, warm and cold data, and the operational modules that together improve performance, reliability and cost efficiency in a big‑data environment.

Big Data, Cross-Region Storage, Distributed File System
0 likes · 21 min read
dbaplus Community
Sep 4, 2024 · Big Data

How Ctrip Scaled Its Data Platform to Multi‑IDC Architecture with Spark 3, Kyuubi, and Celeborn

This article details how Ctrip’s data platform evolved from a single‑IDC design to a multi‑IDC, tiered storage and scheduling architecture, covering the challenges of rapid data growth, the migration to Spark 3 via Kyuubi, the introduction of Celeborn shuffle service, and the resulting performance and reliability gains.

Big Data, HDFS, Kyuubi
0 likes · 23 min read
360 Zhihui Cloud Developer
Aug 8, 2024 · Big Data

How to Migrate HBase and HDFS Clusters Safely Without Downtime

This guide details a step‑by‑step migration plan for HBase and HDFS clusters, covering background, high‑availability architecture, role assignments, expansion and shrinkage of ZooKeeper and JournalNode, NameNode and DataNode migration, rolling restarts, and common upgrade pitfalls.

Big Data, Cluster Migration, HBase
0 likes · 12 min read
WeiLi Technology Team
Jun 28, 2024 · Big Data

How to Build a Robust Big Data Monitoring and Alerting System

This article explains why high‑availability design and comprehensive monitoring are essential for modern big‑data platforms, outlines a layered architecture, and provides practical guidance on health checks, alerting, and data‑quality monitoring across storage, compute, scheduling, and service layers.

Architecture, Flink, HDFS
0 likes · 14 min read
DataFunTalk
May 27, 2024 · Big Data

JD Retail’s Unified HDFS Storage: Cross‑Region and Hierarchical Storage Practices

This article details JD Retail’s large‑scale HDFS deployment, describing how cross‑region storage challenges were solved with a full‑copy topology, asynchronous block replication, flow‑control mechanisms, and a tiered storage strategy that automatically moves hot, warm, and cold data among SSD, HDD, and high‑density HDD nodes to improve performance and cut costs.

Big Data, Data Management, HDFS
0 likes · 20 min read
Bilibili Tech
Apr 26, 2024 · Big Data

Fine-Grained Lock Optimization for HDFS NameNode to Improve Metadata Read/Write Performance

To overcome the NameNode write bottleneck caused by a single global read/write lock in Bilibili’s massive HDFS deployment, the team introduced hierarchical fine‑grained locking, splitting the lock into Namespace, BlockPool, and per‑INode levels. The change yielded up to three‑fold write throughput gains and a 90% drop in RPC queue time, and shifted the performance limit from lock contention to log synchronization.

Big Data, HDFS, NameNode
0 likes · 15 min read
Efficient Ops
Apr 23, 2024 · Big Data

How to Plan, Configure, and Launch a Hadoop 3.3.5 Cluster on Three Nodes

This guide walks through planning a three‑node Hadoop 3.3.5 cluster, explains default and custom configuration files, details core‑site, hdfs‑site, yarn‑site, and mapred‑site settings, shows how to distribute configs, start HDFS and YARN, and perform basic file‑system tests.

Big Data, Cluster Setup, HDFS
0 likes · 11 min read
Linux Code Review Hub
Mar 11, 2024 · Databases

How Didi Built a Next‑Gen Log Storage System with ClickHouse

Didi migrated its massive PB‑scale log data from Elasticsearch to ClickHouse, redesigning storage with separate Log and Trace clusters, optimizing partition and sorting keys, introducing native TCP connectors, and revamping HDFS cold‑hot separation, achieving up to four‑fold query speed gains and 30% lower hardware costs.

ClickHouse, Distributed Systems, Flink
0 likes · 15 min read
DataFunSummit
Feb 6, 2024 · Big Data

Exploring ByteDance's EB‑Scale HDFS: Architecture, Multi‑Datacenter Challenges, Tiered Storage, and Data Protection Practices

This article presents an in‑depth overview of ByteDance's EB‑scale HDFS, covering its new features, multi‑datacenter architecture, tiered storage implementation, data management services, capacity and fault‑tolerance strategies, as well as practical data‑protection mechanisms and related Q&A.

Big Data, Data Protection, HDFS
0 likes · 22 min read
WeiLi Technology Team
Nov 1, 2023 · Big Data

How to Diagnose and Resolve HDFS Safe Mode Issues

This guide explains why HDFS enters safe mode after a DataNode failure, describes the safe‑mode state and its exit conditions, and provides step‑by‑step commands and troubleshooting procedures to analyze, fix, and recover from safe‑mode incidents in Hadoop clusters.

Big Data, Cluster Management, HDFS
0 likes · 10 min read
Su San Talks Tech
Oct 29, 2023 · Operations

What Are the Best Distributed File Storage Systems and How to Choose One?

This article introduces the concept of distributed storage, outlines its key advantages, reviews major distributed file systems such as GFS, HDFS, Ceph, Lustre, TFS, FastDFS, and GridFS, explains POSIX basics, and provides practical criteria for selecting the most suitable system for different workloads.

Ceph, HDFS, Selection Guide
0 likes · 12 min read
政采云技术
Apr 18, 2023 · Big Data

Implementing Data Cost Governance: Quantifying Storage and Compute Expenses with Hive, Spark, and HDFS FsImage

This article explains how to perform task‑level data cost governance by collecting storage and compute metrics from Hive tables, Spark jobs, and HDFS FsImage files, then estimating monthly expenses using replication factors and resource‑usage rates, while providing practical SQL and shell examples.

Data Cost Governance, HDFS, Spark
0 likes · 18 min read
Bilibili Tech
Mar 14, 2023 · Big Data

Bilibili HDFS Erasure Coding Strategy and Implementation

Bilibili reduced petabyte‑scale storage costs by back‑porting erasure‑coding patches to its HDFS 2.8.4 cluster, deploying a parallel EC‑enabled cluster, adding a data‑proxy service, intelligent routing and block‑checking, and automating cold‑data migration, while noting write overhead and planning native acceleration.

Big Data, Data Reliability, Distributed Systems
0 likes · 14 min read
DataFunTalk
Feb 18, 2023 · Big Data

Xiaomi Data Governance Evolution: Cost Governance Practices for HDFS and HBase

The article outlines Xiaomi's data governance journey, focusing on storage‑service cost governance, describing the transition from simple cost‑centered governance to big‑data‑driven asset management, and detailing concrete HDFS and HBase practices that achieved significant resource and cost reductions.

Big Data, Data Governance, HBase
0 likes · 15 min read
dbaplus Community
Feb 8, 2023 · Big Data

How Bilibili Scaled Offline Processing Across Multiple Data Centers

This article details Bilibili's multi‑datacenter offline architecture, explaining the capacity challenges, the chosen scale‑out design, and the implementation of job placement, data replication, routing, versioning, throttling, and traffic analysis to efficiently handle massive batch workloads across geographically distributed clusters.

HDFS, bandwidth optimization, data replication
0 likes · 26 min read
ITPUB
Jan 4, 2023 · Databases

Can Cassandra Beat RDBMS Distributed Bottlenecks? A Deep Dive into Decentralized Databases

The article traces the evolution from Codd's relational model to modern RDBMS scaling limits, explains why centralized Hadoop/HBase architectures struggle with high‑concurrency workloads, and shows how Cassandra’s decentralized design—using consistent hashing, gossip, and virtual nodes—overcomes these bottlenecks while offering flexible consistency guarantees.

Consistency, HBase, HDFS
0 likes · 22 min read
High Availability Architecture
Nov 30, 2022 · Big Data

Design and Implementation of Vivo's Bees Log Collection Agent

This article presents the design principles, core features, and implementation details of Vivo's self‑developed Bees log collection agent, covering file discovery, unique identification, real‑time and offline ingestion, resource control, platform management, and comparisons with open‑source solutions.

HDFS, Kafka, Resource Management
0 likes · 22 min read
vivo Internet Technology
Nov 23, 2022 · Big Data

Design and Implementation of Vivo's Bees Log Collection Agent

Vivo’s Bees‑agent is a custom, lightweight log‑collection service that discovers rotating files via inotify, uniquely identifies them with inode and hash signatures, supports real‑time and offline ingestion to Kafka and HDFS, offers checkpoint‑resume, resource isolation, rich metrics, and a centralized management platform, outperforming open‑source collectors in latency, memory usage, and scalability.

Agent Design, HDFS, Kafka
0 likes · 24 min read
ITPUB
Oct 21, 2022 · Big Data

Hadoop Explained: Architecture, Core Components, and Real-World Applications

This article provides a comprehensive overview of Hadoop, covering its historical development, key characteristics, the HDFS storage framework, the MapReduce processing engine, YARN resource manager, and a wide range of real-world application scenarios, as well as the broader Hadoop ecosystem and its major components.

Big Data, Ecosystem, HDFS
0 likes · 20 min read
ITPUB
Oct 20, 2022 · Big Data

Will HDFS Be Replaced? Analyzing Its Drawbacks and Future Alternatives

The article examines why Hadoop's Distributed File System may become obsolete by detailing its three main shortcomings—deployment complexity, metadata memory limits, and high replication overhead—and explores how newer architectures and erasure coding could address these issues.

Big Data, Distributed File System, HDFS
0 likes · 8 min read
DataFunTalk
Sep 4, 2022 · Big Data

Design and Implementation of Bilibili's Offline Multi‑Datacenter Solution

This article describes Bilibili's offline multi‑datacenter architecture, explaining why a scale‑out approach was chosen over scale‑up, and detailing the unit‑based design, job placement, data replication, routing, versioning, bandwidth throttling, traffic analysis, and the operational results and future directions.

Big Data, HDFS, Job Scheduling
0 likes · 24 min read
ITPUB
Jul 23, 2022 · Information Security

How Bilibili Secured Hadoop: Ranger‑Based HDFS and Hive Access Control Deep Dive

This article details Bilibili's implementation of Apache Ranger for fine‑grained access control across Hadoop, HDFS, Hive, Spark, and Presto, covering architecture, API redesign, admin optimizations, gray‑release strategies, permission pre‑checks, data masking, and future plans for incremental policy loading.

HDFS, Presto, Spark
0 likes · 16 min read
Bilibili Tech
Jul 22, 2022 · Information Security

Design and Optimization of Ranger‑Based Access Control for HDFS and Hive in Bilibili's Data Platform

Bilibili’s data platform redesigns Ranger‑based access control by simplifying HDFS and Hive policy APIs, parallelizing policy loading, adding gray‑release and pre‑check mechanisms, integrating fine‑grained Hive authorization with data‑masking, extending support to Spark and Presto, and planning incremental loading, policy fusion, and a NameNode proxy to boost security and performance.

HDFS, Presto, Spark
0 likes · 15 min read
dbaplus Community
Jul 13, 2022 · Big Data

Unpacking the Core Technologies Behind Modern Big Data Platforms

From data ingestion to real‑time analytics, this guide breaks down the essential layers of a typical big‑data platform—covering collection methods, HDFS storage, Hive/Spark analysis, data sharing mechanisms, application use‑cases, streaming with Spark Streaming, and the need for robust scheduling and monitoring.

Big Data, Data Integration, HDFS
0 likes · 9 min read
ITPUB
Jul 13, 2022 · Big Data

How Bilibili Scaled Offline Processing Across Multiple Data Centers

This article details Bilibili's multi‑datacenter solution for offline big‑data workloads, covering the challenges of capacity limits, the design of a unit‑based architecture, job placement, data replication, routing, versioning, bandwidth throttling, traffic analysis, and future directions.

HDFS, bandwidth optimization, job placement
0 likes · 29 min read
Bilibili Tech
Jul 5, 2022 · Big Data

Multi‑Datacenter Architecture for Offline Big Data Processing at Bilibili

To overcome rapid data growth and on‑premise capacity limits, Bilibili adopted a scale‑out, unit‑based multi‑datacenter architecture that isolates failures, intelligently places jobs, replicates data via an enhanced DistCp service, routes reads with an IP‑aware HDFS router, and throttles cross‑site traffic, enabling stable offline big‑data processing of hundreds of petabytes while preserving throughput.

HDFS, YARN, bandwidth optimization
0 likes · 28 min read
DataFunTalk
Jul 4, 2022 · Big Data

Apache Ozone: Architecture, Advantages, and New Features Overcoming HDFS Limitations

This article explains the shortcomings of HDFS at large scale, describes the Federation and Scaling approaches, and details how Apache Ozone redesigns metadata storage, introduces container abstraction, object semantics, and new features such as optimized OM, streaming writes, erasure coding, and RocksDB consolidation to improve scalability and performance.

Apache Ozone, HDFS, RocksDB
0 likes · 11 min read
DataFunSummit
Jul 2, 2022 · Big Data

Technical Evolution and Optimization of Kuaishou HDFS

Over the past four years, Kuaishou's data volume grew dozens of times over, creating scalability and storage‑cost challenges. This article details the architectural evolution, performance and cost optimizations, cross‑region expansion, and future plans of Kuaishou's HDFS system.

Big Data, HDFS, Performance
0 likes · 20 min read
DataFunTalk
Jun 5, 2022 · Big Data

JD Big Data Platform: Cross‑Region and Tiered Storage Architecture and Practices

This article presents JD's large‑scale big‑data platform, detailing its overall architecture, the challenges of cross‑region storage, the design of a unified cross‑domain data synchronization mechanism, and the implementation of tiered storage to improve performance, cost efficiency, and data reliability across multi‑datacenter clusters.

Big Data, Data Platform, HDFS
0 likes · 15 min read
ITPUB
May 7, 2022 · Big Data

How eBay Scaled HDFS to 800 PB Using Federation and Router‑Based Architecture

This article details eBay's evolution of its massive HDFS storage—from a single‑cluster design to ViewFS Federation, then to Router‑Based Federation—highlighting the performance bottlenecks, optimization techniques, FastCopy integration, and future plans for further scaling and automation.

Federation, HDFS, Performance Optimization
0 likes · 11 min read
DataFunSummit
May 4, 2022 · Big Data

NetEase Big Data Platform: HDFS Optimization and Practices

NetEase’s senior big‑data engineer shares how the company’s large‑scale data platform leverages Hadoop, HDFS, YARN and related technologies, detailing multi‑layer architecture, cross‑cloud deployment, storage optimizations, NameNode performance enhancements, RPC prioritization, and practical lessons from operating petabyte‑scale clusters.

Cluster Optimization, HDFS, Performance Tuning
0 likes · 23 min read
DataFunTalk
Mar 30, 2022 · Big Data

NetEase Big Data Platform: HDFS Optimization and Practice

This article presents NetEase's big data platform architecture, detailing multi‑layer storage and compute design, HDFS deployment challenges, NameNode and NameSpace performance optimizations, cluster scaling strategies, data tiering, hardware upgrades, and real‑world business use cases, illustrating practical large‑scale big data engineering.

Big Data, Cluster Optimization, Data Management
0 likes · 23 min read
Bilibili Tech
Mar 30, 2022 · Big Data

HDFS Architecture, Optimizations, and Future Plans at Bilibili

Bilibili’s HDFS now runs a three‑tier architecture—access, metadata, and data layers—enhanced with a custom MergeFS router, observer NameNode, dynamic load balancing, fast‑failover pipelines, and storage‑aware policies, while future work targets transparent erasure coding, tiered data routing, lock refinements, and a Hadoop 3.x migration.

Big Data, Distributed File System, HDFS
0 likes · 22 min read
dbaplus Community
Dec 15, 2021 · Big Data

How We Migrated Hundreds of Petabytes of Hadoop Data Without Downtime

This article details the background, challenges, and step‑by‑step solutions for migrating over a hundred petabytes of Hadoop HDFS data across data centers within a month, covering strategy selection, code modifications, balance optimization, and tool enhancements.

Balance Optimization, Big Data Operations, Data Migration
0 likes · 14 min read
HomeTech
Dec 7, 2021 · Big Data

Flink Task Auto-scaling Design and Implementation

This article presents the design and implementation of Flink task auto‑scaling, covering background, manual and automatic scaling mechanisms, architecture with RescaleCoordinator, persistence via ZooKeeper and HDFS, scaling policies for parallelism, CPU and memory, and future plans for fine‑grained and time‑based resource adjustments.

Auto Scaling, Flink, HDFS
0 likes · 4 min read
Big Data Technology & Architecture
Oct 8, 2021 · Big Data

Hadoop HDFS Storage Optimization, Erasure Coding, Heterogeneous Storage, and Cluster Tuning Guide

This article provides a comprehensive guide to optimizing Hadoop HDFS storage through erasure coding and heterogeneous storage policies, explains fault‑tolerance techniques such as safe mode and slow‑disk monitoring, and shares practical MapReduce performance tuning and enterprise‑level configuration examples for large‑scale clusters.

Cluster Tuning, HDFS, Hadoop
0 likes · 32 min read
Big Data Technology & Architecture
Sep 17, 2021 · Big Data

Key Reliability Mechanisms of HDFS, YARN Failover Strategies, and Hadoop Shuffle Process

This article explains HDFS reliability features such as replica policies, rack awareness, heartbeat, safe mode, checksums, trash, metadata protection and snapshots, then details YARN failover handling for ApplicationMaster, NodeManager and ResourceManager, and finally describes the Hadoop MapReduce shuffle workflow and tuning tips.

HDFS, MapReduce, Reliability
0 likes · 13 min read
ITPUB
Sep 16, 2021 · Big Data

Understanding Hadoop: Architecture, HDFS, MapReduce, and Their Pros & Cons

This article explains how Hadoop revolutionized big data by providing a distributed architecture with HDFS for storage and MapReduce for processing, outlines its ecosystem components, describes the inner workings of HDFS and MapReduce, and discusses the strengths and limitations of this approach.

HDFS, Hadoop, MapReduce
0 likes · 7 min read
The Dominant Programmer
Aug 2, 2021 · Big Data

How to Build a Beginner Hadoop Cluster on CentOS 7

This article introduces Apache Hadoop’s open‑source framework, explains its core components such as HDFS, MapReduce, ZooKeeper, HBase, Hive, Pig, Mahout, Sqoop, Flume, Chukwa, Oozie, Ambari and YARN, and outlines the steps to set up a beginner‑level Hadoop cluster on CentOS 7.

Big Data, CentOS 7, HBase
0 likes · 11 min read
DataFunTalk
Jul 8, 2021 · Big Data

Design and Evolution of ByteDance's Multi‑Datacenter HDFS Architecture

This article explains how ByteDance extended the Apache HDFS architecture with a multi‑datacenter design, introducing components such as DanceNN, NNProxy, and BookKeeper to achieve scalable storage, cross‑datacenter data placement, and rack‑level disaster recovery for petabyte‑scale workloads.

ByteDance, HDFS, big data storage
0 likes · 13 min read
Big Data Technology & Architecture
Jun 24, 2021 · Big Data

Comprehensive Overview of HBase Architecture, Design, and Operations

This article provides an in‑depth technical overview of HBase, covering its Bigtable origins, distributed column‑store design, core components such as ZooKeeper, HMaster and RegionServer, data flow, storage formats, row‑key design, bulk loading, SQL integration, indexing, coprocessors, and performance tuning for big‑data environments.

Columnar Database, HBase, HDFS
0 likes · 30 min read
58 Tech
May 28, 2021 · Big Data

Practical Upgrade Experience of Hadoop 3.2.1 in 58.com Data Platform: HDFS, YARN, and MR3

This article details the end‑to‑end upgrade of a 5000‑node Hadoop 2.6.0 cluster to Hadoop 3.2.1 at 58.com, covering HDFS migration, RBF and EC adoption, YARN federation and rolling upgrades, MR3 integration, extensive compatibility testing, and operational lessons learned for large‑scale big‑data platforms.

Big Data, Cluster Upgrade, HDFS
0 likes · 19 min read
Qu Tech
May 6, 2021 · Big Data

How JuiceFS Cut HDFS Load by 26% and Boosted Presto Query Speed by 13%

This case study details how integrating JuiceFS with Presto reduced HDFS cluster load by about 26%, achieved over 90% cache hit rate for ad‑hoc queries, and lowered average query latency by roughly 13%, while simplifying operations and improving system stability.

Big Data, Cache, HDFS
0 likes · 9 min read
Programmer DD
Apr 13, 2021 · Big Data

What Makes HDFS the Backbone of Big Data? Overview, Architecture & Key Features

This article provides a comprehensive overview of HDFS—including its design goals, core components, data read/write workflows, high‑availability mechanisms, federation, storage policies, colocation benefits, and practical usage scenarios—explaining why it is the foundational distributed file system for large‑scale data processing.

Big Data, Federation, HDFS
0 likes · 17 min read
DataFunTalk
Mar 27, 2021 · Big Data

Kuaishou's HDFS Architecture, Scale, Challenges, and Practices

This article presents an in‑depth technical overview of Kuaishou's massive HDFS deployment, detailing its architecture, petabyte‑scale data and thousands‑of‑node clusters, the key scalability challenges faced, and the custom solutions—including FixedOrder, RBF balancer, observer read, slow‑node mitigation, and tiered protection—implemented to keep the system performant and reliable.

Big Data, HDFS, Kuaishou
0 likes · 12 min read
Big Data Technology Architecture
Mar 25, 2021 · Big Data

Implementing Erasure Coding in HDFS: Migration, Testing, and Data Lifecycle Management at JD

This article details JD's end‑to‑end implementation of HDFS erasure coding, covering the migration from replication to EC, the three‑phase upgrade and rollback process, comprehensive automated testing, a custom data‑lifecycle management system for hot‑warm‑cold data, and multi‑layer integrity safeguards to achieve significant storage cost reduction while maintaining reliability.

Data Lifecycle, HDFS, Storage Optimization
0 likes · 17 min read
JD Tech
Mar 20, 2021 · Big Data

Implementing Erasure Coding in HDFS: Migration Strategy, Testing Framework, and Data Lifecycle Management

This article details JD's practical experience migrating HDFS to erasure coding, covering the decision between upgrade and porting, the step‑by‑step upgrade and rollback procedures, automated testing, a custom data‑lifecycle management system for hot‑warm‑cold data, and comprehensive data‑integrity safeguards to achieve significant storage cost reductions while maintaining production reliability.

Cluster Upgrade, Data Lifecycle Management, HDFS
0 likes · 17 min read
dbaplus Community
Mar 17, 2021 · Big Data

How We Cut PBs of Waste and Optimized HDFS with Tiered Storage and Cloud Migration

This article details a three‑part technical talk that covers cost governance for offline Hadoop clusters, a large‑scale data‑center migration with architecture upgrades, and a tiered storage strategy using EC and COS to reduce storage costs and improve performance in a cloud‑native big‑data environment.

Big Data Migration · COS · Cloud Native
0 likes · 10 min read
Big Data Technology Architecture
Mar 2, 2021 · Big Data

Understanding and Managing Small Files in Hadoop HDFS

This article explains what small files are in Hadoop HDFS, how they degrade NameNode memory, RPC performance, and application throughput, and provides practical strategies—including detection, configuration, and merging techniques—to mitigate their impact on storage and processing layers.

HDFS · Hadoop
0 likes · 12 min read
DataFunTalk
Feb 8, 2021 · Big Data

Ozone: The Next‑Generation Distributed Storage System Aiming to Replace HDFS

This article explains how Apache Ozone, built on the HDDS layer, addresses the scalability, memory, and performance limitations of HDFS by splitting metadata services, using RocksDB, implementing fine‑grained locking and Raft‑based HA, and offering rich APIs. It also outlines current challenges and the future roadmap.

Big Data · HDDS · HDFS
0 likes · 29 min read
Didi Tech
Jan 22, 2021 · Big Data

Erasure Coding Practice in HDFS at Didi: Principles, Implementation, and Lessons Learned

Didi migrated HDFS to Hadoop 3.2 and implemented erasure coding—using XOR and Reed‑Solomon RS(6,3) striping—to replace three‑replica storage for cold data, building back‑ported clients, automated conversion tools, and cross‑datacenter backup pipelines, while addressing operational bugs and noting performance trade‑offs.

Big Data · Didi · HDFS
0 likes · 11 min read
Big Data Technology & Architecture
Jan 22, 2021 · Big Data

Key New Features and Improvements in Hadoop 3.x

Hadoop 3.x raises the minimum Java version to JDK 1.8 and introduces enhancements across the common components, HDFS, YARN, and MapReduce, including erasure coding, high availability with more than two NameNodes, cgroup‑based resource isolation, native map‑output collectors, and split client libraries, while also adding support for Azure and Aliyun distributed file systems.

HDFS · Hadoop · MapReduce
0 likes · 7 min read
dbaplus Community
Dec 22, 2020 · Big Data

How eBay Migrated 10 PB of HDFS Data Across Namespaces in Just 2 Hours

This article details how eBay's ADI Hadoop team tackled a massive 10 PB, 10‑million‑file migration by optimizing DistCp with Fastcopy, load‑balancing, ACL handling, and failure recovery, ultimately completing the transfer within a two‑hour window while preserving cluster stability and performance.

Big Data · Distcp · HDFS
0 likes · 16 min read
Tencent Cloud Developer
Dec 2, 2020 · Big Data

WeChat Pay Log System at Scale: Practices with Hermes

WeChat Pay’s Hermes‑based log system ingests trillions of entries daily and stores petabytes across a 200‑node HDFS cluster with four‑nines (99.99%) availability. LSM‑style writes, separate inverted indexes, and hot‑cold tiering cut memory, disk, and cost by up to 70% and keep 95% of queries under five seconds.

HDFS · Hermes · Log Analytics
0 likes · 7 min read
Practical DevOps Architecture
Nov 27, 2020 · Big Data

Step-by-Step Guide to Install and Configure a Hadoop 2.8.2 Cluster

This tutorial provides a complete walkthrough for downloading Hadoop 2.8.2, setting up a three‑node master‑slave cluster, configuring core, HDFS, MapReduce and YARN settings, creating required directories, distributing the installation, starting the services, verifying the cluster status, and finally shutting it down.

Big Data · Cluster Setup · HDFS
0 likes · 5 min read
Big Data Technology & Architecture
Aug 16, 2020 · Big Data

Comprehensive Overview of HDFS: Architecture, Advantages, Limitations, Commands, and Advanced Features

This article provides a detailed introduction to HDFS, covering its application scenarios, core architecture, fault‑tolerance benefits, drawbacks such as high latency and small‑file inefficiency, essential shell and API commands, cluster management procedures, and newer Hadoop 2.0 features like HA, Federation, snapshots, ACLs, and heterogeneous storage.

Big Data · CLI · HA
0 likes · 10 min read