Snowball Engineer Team
Mar 23, 2018 · Big Data
Redesigning Snowball's Log Collection Architecture During Hadoop Cluster Expansion
The article details Snowball's challenges with a saturated CDH Hadoop cluster, outlines the limitations of the original Kafka‑based log pipeline, and explains how a comprehensive redesign using FlumeNG, Spillable Memory Channels, and custom HDFS sinks resolves latency, data loss, and high‑load issues while supporting future growth.
Cluster Migration · Data Pipeline · FlumeNG