Tagged articles
26 articles
Ops Community
Sep 8, 2025 · Operations

Mastering Distributed Log Architecture: From Flume to ELK and Beyond

This comprehensive guide walks you through the challenges of large‑scale log collection, real‑time processing, storage optimization, and visualization, detailing practical configurations for Flume, Logstash, Elasticsearch, Kibana, Filebeat, Kafka, Kubernetes, and future AIOps integrations to build a reliable, cost‑effective distributed logging system.

ELK · Flume · Kafka
0 likes · 24 min read
Architecture Digest
Oct 11, 2021 · Big Data

Core Technologies and Architecture of a Big Data Platform

This article explains the typical architecture of a big‑data platform, detailing its four core layers—data collection, storage & analysis, data sharing, and application—and describing the key technologies such as Flume, DataX, HDFS, Hive, Spark, Spark Streaming, and task scheduling components.

Big Data · Data Architecture · DataX
0 likes · 8 min read
Programmer DD
Mar 28, 2021 · Big Data

Mastering Apache Flume: Architecture, Components, and Key Features

This article provides a comprehensive overview of Apache Flume, detailing its purpose as a distributed log aggregation system, explaining its core components such as sources, channels, and sinks, and illustrating its architecture, multi‑agent setups, and key features like reliability, scalability, compression, and monitoring.

Flume · data ingestion · log aggregation
0 likes · 9 min read
Architect
Dec 23, 2020 · Operations

Design and Evaluation of Log Collection Agents: Flume vs Filebeat

This article analyses the shortcomings of traditional log‑collection agents, compares Flume and Filebeat against four criteria (low cost, stability, efficiency, and a lightweight footprint), and presents practical solutions for file discovery, offset tracking, multi‑line handling, and performance tuning in modern logging pipelines.

Agent Design · Flume · Observability
0 likes · 13 min read
Big Data Technology & Architecture
Nov 8, 2020 · Big Data

Flume Tuning Guide for High‑Throughput Data Ingestion

This article explains how to identify and resolve performance bottlenecks in Apache Flume by configuring Taildir sources, optimizing channel capacities, tuning Kafka sinks, adjusting JVM options, and using simple monitoring scripts, enabling a single Flume‑NG agent to sustain over 50,000 RPS in production.

Big Data · Configuration · Flume
0 likes · 10 min read
21CTO
Oct 30, 2020 · Big Data

Which Log Collection System Wins? Scribe, Chukwa, Kafka, Flume & ELK Compared

This article reviews the background, requirements, and architectural designs of major open‑source log collection systems—including Facebook’s Scribe, Apache’s Chukwa, LinkedIn’s Kafka, Cloudera’s Flume—and evaluates mature monitoring tools such as ELK, highlighting their features, use cases, advantages, and drawbacks for large‑scale log processing.

Big Data · ELK · Flume
0 likes · 18 min read
Java Architect Essentials
Aug 21, 2020 · Big Data

Design and Integration of Flume, Kafka, Storm, Drools, and Redis for Real‑Time ETL Log Analysis

This article presents a modular architecture for real‑time ETL log analysis that combines Flume for log collection, Kafka as a buffering layer, Storm for stream processing, Drools for rule‑based data transformation, and Redis for fast storage, detailing installation, configuration, and code integration steps.

Big Data · Drools · Flume
0 likes · 23 min read
Youzan Coder
Mar 1, 2019 · Big Data

Flume Practice at Youzan: Data Collection and Pipeline Construction in Big Data Scenarios

Youzan’s experience with Flume shows that the at‑least‑once delivery model, combined with FileChannel storage and custom extensions (an NsqSource, an hourly HdfsEventSink, a metric reporting server, and a timestamp interceptor), can reliably move MySQL binlog data to HDFS. Tuning transaction batch size and channel capacity improves throughput and stability, paving the way for a unified management platform.

At-Least-Once · Flume · HDFS
0 likes · 11 min read
Zhuanzhuan Tech
Feb 26, 2019 · Cloud Native

Automated Business Log Collection in Zhuanzhuan Container Cloud Platform Using Log‑Pilot

This article describes how Zhuanzhuan built an automated, business‑transparent log‑collection solution for its container cloud platform by evaluating several approaches, adopting Alibaba Cloud's open‑source log‑pilot, customizing its deployment, and addressing practical issues such as time‑zone bugs, latency, and duplicate collection.

Cloud Native · Container · Fluentd
0 likes · 13 min read
iQIYI Technical Product Team
Jan 31, 2018 · Big Data

Evolution of iQIYI Real-Time Big Data Collection System

iQIYI’s big‑data collection system has progressed from simple HTTP log uploads to a Flume‑Kafka pipeline and finally to a custom Venus‑Agent architecture with centralized configuration, persistent offsets, dual‑Kafka streams and Flink processing, now handling tens of millions of queries per second and over three hundred billion records daily to power its AI‑driven services.

Big Data · Flink · Flume
0 likes · 15 min read
Tongcheng Travel Technology Center
Mar 24, 2017 · Operations

Evolution of Tongcheng Log System Architecture

The article chronicles the development of Tongcheng's centralized log system from early file‑based logging through a MongoDB‑based solution to the current multi‑layer architecture using Flume, Elasticsearch, and Hadoop, highlighting design decisions, challenges, and future improvement plans.

Big Data · Flume · log system
0 likes · 7 min read
Architecture Digest
May 22, 2016 · Big Data

Design and Architecture of Youzan Unified Log Platform

The article details the design, components, and operational challenges of Youzan's unified log platform, describing its multi‑layer architecture, ingestion via rsyslog/Logstash and Flume‑NG, Kafka‑based log center, processing pipelines with Storm/Spark, and storage in HDFS and Elasticsearch.

Distributed Systems · Flume · Kafka
0 likes · 10 min read
21CTO
May 16, 2016 · Operations

How to Centralize Logs from Dockerized Services Using Flume and Kafka

This article explains a practical architecture for aggregating logs from distributed Docker containers by employing Flume NG as a lightweight log collector, Kafka as a high‑throughput message bus, and custom sinks to store logs per service, module and day with low latency and minimal resource impact.

Docker · Flume · Kafka
0 likes · 17 min read
Architect
Feb 18, 2016 · Cloud Native

Collecting Docker Container Logs with Flume: Strategies and Implementation

This article explains how to capture Docker container logs, discusses the challenges of multi‑line log correlation, and presents two approaches—client‑side parsing and server‑side parsing—along with a concrete Flume customization using a DockerLog Java bean.

Container · Docker · Flume
0 likes · 7 min read
21CTO
Sep 27, 2015 · Big Data

How Weidian Built a Scalable Big Data Platform for Mobile Commerce

This article outlines the design and implementation of Weidian’s end‑to‑end big data processing platform, covering dataset definition, data collection via Flume‑based DataAgent, transmission through Databus, storage options such as HDFS, Kafka and Elasticsearch, and the monitoring and resource‑integration strategies that support massive mobile commerce logs.

Elasticsearch · Flume · Hadoop
0 likes · 18 min read