Tag

flume


Architecture Digest
Oct 11, 2021 · Big Data

Core Technologies and Architecture of a Big Data Platform

This article explains the typical architecture of a big‑data platform, detailing its four core layers—data collection, storage & analysis, data sharing, and application—and describing the key technologies such as Flume, DataX, HDFS, Hive, Spark, Spark Streaming, and task scheduling components.

Data ingestion · DataX · Hadoop
0 likes · 8 min read
Architect
Dec 23, 2020 · Operations

Design and Evaluation of Log Collection Agents: Flume vs Filebeat

This article analyses the shortcomings of traditional log‑collection agents, compares Flume and Filebeat against four criteria (low cost, stability, efficiency, and a lightweight footprint), and presents practical solutions for file discovery, offset tracking, multi‑line handling and performance tuning in modern logging pipelines.

Agent Design · Filebeat · flume
0 likes · 13 min read
Java Architect Essentials
Aug 21, 2020 · Big Data

Design and Integration of Flume, Kafka, Storm, Drools, and Redis for Real‑Time ETL Log Analysis

This article presents a modular architecture for real‑time ETL log analysis that combines Flume for log collection, Kafka as a buffering layer, Storm for stream processing, Drools for rule‑based data transformation, and Redis for fast storage, detailing installation, configuration, and code integration steps.

Kafka · Real-time Processing · Redis
0 likes · 23 min read
Youzan Coder
Mar 1, 2019 · Big Data

Flume Practice at YouZan: Data Collection and Pipeline Construction in Big Data Scenarios

YouZan’s experience with Flume shows how the at‑least‑once delivery model, combined with FileChannel storage and custom extensions such as an NsqSource, hourly‑based HdfsEventSink, metric reporting server, and timestamp interceptor, can reliably move MySQL binlog data to HDFS, while tuning transaction batch size and channel capacity boosts throughput and stability, paving the way for a unified management platform.

HDFS · NSQ · Performance Tuning
0 likes · 11 min read
Zhuanzhuan Tech
Feb 26, 2019 · Cloud Native

Automated Business Log Collection in Zhuanzhuan Container Cloud Platform Using Log‑Pilot

This article describes how Zhuanzhuan built an automated, business‑transparent log‑collection solution for its container cloud platform by evaluating several approaches, adopting Alibaba Cloud's open‑source log‑pilot, customizing its deployment, and addressing practical issues such as time‑zone bugs, latency, and duplicate collection.

Container · Fluentd · cloud-native
0 likes · 13 min read
iQIYI Technical Product Team
Jan 31, 2018 · Big Data

Evolution of iQIYI Real-Time Big Data Collection System

iQIYI’s big‑data collection system has progressed from simple HTTP log uploads to a Flume‑Kafka pipeline and finally to a custom Venus‑Agent architecture with centralized configuration, persistent offsets, dual‑Kafka streams and Flink processing, now handling tens of millions of queries per second and over three hundred billion records daily to power its AI‑driven services.

Flink · Kafka · Venus-Agent
0 likes · 15 min read
Architecture Digest
Sep 7, 2017 · Big Data

Design and Implementation of Bilibili's Lancer Log Collection System

The article presents the architecture, component design, optimizations, and reliability guarantees of Bilibili's Lancer log collection system, a Flume‑based distributed pipeline that handles both real‑time and offline data streams for billions of events daily.

Kafka · big data · data pipeline
0 likes · 13 min read
Tongcheng Travel Technology Center
Mar 24, 2017 · Operations

Evolution of Tongcheng Log System Architecture

The article chronicles the development of Tongcheng's centralized log system from early file‑based logging through a MongoDB‑based solution to the current multi‑layer architecture using Flume, Elasticsearch, and Hadoop, highlighting design decisions, challenges, and future improvement plans.

Elasticsearch · Hadoop · architecture
0 likes · 7 min read
Architecture Digest
Jul 26, 2016 · Big Data

Real-Time Order Analytics System Architecture Using Flume, Kafka, Storm, and Redis

This article introduces a beginner-friendly architecture for real-time order analytics in a big‑data environment, detailing how Flume collects logs, Kafka buffers them, Storm processes streams, and Redis stores results, while also covering configuration, code snippets, deployment steps, and troubleshooting tips.

Kafka · Redis · big data
0 likes · 26 min read
Architecture Digest
May 22, 2016 · Big Data

Design and Architecture of Youzan Unified Log Platform

The article details the design, components, and operational challenges of Youzan's unified log platform, describing its multi‑layer architecture, ingestion methods using rsyslog/logstash and Flume‑NG, Kafka‑based log center, processing pipelines with Storm/Spark, and storage in HDFS and Elasticsearch.

Kafka · big data · distributed systems
0 likes · 10 min read
Architect
May 16, 2016 · Operations

Centralized Log Collection for Distributed Docker Services Using Flume and Kafka

This article presents a practical architecture for centrally collecting dispersed logs from Docker‑based services in a distributed environment by leveraging Flume NG as a non‑intrusive log agent, Kafka as a high‑throughput message bus, and custom sinks to partition logs by service, module, and day.

Docker · Kafka · distributed systems
0 likes · 15 min read
Architect
Apr 28, 2016 · Big Data

Design and Architecture of Youzan Unified Log Platform

The article describes the design, components, and implementation details of Youzan's unified log platform, covering log ingestion via rsyslog, Logstash, and Flume, centralized processing with Kafka, real‑time analysis using Storm/Spark, and storage in HDFS, Elasticsearch, and Hawk, while also discussing challenges and future improvements.

Elasticsearch · HDFS · Kafka
0 likes · 10 min read
Architect
Apr 10, 2016 · Big Data

Introduction to Flume NG: Architecture, Components, Configuration, and Best Practices

This article provides a comprehensive overview of Flume NG, covering its architecture, core components (source, channel, sink), reliability mechanisms, common deployment scenarios, installation steps, configuration examples, compilation instructions, and practical best‑practice recommendations for building robust log‑collection pipelines.

Apache · Data ingestion · big data
0 likes · 16 min read
Architect
Feb 18, 2016 · Cloud Native

Collecting Docker Container Logs with Flume: Strategies and Implementation

This article explains how to capture Docker container logs, discusses the challenges of multi‑line log correlation, and presents two approaches—client‑side parsing and server‑side parsing—along with a concrete Flume customization using a DockerLog Java bean.

Container · Docker · Java
0 likes · 7 min read