Operations 17 min read

How to Centralize Logs from Dockerized Services Using Flume and Kafka

This article explains a practical architecture for aggregating logs from distributed Docker containers by employing Flume NG as a lightweight log collector, Kafka as a high‑throughput message bus, and custom sinks to store logs per service, module and day with low latency and minimal resource impact.

21CTO

May 16, 2016

How to Centralize Logs from Dockerized Services Using Flume and Kafka

Design Constraints and Requirements

Application Scenario

In a distributed environment that may host hundreds of servers, each log entry is smaller than 1 KB, never exceeding 50 KB, and the total daily log volume is under 500 GB.

Functional Requirements

Collect all service logs centrally.

Distinguish source and split logs by service, module and day granularity.

Non‑functional Requirements

Non‑intrusive: the collector runs as an independent process and its resource consumption is controllable.

Real‑time: end‑to‑end latency from log generation to centralized storage must be less than 4 seconds.

Persistence: retain logs for the most recent N days.

Loss tolerance: occasional loss is acceptable as long as the loss ratio stays below a defined threshold (e.g., 0.01%).

Ordering: strict ordering is not required.

Availability: the collector is an offline function with a target of three‑nines (99.9%) yearly uptime.

Implementation Architecture

The solution consists of three layers—Producer, Broker, and Consumer—connected through Apache Flume NG and Kafka.

Producer Layer

Each Docker container runs an independent Flume NG agent that tails the application log file (using tail -F) and forwards the log events to a Kafka sink. This design satisfies the non‑intrusive requirement because no code changes are needed inside the application.

FROM ${BASE_IMAGE}
MAINTAINER ${MAINTAINER}
ENV REFRESH_AT ${REFRESH_AT}
RUN mkdir -p /opt/${MODULE_NAME}
ADD ${PACKAGE_NAME} /opt/${MODULE_NAME}/
COPY service.supervisord.conf /etc/supervisord.conf.d/service.supervisord.conf
COPY supervisor-msoa-wrapper.sh /opt/${MODULE_NAME}/supervisor-msoa-wrapper.sh
RUN chmod +x /opt/${MODULE_NAME}/supervisor-msoa-wrapper.sh
RUN chmod +x /opt/${MODULE_NAME}/*.sh
EXPOSE
ENTRYPOINT ["/usr/bin/supervisord", "-c", "/etc/supervisord.conf"]

#!/bin/bash
function shutdown() {
    date
    echo "Shutting down Service"
    unset SERVICE_PID
    cd /opt/${MODULE_NAME}
    source stop.sh
}

cd /opt/${MODULE_NAME}
source stop.sh

echo "Starting Service"
source start.sh
export SERVICE_PID=$!

sleep 4
nohup /opt/apache-flume-1.6.0-bin/bin/flume-ng agent \
    --conf /opt/apache-flume-1.6.0-bin/conf \
    --conf-file /opt/apache-flume-1.6.0-bin/conf/logback-to-kafka.conf \
    --name a1 -Dflume.root.logger=INFO,console &

trap shutdown HUP INT QUIT ABRT KILL ALRM TERM TSTP

echo "Waiting for $SERVICE_PID"
wait $SERVICE_PID

Broker Layer

Multiple Flume NG clients publish logs to a Kafka cluster. Kafka provides high throughput, partitioned storage, and configurable replication (set to 2 in this design) to meet durability and performance goals.

# Create Kafka topic
bin/kafka-topics.sh --create \
    --zookeeper localhost:2181 \
    --replication-factor 2 \
    --partitions 4 \
    --topic keplerlog

Consumer Layer

A second Flume NG agent consumes the Kafka topic, buffers events in an in‑memory channel, and writes them to local files using a custom RollingByTypeAndDayFileSink. The sink extracts a module name from the event header (or from a prefix added by the source) and creates files named module.YYYYMMDD.

# Flume agent configuration for the consumer

a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.zookeeperConnect = localhost:2181
a1.sources.r1.topic = keplerlog
a1.sources.r1.batchSize = 5
a1.sources.r1.groupId = flume-collector
a1.sources.r1.kafka.consumer.timeout.ms = 800

# Custom sink that rolls files by type and day
a1.sinks.k1.type = com.baidu.unbiz.flume.sink.RollingByTypeAndDayFileSink
a1.sinks.k1.channel = c1
a1.sinks.k1.sink.directory = /home/work/data/kepler-log

# In‑memory channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000000
a1.channels.c1.transactionCapacity = 100

# Bindings
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Practical Method

Container Configuration

The Docker image includes Flume binaries and a supervisord configuration to keep both the application and the Flume agent alive.

Flume Configuration Details

The producer uses a custom StaticLinePrefixExecSource that prefixes each log line with serviceName##$$##module so the consumer sink can split the line and route it to the appropriate file.

a1.sources.r1.type = com.baidu.unbiz.flume.sink.StaticLinePrefixExecSource
a1.sources.r1.command = tail -F /opt/MODULE_NAME/log/logback.log
a1.sources.r1.prefix = service1
a1.sources.r1.separator = ##$$##
a1.sources.r1.suffix = m1-ocean-1004.cp

Kafka sink configuration (producer side):

a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = keplerlog
a1.sinks.k1.brokerList = gzns-cm-201508c02n01.gzns:9092,gzns-cm-201508c02n02.gzns:9092
a1.sinks.k1.requiredAcks = 0
a1.sinks.k1.batchSize = 5

Key pitfalls discovered during implementation:

When the exec source wraps tail -F inside a script that adds extra processing (e.g., piping to awk), recent log lines may be dropped because the source discards output that arrives before the script finishes.

Flume’s default KafkaSource does not forward custom headers; the custom source must explicitly add them to the event.

Conclusion

By combining open‑source components such as Flume NG and Kafka, engineers can build a low‑latency, scalable log aggregation pipeline for containerized services. The same pipeline can be extended to downstream analytics (Spark Streaming, ELK stack, etc.) to enable real‑time monitoring, alerting, and data‑driven insights.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Docker Operations kafka Flume distributed logging log-aggregation

Written by

21CTO

21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.