Step-by-Step Guide to Installing and Configuring Apache Flume on a Cluster
This guide walks through downloading Apache Flume, setting up a master‑slave cluster, and configuring NetCat, Exec, and Avro sources with corresponding sinks and memory channels, including verification commands to ensure the agents run correctly.
1. Software download
wget http://mirror.bit.edu.cn/apache/flume/1.6.0/apache-flume-1.6.0-bin.tar.gz
tar zxvf apache-flume-1.6.0-bin.tar.gz
cd apache-flume-1.6.0-bin
2. Cluster environment
Master: 172.16.11.97
Slave1: 172.16.11.98
Slave2: 172.16.11.99
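Later commands refer to the first node by the host name master (telnet master 44444, hdfs://master:9000), so every node has to resolve that name. The following is a minimal sketch, assuming names are resolved through /etc/hosts. The lowercase names slave1 and slave2 are an assumption based on the labels above, and the function name is hypothetical.

```shell
#!/bin/sh
# print_cluster_hosts: hypothetical helper that prints /etc/hosts entries
# for the three nodes listed above.
# "master" is the name used by later commands; slave1/slave2 are assumed names.
print_cluster_hosts() {
    cat <<'EOF'
172.16.11.97 master
172.16.11.98 slave1
172.16.11.99 slave2
EOF
}
```

On each node the output could be appended with: print_cluster_hosts | sudo tee -a /etc/hosts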
3. NetCat source configuration (conf/flume-netcat.conf)
vim conf/flume-netcat.conf

# Name the components on this agent
agent.sources = r1
agent.sinks = k1
agent.channels = c1

# Source configuration
agent.sources.r1.type = netcat
agent.sources.r1.bind = 127.0.0.1
agent.sources.r1.port = 44444

# Sink configuration
agent.sinks.k1.type = logger

# Channel configuration
agent.channels.c1.type = memory
agent.channels.c1.capacity = 1000
agent.channels.c1.transactionCapacity = 100

# Bind source and sink to the channel
agent.sources.r1.channels = c1
agent.sinks.k1.channel = c1
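The last two lines of the configuration are what connect the pieces: each source lists its channels and each sink names one channel. The sketch below is one way to check that wiring before starting an agent. It assumes the file uses the plain key = value layout shown above. Both function names are hypothetical and are not part of Flume.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Flume.

# declared_names FILE AGENT KIND: print the names on the "AGENT.KIND = ..." line,
# for example "r1" for "agent.sources = r1".
declared_names() {
    sed -n "s/^[[:space:]]*$2\.$3[[:space:]]*=[[:space:]]*//p" "$1"
}

# check_wiring FILE [AGENT]: succeed only if every declared source has a
# ".channels" key and every declared sink has a ".channel" key.
check_wiring() {
    conf_file=$1
    agent_name=${2:-agent}
    status=0

    for src in $(declared_names "$conf_file" "$agent_name" sources); do
        grep -Eq "^[[:space:]]*$agent_name\.sources\.$src\.channels[[:space:]]*=" "$conf_file" ||
            { echo "source $src has no channels entry"; status=1; }
    done

    for sink in $(declared_names "$conf_file" "$agent_name" sinks); do
        grep -Eq "^[[:space:]]*$agent_name\.sinks\.$sink\.channel[[:space:]]*=" "$conf_file" ||
            { echo "sink $sink has no channel entry"; status=1; }
    done

    return "$status"
}
```

For the file above this would be called as: check_wiring conf/flume-netcat.conf agent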
Verification:
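Start the agent:
bin/flume-ng agent --conf conf --conf-file conf/flume-netcat.conf --name=agent -Dflume.root.logger=INFO,console
Then, from a second terminal, connect to the source and type a line; it should appear in the agent's console log:
telnet master 44444
Note: the source binds to 127.0.0.1, so this only connects if master resolves to the loopback address on the agent's host. Otherwise use telnet 127.0.0.1 44444, or set agent.sources.r1.bind = 0.0.0.0.
4. Exec source configuration (conf/flume-exec.conf)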
vim conf/flume-exec.conf

# Name the components on this agent
agent.sources = r1
agent.sinks = k1
agent.channels = c1

# Source configuration
agent.sources.r1.type = exec
agent.sources.r1.command = tail -f /data/hadoop/flume/test.txt

# Sink configuration
agent.sinks.k1.type = logger

# Channel configuration
agent.channels.c1.type = memory
agent.channels.c1.capacity = 1000
agent.channels.c1.transactionCapacity = 100

# Bind source and sink to the channel
agent.sources.r1.channels = c1
agent.sinks.k1.channel = c1
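The exec source above runs tail on /data/hadoop/flume/test.txt, so the agent only produces events while something appends to that file. The following sketch is a bounded writer for testing. The function name, the default count, and the default interval are illustrative assumptions.

```shell
#!/bin/sh
# append_test_lines FILE [COUNT] [INTERVAL]: hypothetical helper.
# Appends COUNT timestamped lines to FILE, pausing INTERVAL seconds between
# writes, and then stops instead of looping forever.
append_test_lines() {
    target=$1
    count=${2:-10}
    interval=${3:-1}
    i=1
    while [ "$i" -le "$count" ]; do
        printf '%s line %d\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$i" >> "$target"
        i=$((i + 1))
        # Pause between writes, but not after the final one.
        if [ "$i" -le "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

With the agent running, append_test_lines /data/hadoop/flume/test.txt 5 1 would write five lines, one per second. If the tailed file may be rotated or recreated, tail -F follows it by name rather than by open file handle.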
Verification:
Start the agent:
bin/flume-ng agent --conf conf --conf-file conf/flume-exec.conf --name=agent -Dflume.root.logger=INFO,console
In a second terminal, append a timestamp to the tailed file once per second:
while true; do echo `date` >> /data/hadoop/flume/test.txt ; sleep 1; done
5. Avro source configuration (conf/flume-avro.conf)
vim conf/flume-avro.conf

# Define a memory channel
agent.channels.c1.type = memory

# Define Avro source
agent.sources.r1.type = avro
agent.sources.r1.bind = 127.0.0.1
agent.sources.r1.port = 44444
agent.sources.r1.channels = c1

# Define HDFS sink
agent.sinks.k1.type = hdfs
agent.sinks.k1.channel = c1
agent.sinks.k1.hdfs.path = hdfs://master:9000/flume_data_pool
agent.sinks.k1.hdfs.filePrefix = events-
agent.sinks.k1.hdfs.fileType = DataStream
agent.sinks.k1.hdfs.writeFormat = Text
agent.sinks.k1.hdfs.rollSize = 0
agent.sinks.k1.hdfs.rollCount = 600000
agent.sinks.k1.hdfs.rollInterval = 600

# Bind components
agent.sources = r1
agent.sinks = k1
agent.channels = c1
Verification:
Start the agent:
bin/flume-ng agent --conf conf --conf-file conf/flume-avro.conf --name=agent -Dflume.root.logger=DEBUG,console
Then, from a second terminal, check that the source is listening:
telnet master 44444
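The telnet check only confirms that the port accepts connections. An Avro source expects Avro RPC, so text typed into telnet is not delivered as events. Flume's avro-client mode can send the contents of a file to the source instead. The wrapper below is a sketch under assumptions: it is run from the Flume installation directory on the same host as the agent (the source binds to 127.0.0.1), and the function name and the FLUME_NG override are hypothetical conveniences.

```shell
#!/bin/sh
# send_file_via_avro FILE [HOST] [PORT]: hypothetical wrapper around Flume's
# avro-client mode, which sends the contents of FILE to an Avro source.
# FLUME_NG may point at another flume-ng launcher; the default assumes the
# current directory is the Flume installation directory.
send_file_via_avro() {
    input_file=$1
    host=${2:-127.0.0.1}   # matches agent.sources.r1.bind
    port=${3:-44444}       # matches agent.sources.r1.port
    "${FLUME_NG:-bin/flume-ng}" avro-client --conf conf \
        -H "$host" -p "$port" -F "$input_file"
}
```

For example, after writing a few lines to a scratch file such as /tmp/flume-avro-test.txt (a made-up path), send_file_via_avro /tmp/flume-avro-test.txt would send them to the agent. The result should then be visible with hdfs dfs -ls hdfs://master:9000/flume_data_pool, possibly as a file with a .tmp suffix until the sink rolls it.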
