Understanding Filebeat Harvester, Prospector, and Configuration for System Log Collection
This article explains how Filebeat’s harvester and prospector components read and forward system logs, and how Filebeat maintains file offsets in a registry. It also provides a sample YAML configuration that collects logs from a specified file and sends them to Elasticsearch.
Filebeat can collect syslog entries that have been written to a designated file (such as the /var/log/fifile.log used in the configuration below) and can also parse logs in JSON format.
The harvester component reads each file line by line, forwards the content to the output, and is responsible for opening and closing the file.
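The harvester's read loop can be sketched in a few lines of Python. This is only an illustration of the idea, not Filebeat's actual implementation (Filebeat is written in Go); the function name `harvest` and its return shape are hypothetical.

```python
def harvest(path, offset=0):
    """Read complete lines from `path`, starting at byte `offset`.

    Returns (lines, new_offset). Hypothetical helper for illustration only.
    """
    lines = []
    with open(path, "rb") as f:              # the harvester opens the file
        f.seek(offset)
        while True:
            raw = f.readline()
            if not raw.endswith(b"\n"):      # EOF, or a line still being written
                break
            lines.append(raw.rstrip(b"\r\n").decode("utf-8", errors="replace"))
            offset += len(raw)               # advance only past complete lines
    return lines, offset                     # the with-block has closed the file
```

Calling `harvest(path, offset)` again with the returned offset picks up only the lines appended since the previous call, which is the behaviour the next sections build on.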
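The prospector (renamed "input" in Filebeat 6.3, which is why the configuration below uses filebeat.inputs) manages harvesters and finds every source to read from. It supports the log and stdin types (later Filebeat releases add more input types), each of which can be defined multiple times. For the log type, the prospector checks each file that matches the configured paths to decide whether to start a harvester, whether one is already running, or whether the file can be ignored.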
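The decision the prospector makes on each scan can be sketched as follows. This is a simplified illustration under stated assumptions: the function `scan` and the `running` set are hypothetical, and the age check is only loosely modelled on Filebeat's `ignore_older` option.

```python
import glob
import os
import time


def scan(patterns, running, ignore_older=None):
    """Return the paths that need a new harvester.

    patterns:     glob patterns, like the `paths` list of an input
    running:      set of paths that already have a harvester
    ignore_older: optional age in seconds; older files are skipped
    """
    to_start = []
    seen = set()
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            if path in seen or not os.path.isfile(path):
                continue
            seen.add(path)
            if path in running:              # keep the existing harvester
                continue
            if ignore_older is not None:
                age = time.time() - os.path.getmtime(path)
                if age > ignore_older:       # file is too old: ignore it
                    continue
            to_start.append(path)
    return to_start
```

Running `scan` periodically and starting a harvester for each returned path gives the discover-then-harvest split described above.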
Filebeat keeps the state of each file, frequently flushing it to a registry file on disk. This state records the last offset read by a harvester, ensuring that all log lines are sent. If the Elasticsearch or Logstash output becomes unreachable, Filebeat continues tracking the last sent offset and resumes reading when the output becomes available again. On restart, Filebeat reads the registry to rebuild state so each harvester resumes from its previous position; a separate state is maintained for each prospector‑discovered file, handling cases where files are deleted or moved.
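A minimal sketch of this registry behaviour, assuming a JSON file keyed by device and inode so that a renamed file keeps its state. The helper names (`load_registry`, `save_registry`, `file_key`, `ship`) and the `send` callback are hypothetical, and Filebeat's real registry format differs.

```python
import json
import os


def load_registry(path):
    """Rebuild state on startup; an absent registry means a fresh start."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def save_registry(path, state):
    """Flush state to disk atomically (write a temp file, then rename)."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def file_key(path):
    """Identify a file by device and inode so a rename keeps its state."""
    st = os.stat(path)
    return f"{st.st_dev}:{st.st_ino}"


def ship(path, registry, send):
    """Send unsent lines; advance the offset only for acknowledged lines."""
    key = file_key(path)
    offset = registry.get(key, {}).get("offset", 0)
    with open(path, "rb") as f:
        f.seek(offset)
        for raw in f:
            if not raw.endswith(b"\n"):      # incomplete line: wait for more
                break
            if not send(raw.decode("utf-8").rstrip("\r\n")):
                break                        # output unreachable: stop here
            offset += len(raw)
    registry[key] = {"source": path, "offset": offset}
    return offset
```

Because the offset moves forward only when `send` reports success, a later call to `ship` after an outage or a restart continues from the first line that was not acknowledged.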
# cat /etc/filebeat/filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/fifile.log
  include_lines: ['^ERR', '^WARN', '^INFO']
output.elasticsearch:
  hosts: ["192.168.20.182:9200","192.168.20.181:9200","192.168.20.180:9200"]
  index: "system-%{[agent.version]}-%{+yyyy.MM.dd}"
setup.ilm.enabled: false
setup.template.name: "system"
setup.template.pattern: "system-*"
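To make two settings in this configuration concrete, the sketch below mimics in Python what `include_lines` and the `index` pattern do. It is an approximation for illustration: Filebeat evaluates these settings itself with its own regular expression engine, and the helper names `keep` and `index_name`, along with the version string used in the usage note, are hypothetical.

```python
import re
from datetime import date

# Same patterns as include_lines in the configuration above.
INCLUDE_LINES = [r"^ERR", r"^WARN", r"^INFO"]


def keep(line, patterns=INCLUDE_LINES):
    """A line is exported if it matches at least one pattern."""
    return any(re.search(p, line) for p in patterns)


def index_name(agent_version, day):
    """Approximate "system-%{[agent.version]}-%{+yyyy.MM.dd}"."""
    return f"system-{agent_version}-{day:%Y.%m.%d}"
```

With these patterns, a line beginning with `ERROR` or `WARNING` is kept because it starts with `ERR` or `WARN`, while a `DEBUG` line is dropped. For an assumed agent version of 7.10.0 on 5 March 2021, `index_name("7.10.0", date(2021, 3, 5))` gives `system-7.10.0-2021.03.05`, which matches the `system-*` template pattern.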
Practical DevOps Architecture
Hands‑on DevOps operations using Docker, K8s, Jenkins, and Ansible—empowering ops professionals to grow together through sharing, discussion, knowledge consolidation, and continuous improvement.
