Understanding Filebeat Harvester, Prospector, and Configuration for System Log Collection
This article explains how Filebeat’s harvester and prospector components read and forward system logs and how file offsets are maintained in a registry, then provides a sample YAML configuration for collecting logs from a specified file and shipping them to Elasticsearch, illustrating the key operational concepts for log management.
System log entries can be directed to a designated file (e.g., fifile.txt) for Filebeat to collect, and Filebeat can also parse logs written in JSON format.
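As a sketch of JSON parsing, the `json.*` settings below are standard Filebeat log-input options; the path and the `message` key are illustrative, not from the article:

```yaml
filebeat.inputs:
- type: log
  paths:
    - /var/log/app-json.log      # illustrative path
  json.keys_under_root: true     # lift parsed JSON fields to the event root
  json.add_error_key: true       # add an error field when a line is not valid JSON
  json.message_key: message      # the JSON key that holds the log line itself
```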
The harvester component reads each file line by line, forwards the content to the output, and is responsible for opening and closing the file.
The prospector manages harvesters and discovers all input sources. Filebeat currently supports two prospector types, log and stdin, and each can be defined multiple times. The prospector checks each file to decide whether to start a new harvester, whether an existing harvester is still running, or whether the file can be ignored.
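A minimal sketch of defining both prospector types, and the log type more than once (the paths are illustrative; `ignore_older` is the standard option for skipping files that have not been updated within the given window):

```yaml
filebeat.inputs:
- type: log
  paths:
    - /var/log/messages
    - /var/log/secure
  ignore_older: 24h      # ignore files last modified more than 24 hours ago
- type: log
  paths:
    - /var/log/nginx/*.log
- type: stdin            # the second supported type; reads events from standard input
```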
Filebeat keeps the state of each file, frequently flushing it to a registry file on disk. This state records the last offset read by a harvester, ensuring that all log lines are sent. If the Elasticsearch or Logstash output becomes unreachable, Filebeat continues tracking the last sent offset and resumes reading when the output becomes available again. On restart, Filebeat reads the registry to rebuild state so each harvester resumes from its previous position; a separate state is maintained for each prospector‑discovered file, handling cases where files are deleted or moved.
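The state handling described above can be tuned per input. A hedged sketch using standard log-input options (the path repeats the article's example; the durations are illustrative):

```yaml
filebeat.inputs:
- type: log
  paths:
    - /var/log/fifile.log
  close_removed: true    # close the harvester when the file is deleted
  clean_removed: true    # drop registry state for files that no longer exist on disk
  clean_inactive: 72h    # remove state for files inactive longer than 72h;
                         # must be greater than ignore_older + scan_frequency
```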
```yaml
# cat /etc/filebeat/filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/fifile.log
  include_lines: ['^ERR', '^WARN', '^INFO']

output.elasticsearch:
  hosts: ["192.168.20.182:9200","192.168.20.181:9200","192.168.20.180:9200"]
  index: "system-%{[agent.version]}-%{+yyyy.MM.dd}"

setup.ilm.enabled: false
setup.template.name: "system"
setup.template.pattern: "system-*"
```
The article concludes with a list of recommended readings on ELK stack deployment, Nginx log collection, Zabbix agent deployment, and MySQL troubleshooting.
Practical DevOps Architecture
Hands‑on DevOps operations using Docker, K8s, Jenkins, and Ansible—empowering ops professionals to grow together through sharing, discussion, knowledge consolidation, and continuous improvement.