Quickly Analyze Hadoop NameNode RPC with ELK and Grafana
This guide shows how to reduce excessive NameNode RPC calls caused by frequent HDFS directory listings and demonstrates a complete ELK pipeline—Filebeat, Kafka/Logstash, Elasticsearch, and Kibana—plus Grafana dashboards for real‑time monitoring of Hadoop RPC operations.
Frequent HDFS directory-listing requests from one business workload caused a surge in NameNode RPC calls. After the code was changed to fetch the directory list only once every five minutes, the RPC rate fell by a factor of about 1.5, a saving of 20,000 to 30,000 RPCs per second.
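The article does not show the optimized code, so the following is only a minimal sketch of the idea: cache each directory listing and refresh it at most once every five minutes. The class name `CachedDirectoryLister` and the injected `list_dir` callable are hypothetical; in a real job `list_dir` would wrap whatever HDFS client call performs the listing.

```python
import time
from typing import Callable, Dict, List, Tuple


class CachedDirectoryLister:
    """Serve directory listings from a cache so repeated lookups skip the NameNode."""

    def __init__(
        self,
        list_dir: Callable[[str], List[str]],  # hypothetical: wraps the real HDFS listing call
        ttl_seconds: float = 300.0,            # five minutes, as described in the article
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._list_dir = list_dir
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, List[str]]] = {}

    def list(self, path: str) -> List[str]:
        now = self._clock()
        entry = self._cache.get(path)
        # Reuse the cached listing while it is younger than the TTL.
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        # Otherwise issue one real listing call and remember when it happened.
        listing = self._list_dir(path)
        self._cache[path] = (now, listing)
        return listing
```

With this shape, any number of callers asking for the same path within five minutes trigger a single listing RPC. The trade-off is that new files may go unnoticed for up to the TTL.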
Why Use ELK for NameNode RPC Monitoring
ELK (Filebeat → Kafka/Logstash → Elasticsearch → Kibana) is a popular distributed log‑collection stack. Filebeat acts as the lightweight shipper, Logstash processes and formats logs, Elasticsearch stores them, and Kibana visualizes the data. Kafka is preferred over Redis for high‑throughput, durable queuing.
Architecture Overview
Step 1 – Filebeat Configuration
filebeat.prospectors:
  - input_type: log
    paths:
      - "/var/log/hadoop-hdfs/hdfs-audit.log"
    harvester_buffer_size: 32768
    scan_frequency: 1s
    backoff: 10ms
processors:
  - drop_fields:
      fields: ["beat","beat.name","beat.hostname","beat.version","input_type","offset","@timestamp","type","source"]
output.logstash:
  hosts: ["logstash-host:5044", "logstash-host:5045"]
  loadbalance: true
  worker: 4
  bulk_max_size: 4096
xpack.monitoring:
  enabled: true
  elasticsearch:
    hosts: ["https://es-host1:9200", "https://es-host2:9200"]
    username: beats_system
    password: beat@123
Step 2 – Logstash Pipeline (Dissect & Format)
input {
  beats { port => 5045 }
}
filter {
  if "/user/if_ia_pro/output/test" in [message] {
    dissect {
      mapping => { "message" => "%{logd} %{drop} %{level} %{log-type}: %{?allowed}=%{&allowed} %{?ugi}=%{&ugi} (%{?authtype}) %{?ip}=/%{&ip} %{?cmd}=%{&cmd} %{}=/user/if_ia_pro/output/test/%{src2}/%{src3}/%{} %{?dst}=%{&dst} %{?perm}=%{&perm} %{?proto}=%{&proto}" }
      # add_field and remove_field are options of a filter plugin, so they belong inside the dissect block
      add_field => {
        "srctable" => "/user/if_ia_pro/output/test/%{src2}/%{src3}"
        "logdate" => "%{logd} %{drop}"
      }
      remove_field => ["message","src2","src3","logd","drop"]
    }
  }
  # additional branches for other path patterns omitted for brevity
  # parse the audit timestamp into the "@times" field and drop the temporary string
  date { match => ["logdate","ISO8601"] target => "@times" remove_field => ["logdate"] }
}
output {
  elasticsearch {
    hosts => ["es-host:9200"]
    index => "logstash-hdfs-audit-%{+YYYY.MM.dd}"
    user => "elastic"
    password => "password"
  }
  # also print each event to the console, which is useful while debugging the mapping
  stdout { }
}
Step 3 – Verify Data in Elasticsearch
After starting Filebeat and Logstash, an index named logstash-hdfs-audit-YYYY.MM.dd appears in Elasticsearch.
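To sanity-check what the indexed documents should contain, the dissect mapping from Step 2 can be approximated offline. The sketch below is a rough Python equivalent, not the Logstash implementation. The sample line is hypothetical and only follows the usual HDFS audit layout (`allowed=... ugi=... ip=... cmd=... src=... dst=... perm=... proto=...`), and the regex accepts any whitespace between fields as an assumption.

```python
import re
from typing import Dict, Optional

PREFIX = "/user/if_ia_pro/output/test"

# Hypothetical audit line, for illustration only.
SAMPLE_LINE = (
    "2019-03-01 10:15:02,123 INFO FSNamesystem.audit: allowed=true "
    "ugi=etl_user (auth:SIMPLE) ip=/10.0.0.12 cmd=listStatus "
    "src=/user/if_ia_pro/output/test/tbl_a/20190301/part-00000 "
    "dst=null perm=null proto=rpc"
)

AUDIT_RE = re.compile(
    r"^(?P<logd>\S+)\s+(?P<time>\S+)\s+(?P<level>\S+)\s+(?P<log_type>\S+):\s+"
    r"allowed=(?P<allowed>\S+)\s+ugi=(?P<ugi>\S+)\s+\((?P<authtype>[^)]*)\)\s+"
    r"ip=/(?P<ip>\S+)\s+cmd=(?P<cmd>\S+)\s+src=(?P<src>\S+)\s+"
    r"dst=(?P<dst>\S+)\s+perm=(?P<perm>\S+)\s+proto=(?P<proto>\S+)\s*$"
)


def parse_audit_line(line: str, prefix: str = PREFIX) -> Optional[Dict[str, str]]:
    """Return the fields the pipeline is expected to index, or None if the line does not match."""
    match = AUDIT_RE.match(line)
    if match is None:
        return None
    src = match.group("src")
    if not src.startswith(prefix + "/"):
        return None
    parts = src[len(prefix) + 1:].split("/")
    # The mapping expects <prefix>/<src2>/<src3>/<anything>, so require three components.
    if len(parts) < 3:
        return None
    return {
        "logdate": f"{match.group('logd')} {match.group('time')}",
        "level": match.group("level"),
        "allowed": match.group("allowed"),
        "ugi": match.group("ugi"),
        "ip": match.group("ip"),
        "cmd": match.group("cmd"),
        "srctable": f"{prefix}/{parts[0]}/{parts[1]}",
        "dst": match.group("dst"),
        "perm": match.group("perm"),
        "proto": match.group("proto"),
    }
```

Running `parse_audit_line` over a few real lines from hdfs-audit.log and comparing the result with a document fetched from the index is a quick way to confirm that the mapping matches the log format actually in use.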
Step 4 – Grafana Dashboard Configuration
Connect Grafana to the Elasticsearch data source, then create dashboards to visualize:
Overall RPC connections per minute across the cluster.
Per‑minute operation counts for each HDFS path type.
Top HDFS paths with the highest “All” operation counts.
Ranking of operation types and paths within the “All” category.
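The panels listed above reduce to two kinds of aggregation: counts bucketed by minute and operation, and a ranking of paths by count. Grafana delegates these to Elasticsearch, so the Python sketch below is only an offline approximation for checking numbers against a dashboard. It assumes events shaped like the Step 2 output, with `logdate`, `cmd`, and `srctable` fields, and the helper names are hypothetical.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable, List, Tuple

Event = Dict[str, str]


def minute_bucket(logdate: str) -> str:
    """Truncate an audit timestamp such as '2019-03-01 10:15:02,123' to the minute."""
    parsed = datetime.strptime(logdate, "%Y-%m-%d %H:%M:%S,%f")
    return parsed.strftime("%Y-%m-%d %H:%M")


def ops_per_minute(events: Iterable[Event]) -> Counter:
    """Count operations per (minute, cmd) pair, like a per-minute operation panel."""
    return Counter((minute_bucket(e["logdate"]), e["cmd"]) for e in events)


def top_paths(events: Iterable[Event], n: int = 10) -> List[Tuple[str, int]]:
    """Rank paths by total operation count, like a top-N table panel."""
    return Counter(e["srctable"] for e in events).most_common(n)
```

Comparing these counts for a short time window with the corresponding Grafana panel helps catch mapping or time-zone mistakes before the dashboard is relied on.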
Conclusion
Monitoring NameNode RPC traffic is essential for any large‑scale Hadoop deployment, whether on‑premises or in the cloud. By leveraging the ELK stack and Grafana, engineers can quickly collect, parse, and visualize audit logs, enabling timely identification and remediation of inefficient or overly frequent operations.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
