How to Build a RocketMQ Monitoring System with Prometheus Exporter
This guide explains the design and implementation of RocketMQ‑Exporter, walks through setting up RocketMQ, compiling and running the exporter, configuring Prometheus to scrape its metrics, defining alert rules, and visualizing data with Grafana for a complete DevOps monitoring solution.
Overview
RocketMQ‑Exporter is an open‑source Prometheus exporter that periodically pulls runtime statistics from a RocketMQ cluster, converts them to the Prometheus exposition format, and serves them on an HTTP /metrics endpoint for scraping.
RocketMQ basics
RocketMQ is a distributed messaging and streaming platform. A cluster consists of one or more NameServers, which provide routing information, one or more broker nodes, and client roles: producers publish messages to topics, while consumers (grouped into consumer groups) pull messages from the brokers. The platform keeps internal statistics such as TPS, message size, offsets, and latency.
Prometheus primer
Prometheus is a time‑series monitoring system that scrapes HTTP endpoints, stores metrics, and offers a powerful query language (PromQL) for analysis and alerting.
Exporter architecture
The exporter is built with Spring Boot and contains three core components:
MQAdminExt – wraps the RocketMQ client API to retrieve broker‑side statistics.
MetricService – transforms raw statistics into Prometheus‑compatible metric samples.
Collect – stores the formatted samples and exposes them via the /metrics HTTP endpoint.
Together these components form a pipeline driven by scheduled tasks: statistics are fetched from the cluster, processed, and made available for Prometheus.
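To make the conversion step concrete, the following minimal Python sketch renders samples in the Prometheus text exposition format. It is not the exporter's own code (the exporter is a Spring Boot application); the function names and the label values are hypothetical and chosen only for illustration.

```python
def escape_label_value(value: str) -> str:
    # The text format requires backslash, double quote, and newline
    # to be escaped inside label values.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name, help_text, metric_type, samples):
    """Render one metric family as exposition text.

    samples is a list of (labels_dict, value) pairs. This helper is a
    hypothetical illustration, not part of RocketMQ-Exporter.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        label_str = ",".join(
            f'{key}="{escape_label_value(val)}"'
            for key, val in sorted(labels.items())
        )
        # Omit the braces entirely when a sample has no labels.
        series = f"{name}{{{label_str}}}" if label_str else name
        lines.append(f"{series} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Label values below are made up for the example.
    print(render_metric(
        "rocketmq_producer_tps",
        "Production rate per topic.",
        "gauge",
        [({"cluster": "DefaultCluster", "topic": "orders"}, 12.5)],
    ))
```

Each sample becomes one line of the form name{labels} value, preceded by HELP and TYPE comment lines, which is the text a Prometheus server reads from the /metrics endpoint.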
Provided metrics
The exporter defines the following metrics (all are gauge or counter types):
rocketmq_broker_tps – messages produced per second per broker.
rocketmq_broker_qps – messages consumed per second per broker.
rocketmq_producer_tps – production rate per topic.
rocketmq_producer_put_size – bytes produced per second per topic.
rocketmq_producer_offset – current producer offset per topic.
rocketmq_consumer_tps – consumption rate per consumer group.
rocketmq_consumer_get_size – bytes consumed per second per consumer group.
rocketmq_consumer_offset – current consumer offset per group.
rocketmq_message_accumulation – backlog calculated as producer_offset - consumer_offset.
rocketmq_group_get_latency_by_storetime – consumer latency (store‑time vs. get‑time).
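The backlog arithmetic behind rocketmq_message_accumulation can be sketched as follows. This is an illustrative Python sketch under the assumption that producer offsets are keyed by topic and consumer offsets by (group, topic); the function name and the sample data are hypothetical, and the exporter's real implementation may differ.

```python
def message_accumulation(producer_offsets, consumer_offsets):
    """Compute backlog as producer_offset - consumer_offset.

    producer_offsets: {topic: offset}
    consumer_offsets: {(group, topic): offset}
    Both shapes are assumptions made for this sketch.
    """
    backlog = {}
    for (group, topic), consumed in consumer_offsets.items():
        produced = producer_offsets.get(topic)
        if produced is None:
            # No producer offset is known for this topic, so no backlog
            # can be derived for it.
            continue
        backlog[(group, topic)] = produced - consumed
    return backlog


if __name__ == "__main__":
    # Made-up offsets for two consumer groups reading one topic.
    producers = {"orders": 1500}
    consumers = {("billing", "orders"): 1200, ("audit", "orders"): 1500}
    for key, value in sorted(message_accumulation(producers, consumers).items()):
        print(key, value)
```

A group that has caught up with the producer shows a backlog of zero, while a growing positive value indicates that consumption is not keeping pace with production.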
Sample alert rules (PromQL)
# High/low producer TPS per cluster
sum(rocketmq_producer_tps) by (cluster) >= 10
sum(rocketmq_producer_tps) by (cluster) < 1
# High/low consumer TPS per cluster
sum(rocketmq_consumer_tps) by (cluster) >= 10
sum(rocketmq_consumer_tps) by (cluster) < 1
# Consumer latency > 1 s
rocketmq_group_get_latency_by_storetime > 1000
# Message backlog exceeds a fixed threshold (example)
rocketmq_message_accumulation > 10000
Usage example
Start RocketMQ – launch a NameServer and at least one Broker (refer to the official RocketMQ quick‑start guide).
Obtain the exporter source – clone the GitHub repository and build the JAR:
git clone https://github.com/apache/rocketmq-exporter
cd rocketmq-exporter
mvn clean install
Run the exporter – the default JAR is rocketmq-exporter-0.0.1-SNAPSHOT.jar, which Maven writes to the target/ directory. Example command:
java -jar target/rocketmq-exporter-0.0.1-SNAPSHOT.jar \
  --rocketmq.config.namesrvAddr="127.0.0.1:9876" \
  --rocketmq.config.webTelemetryPath="/metrics" \
  --server.port=5557
Key configuration options:
rocketmq.config.namesrvAddr (default 127.0.0.1:9876) – address of the RocketMQ NameServer.
rocketmq.config.webTelemetryPath (default /metrics) – HTTP path where metrics are exposed.
server.port (default 5557) – port of the exporter HTTP server.
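Once the exporter is running, a quick check that the endpoint serves RocketMQ metrics can be useful before wiring up Prometheus. The Python sketch below builds the URL from the defaults listed above and extracts metric names from exposition text; the helper names are hypothetical, and the parser is deliberately simplified rather than a full implementation of the format.

```python
from urllib.request import urlopen


def metrics_url(host="127.0.0.1", port=5557, path="/metrics"):
    # Defaults mirror server.port and rocketmq.config.webTelemetryPath above.
    return f"http://{host}:{port}{path}"


def metric_names(exposition_text):
    """Return the set of metric names found in exposition text."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line looks like `name{labels} value` or `name value`.
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def fetch_metric_names(url):
    # Requires a running exporter; this call is not exercised in the demo below.
    with urlopen(url, timeout=5) as response:
        return metric_names(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Made-up sample text standing in for a real scrape.
    sample = (
        "# TYPE rocketmq_producer_tps gauge\n"
        'rocketmq_producer_tps{cluster="DefaultCluster",topic="orders"} 12.5\n'
    )
    print(metrics_url())
    print(sorted(metric_names(sample)))
```

With the exporter up, calling fetch_metric_names(metrics_url()) should return names beginning with rocketmq_ if the NameServer address is reachable and the cluster has traffic.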
Install and configure Prometheus – download Prometheus, extract it, and start with a custom listen address (e.g., :5555) so it does not clash with other services:
tar -xzf prometheus-2.7.0-rc.1.linux-amd64.tar.gz
cd prometheus-2.7.0-rc.1.linux-amd64
./prometheus --config.file=prometheus.yml --web.listen-address=:5555
Add the exporter to prometheus.yml:
scrape_configs:
  - job_name: 'rocketmq-exporter'
    static_configs:
      - targets: ['localhost:5557']
Add alert rules – create a rule file (e.g., warn.rules) and reference it in prometheus.yml under rule_files. Example rule snippet:
groups:
  - name: RocketMQAlerts
    rules:
      - alert: RocketMQClusterProduceHigh
        expr: sum(rocketmq_producer_tps) by (cluster) >= 10
        for: 3m
        labels:
          severity: warning
        annotations:
          summary: "Cluster send TPS too high"
          description: "{{ $labels.cluster }} sending TPS is too high."
      - alert: ConsumerFallingBehind
        expr: |
          (sum(rocketmq_producer_offset) by (topic) - on(topic) group_right sum(rocketmq_consumer_offset) by (group,topic))
          - ignoring(group) group_left sum(avg_over_time(rocketmq_producer_tps[5m])) by (topic) * 5 * 60 > 0
        for: 3m
        labels:
          severity: warning
        annotations:
          summary: "Consumer lagging behind"
          description: "Consumer {{ $labels.group }} on {{ $labels.topic }} is falling behind."
Grafana dashboard (optional)
Grafana can be used to visualise the metrics collected by Prometheus. After installing Grafana, add a Prometheus data source pointing to the Prometheus instance and import the pre‑built RocketMQ dashboard available at:
https://grafana.com/dashboards/10477/revisions
The dashboard provides panels for TPS, backlog, latency and other key indicators.
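Before importing the dashboard, it can help to confirm that Prometheus already returns RocketMQ series. The Python sketch below builds an instant-query URL for the Prometheus HTTP API (/api/v1/query) and reads values out of a vector response. The base address assumes the :5555 listen address used earlier; the helper names and the sample response are hypothetical.

```python
import json
from urllib.parse import urlencode


def instant_query_url(expr, base="http://localhost:5555"):
    # /api/v1/query evaluates a PromQL expression at a single instant.
    return f"{base}/api/v1/query?{urlencode({'query': expr})}"


def vector_values(response_body):
    """Extract (labels, value) pairs from an instant-vector response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error')}")
    # Each result item carries its labels and a [timestamp, "value"] pair.
    return [
        (item["metric"], float(item["value"][1]))
        for item in payload["data"]["result"]
    ]


if __name__ == "__main__":
    print(instant_query_url("sum(rocketmq_producer_tps) by (cluster)"))
    # Made-up response body shaped like a Prometheus instant-vector result.
    body = json.dumps({
        "status": "success",
        "data": {
            "resultType": "vector",
            "result": [
                {"metric": {"cluster": "DefaultCluster"},
                 "value": [1700000000, "12.5"]},
            ],
        },
    })
    print(vector_values(body))
```

An empty result list usually means the exporter target is not being scraped yet or the cluster has produced no data for that metric.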
Alibaba Cloud Native
