Building Precise Metrics & Alerts with Micrometer, Prometheus, and Grafana
This guide explains how to implement observability in cloud‑native Spring Boot applications: collecting business metrics with Micrometer, storing them in Prometheus, visualizing them in Grafana, and routing alerts precisely with custom annotations, ServiceMonitors, and Alibaba Cloud ARMS. It includes code examples and deployment steps.
Background
Observability has become essential now that distributed architectures are mainstream. Modern systems need fast troubleshooting, and metrics, tracing, and logging are widely recognized as the three pillars of observability.
Observability Concepts
Peter Bourgon’s 2017 article framed observability in terms of three kinds of telemetry: metrics, tracing, and logging. Cindy Sridharan later called these the three pillars. The term itself originates in control theory (cybernetics); the CNCF helped bring it into IT, where the same idea holds: the more observable a system is, the more controllable it becomes.
Technology Stack
For our Spring Boot services we use Spring Boot Actuator (Micrometer) for metric collection, Prometheus for storage and query, Grafana for visualization, and Alibaba Cloud ARMS for precise alert routing.
Metric Collection with Micrometer
Micrometer integrates with Actuator and supports many meter types, including Counter, Timer, Gauge, DistributionSummary, LongTaskTimer, FunctionCounter, FunctionTimer, and TimeGauge. When exported for Prometheus, metrics are exposed in the Prometheus text exposition format, on which OpenMetrics is based.
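To make the line layout of that text format concrete, here is a minimal sketch of how a counter's samples map to text lines. `ExpositionSketch` and its methods are hypothetical names used only for illustration, not Micrometer APIs; in a real application the Micrometer Prometheus registry does this rendering, and label-value escaping is omitted here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper, not part of Micrometer: renders counter samples as exposition text. */
final class ExpositionSketch {

    /** Renders one sample line in the form name{key="value",...} value. */
    static String sample(String name, Map<String, String> labels, long value) {
        // Assumption: label values contain no characters that need escaping.
        String labelText = labels.entrySet().stream()
                .map(e -> e.getKey() + "=\"" + e.getValue() + "\"")
                .collect(Collectors.joining(","));
        return name + (labelText.isEmpty() ? "" : "{" + labelText + "}") + " " + value;
    }

    /** Renders the HELP and TYPE header lines that precede a counter's samples. */
    static String counterHeader(String name, String help) {
        return "# HELP " + name + " " + help + "\n# TYPE " + name + " counter";
    }

    /** Builds an ordered label set so that output order is stable. */
    static Map<String, String> labels(String method, String code) {
        Map<String, String> result = new LinkedHashMap<>();
        result.put("method", method);
        result.put("code", code);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(counterHeader("http_requests_total", "The total number of HTTP requests."));
        System.out.println(sample("http_requests_total", labels("post", "200"), 1027));
        System.out.println(sample("http_requests_total", labels("post", "400"), 3));
    }
}
```

Running `main` prints a HELP line, a TYPE line, and two `http_requests_total` sample lines, the same shape as the counter portion of the sample output that follows.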
```
# HELP http_requests_total The total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="post",code="200"} 1027
http_requests_total{method="post",code="400"} 3
# A histogram example
# HELP http_request_duration_seconds A histogram of the request duration.
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.05"} 24054
...
```
Custom Annotations
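Micrometer provides the @Timed and @Counted annotations for recording method‑level metrics declaratively. On arbitrary Spring beans they take effect only when Micrometer's TimedAspect and CountedAspect are registered (for example as beans) and Spring AOP is on the classpath.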
```
// @Timed comes from io.micrometer.core.annotation.Timed
@Timed(value = "biz.print", percentiles = {0.95, 0.99}, description = "metrics of print")
public String print(PrintData printData) {
    // method body
}
```
The @Timed annotation records count, sum, max, and configurable percentiles. Example recorded data:
```
biz_print_seconds_count{exception="none"} 4.0
biz_print_seconds_sum{exception="none"} 7.783213927
biz_print_seconds_max{exception="none"} 6.14639717
biz_print_seconds{exception="none",quantile="0.95"} 0.58720256
biz_print_seconds{exception="none",quantile="0.99"} 6.157238272
```
The @Counted annotation records how often a method is called; with recordFailuresOnly = true it counts only the invocations that fail.
```
// @Counted comes from io.micrometer.core.annotation.Counted
@Counted(value = "biz.print", recordFailuresOnly = true, description = "metrics of print")
public String print(PrintData printData) {
    // method body
}
```
Example recorded data:
```
biz_print_failure_total{class="com.xxx.print.service.impl.PrintServiceImpl",exception="NullPointerException",method="print",result="failure"} 4.0
```
Prometheus Integration
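Expose metrics through the Actuator Prometheus endpoint, which defaults to /actuator/prometheus. The ServiceMonitor below scrapes /manage/prometheusMetric instead, which suggests the application has remapped the endpoint (for example with management.endpoints.web.base-path and management.endpoints.web.path-mapping.prometheus); whichever path the application exposes, the ServiceMonitor path must match it. Deploy a Service and a ServiceMonitor (a Prometheus Operator resource) so Prometheus can discover and scrape the metrics.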
```
apiVersion: v1
kind: Service
metadata:
  name: print-svc
  labels:
    monitor/metrics: ""   # label matched by the ServiceMonitor selector
spec:
  ports:
    - name: custom-metrics
      port: 8080
      targetPort: 8080
      protocol: TCP
  selector:
    app: print-test
```
```
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: metrics
  labels:
    app: metric-monitor
spec:
  endpoints:
    - interval: 15s
      port: custom-metrics   # refers to the Service port by name
      path: "/manage/prometheusMetric"
  selector:
    matchLabels:
      monitor/metrics: ""   # selects Services carrying this label
```
Prometheus pulls metrics at the defined interval and stores them in a time‑series database.
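The pull-and-store cycle can be pictured with a small in-memory model. This is only a sketch under stated assumptions: `ScrapeSketch`, `Sample`, and `scrape` are hypothetical names, the endpoint is stubbed as a `Supplier<String>` instead of an HTTP call, and the parser assumes sample lines carry no trailing timestamp. It is not how Prometheus itself is implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical in-memory model of scrape-and-store; not Prometheus code. */
final class ScrapeSketch {

    /** One stored data point: when it was scraped and the value seen. */
    static final class Sample {
        final long timestampMillis;
        final double value;

        Sample(long timestampMillis, double value) {
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    /** Series identity (metric name plus labels) mapped to its samples in scrape order. */
    final Map<String, List<Sample>> store = new LinkedHashMap<>();

    /** Performs one scrape: fetch the text payload, parse it, append each value to its series. */
    void scrape(Supplier<String> endpoint, long nowMillis) {
        for (String line : endpoint.get().split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and HELP/TYPE comments
            }
            // Assumption: no trailing timestamp, so the value follows the last space.
            int split = trimmed.lastIndexOf(' ');
            if (split < 0) {
                continue; // not a sample line
            }
            String series = trimmed.substring(0, split);
            double value = Double.parseDouble(trimmed.substring(split + 1));
            store.computeIfAbsent(series, key -> new ArrayList<>())
                 .add(new Sample(nowMillis, value));
        }
    }
}
```

Calling `scrape` once per interval (15 seconds in the ServiceMonitor above) appends one timestamped sample to each series, which is the essence of a time series: the same name and label set observed repeatedly over time.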
Grafana Dashboards
Configure Grafana to use Prometheus as a data source, then create dashboards that query metrics via PromQL and display them as graphs.
Multiple panels can be combined into a single monitoring dashboard.
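As one example of what a panel query computes: an average-latency panel usually divides the growth of a timer's `_sum` series by the growth of its `_count` series over a window, which in PromQL is commonly written as `rate(biz_print_seconds_sum[5m]) / rate(biz_print_seconds_count[5m])`. The sketch below performs the same arithmetic on two scrapes. The class and method names are hypothetical, the 5m window is an assumption, and counter resets are not handled.

```java
/** Hypothetical helper illustrating the arithmetic behind an average-latency panel. */
final class PanelMathSketch {

    /**
     * Average seconds per call between two scrapes of a timer:
     * growth of the _sum series divided by growth of the _count series.
     * Returns NaN when no new calls were recorded between the scrapes.
     */
    static double averageLatencySeconds(double sumBefore, double countBefore,
                                        double sumAfter, double countAfter) {
        double calls = countAfter - countBefore;
        if (calls <= 0) {
            return Double.NaN; // no new calls, or a counter reset that this sketch ignores
        }
        return (sumAfter - sumBefore) / calls;
    }

    public static void main(String[] args) {
        // Values from the biz_print_seconds example: 4 calls taking 7.783213927 s in total.
        System.out.println(averageLatencySeconds(0.0, 0.0, 7.783213927, 4.0));
    }
}
```

With the `biz_print_seconds` figures shown earlier, this works out to roughly 1.95 seconds per call, noticeably above the 0.95 quantile because a single slow call dominates the sum.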
Precise Alerting with ARMS
To route alerts to the responsible development team, we integrate Alibaba Cloud ARMS Alert Center. ARMS can ingest Prometheus data and send notifications to DingTalk robot webhooks.
Each team’s DingTalk webhook is added as a contact, and notification policies filter alerts by the team label, ensuring alerts reach the correct group.
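The matching logic behind such a policy can be sketched in a few lines. This is a conceptual model only: ARMS applies notification policies through its own configuration rather than application code, and `AlertRoutingSketch`, the fallback webhook, and the example URLs are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of label-based alert routing; not an ARMS API. */
final class AlertRoutingSketch {

    private final Map<String, String> webhookByTeam = new LinkedHashMap<>();
    private final String fallbackWebhook;

    /** The fallback webhook is an assumption: it receives alerts with a missing or unknown team. */
    AlertRoutingSketch(String fallbackWebhook) {
        this.fallbackWebhook = fallbackWebhook;
    }

    /** Registers a team's webhook as a contact. */
    AlertRoutingSketch contact(String team, String webhookUrl) {
        webhookByTeam.put(team, webhookUrl);
        return this;
    }

    /** Picks the webhook for an alert from its "team" label, or the fallback if none matches. */
    String route(Map<String, String> alertLabels) {
        String team = alertLabels.get("team");
        return webhookByTeam.getOrDefault(team, fallbackWebhook);
    }
}
```

An alert labelled `team="print"` would go to the webhook registered for `print`, while an alert without a `team` label falls through to the fallback. For this to work end to end, the `team` label has to be present on the alert, for instance attached to the metrics or to the alert rule.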
Conclusion
Combining Micrometer, Prometheus, Grafana, and ARMS gives cloud‑native services real‑time observability and targeted alerts, reduces noise, and speeds up incident diagnosis and resolution.
Alibaba Cloud Native