Building Full Observability for Spring Cloud Microservices with Micrometer, Prometheus, and Grafana

Following the earlier installment on distributed transactions with Seata, this tutorial adds observability to the Spring Cloud microservices by integrating Micrometer, Prometheus, and Grafana. It covers the observability pillars, metric types, configuration, custom business metrics, dashboard setup, alert rules, validation steps, and common pitfalls.

Coder Trainee

Goal

Complete the observability stack for the Spring Cloud demo project using Micrometer, Prometheus, and Grafana.

Why Observability?

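Observability rests on three pillars. This part covers the first one; the tools in parentheses are the ones named for each pillar:

Metrics – system runtime data (Micrometer + Prometheus)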

Logging – event records (ELK / Loki)

Tracing – request flow (SkyWalking / Jaeger)

Metrics Types

Counter : ever‑increasing count (e.g., total requests, total errors)

Gauge : value that can go up or down (e.g., active connections, memory usage)

Timer : duration and count of short-lived events (e.g., request latency)

DistributionSummary : size distribution (e.g., request payload)
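Because a Counter only ever increases, its raw value is rarely useful on its own; dashboards and alerts usually look at how fast it grows. The following is a minimal JDK-only sketch of that idea, not Prometheus's actual rate() algorithm (which also extrapolates to the window boundaries). The class name CounterRate and its method are hypothetical, introduced only for illustration.

```java
/** Simplified per-second rate between two counter samples (illustration only). */
final class CounterRate {

    /**
     * @param previous       counter value at the earlier scrape
     * @param current        counter value at the later scrape
     * @param elapsedSeconds time between the two scrapes, must be positive
     * @return average increase per second over the interval
     */
    static double perSecond(double previous, double current, double elapsedSeconds) {
        if (elapsedSeconds <= 0) {
            throw new IllegalArgumentException("elapsedSeconds must be positive");
        }
        // A counter never decreases while the process is alive, so a drop means
        // the service restarted and the counter began again from zero.
        double increase = current >= previous ? current - previous : current;
        return increase / elapsedSeconds;
    }

    public static void main(String[] args) {
        // 50 new requests observed over a 10 s interval
        System.out.println(perSecond(100, 150, 10));
        // Restart between scrapes: only the 20 post-restart requests are counted
        System.out.println(perSecond(100, 20, 10));
    }
}
```

A Gauge needs no such treatment: its sampled value (for example, active connections) is already the quantity of interest.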

Environment Preparation

Add Prometheus and Grafana services to docker‑compose.yml:

# docker-compose.yml addition (place both entries under the existing services: key)
prometheus:
  image: prom/prometheus:latest
  container_name: prometheus-teaching
  ports:
    - "9090:9090"
  volumes:
    - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
  networks:
    - teaching-network

grafana:
  image: grafana/grafana:latest
  container_name: grafana-teaching
  ports:
    - "3000:3000"
  environment:
    - GF_SECURITY_ADMIN_PASSWORD=admin
    - GF_INSTALL_PLUGINS=grafana-piechart-panel
  volumes:
    # Named volume: grafana-data must also be declared under the top-level volumes: key
    - grafana-data:/var/lib/grafana
  networks:
    - teaching-network

The Prometheus scrape configuration (prometheus.yml) defines three jobs, one each for order-service, stock-service, and point-service, which expose /actuator/prometheus on ports 8081 to 8083.
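The scrape configuration described above might look roughly like the sketch below. The target hostnames and the assignment of ports to services are assumptions: they presume each service is reachable on teaching-network under its own name. If the services run directly on the host instead, the targets would need a host address that the Prometheus container can reach.

```yaml
# prometheus/prometheus.yml (sketch; hostnames and port assignment are assumptions)
global:
  scrape_interval: 15s          # how often each target is scraped

scrape_configs:
  - job_name: order-service
    metrics_path: /actuator/prometheus
    static_configs:
      - targets: ["order-service:8081"]   # assumed host:port

  - job_name: stock-service
    metrics_path: /actuator/prometheus
    static_configs:
      - targets: ["stock-service:8082"]   # assumed host:port

  - job_name: point-service
    metrics_path: /actuator/prometheus
    static_configs:
      - targets: ["point-service:8083"]   # assumed host:port
```

The job_name values become the job label on every scraped series, which is what the alert rules later in this article filter and group on.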

Service Integration

Add the following Maven dependencies to each service:

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

<dependency>
  <groupId>io.micrometer</groupId>
  <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

Enable the Prometheus endpoint in application.yml:

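management:
  endpoints:
    web:
      exposure:
        include: health,info,prometheus,metrics
  metrics:
    tags:
      # Common tag on every metric; queries that filter on application="..." rely on it
      application: ${spring.application.name}
    export:
      prometheus:
        # Spring Boot 2.x key; Boot 3.x uses management.prometheus.metrics.export.enabled
        enabled: true
  info:
    env:
      enabled: true   # needed on Spring Boot 2.6+ for info.* properties to be shown
# Static info properties are read from the top-level info.* prefix
info:
  app:
    name: ${spring.application.name}
    version: 1.0.0
    description: Spring Cloud teaching project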

Verify the endpoint with curl http://localhost:8081/actuator/prometheus. Once the service has handled at least one HTTP request, the output should include lines such as:

# HELP http_server_requests_seconds Duration of HTTP server requests
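To make the endpoint's output less opaque, here is a minimal JDK-only sketch of how one sample line of the Prometheus text format breaks down into a metric name and a value. It is an illustration under simplifying assumptions, not a complete parser: labels are skipped rather than decoded, and the class name PrometheusLine is hypothetical.

```java
import java.util.Optional;

/** Minimal reader for single sample lines of the Prometheus text format. */
final class PrometheusLine {
    final String name;
    final double value;

    private PrometheusLine(String name, double value) {
        this.name = name;
        this.value = value;
    }

    /** Returns the metric name and value, or empty for comments and blank lines. */
    static Optional<PrometheusLine> parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return Optional.empty(); // "# HELP" and "# TYPE" lines carry metadata only
        }
        int labelsStart = trimmed.indexOf('{');
        // Label values may contain '}' (e.g. uri="/api/order/{id}"), so search from the end.
        int labelsEnd = trimmed.lastIndexOf('}');
        String name;
        String rest;
        if (labelsStart >= 0 && labelsEnd > labelsStart) {
            name = trimmed.substring(0, labelsStart);
            rest = trimmed.substring(labelsEnd + 1).trim();
        } else {
            int space = trimmed.indexOf(' ');
            if (space < 0) {
                return Optional.empty(); // no value present
            }
            name = trimmed.substring(0, space);
            rest = trimmed.substring(space + 1).trim();
        }
        // An optional timestamp may follow the value; keep only the first token.
        String token = rest.split("\\s+")[0];
        return Optional.of(new PrometheusLine(name, toDouble(token)));
    }

    private static double toDouble(String token) {
        switch (token) {
            case "+Inf": return Double.POSITIVE_INFINITY;
            case "-Inf": return Double.NEGATIVE_INFINITY;
            default:     return Double.parseDouble(token); // also accepts "NaN"
        }
    }

    public static void main(String[] args) {
        String sample = "http_server_requests_seconds_count{method=\"GET\",status=\"200\"} 42.0";
        parse(sample).ifPresent(p -> System.out.println(p.name + " = " + p.value));
    }
}
```

For a Timer such as http_server_requests_seconds, the endpoint exposes several series with the suffixes _count, _sum, and _max, so a single Micrometer meter shows up as more than one line.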

Core Metrics Details

http_server_requests_seconds : HTTP request latency (Timer)

jvm_memory_used_bytes : JVM memory usage (Gauge)

jvm_gc_pause_seconds : GC pause time (Timer)

system_cpu_usage : CPU usage (Gauge)

process_uptime_seconds : process uptime (Gauge)
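One detail about http_server_requests_seconds matters for the SlowResponse alert rule later in this article: by default a Micrometer Timer is exported with _count, _sum, and _max series only, while histogram_quantile() needs the _bucket series. A sketch of the application.yml addition that turns on histogram buckets for HTTP server requests, assuming the standard Spring Boot Actuator property is available in the Boot version used by the project:

```yaml
# application.yml (addition; publishes http_server_requests_seconds_bucket series)
management:
  metrics:
    distribution:
      percentiles-histogram:
        http.server.requests: true
```

Histogram buckets add one series per bucket and label combination, so it is reasonable to enable them only for the meters that are actually queried with histogram_quantile().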

Custom Business Metrics

Define a component that registers counters, a timer, and a gauge for order processing:

package com.teaching.order.metrics;

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.springframework.stereotype.Component;
import java.util.concurrent.atomic.AtomicLong;

@Component
public class OrderMetrics {
    private final Counter orderCreateCounter;
    private final Counter orderSuccessCounter;
    private final Counter orderFailureCounter;
    private final Timer orderCreateTimer;
    // Backing value for the gauge; kept in a field because Micrometer only
    // holds a weak reference to the object a gauge observes.
    private final AtomicLong pendingOrders;

    public OrderMetrics(MeterRegistry registry) {
        // Exported to Prometheus as order_create_total
        this.orderCreateCounter = Counter.builder("order.create.total")
            .description("Total number of order creation attempts")
            .register(registry);
        this.orderSuccessCounter = Counter.builder("order.create.success")
            .description("Number of successfully created orders")
            .register(registry);
        this.orderFailureCounter = Counter.builder("order.create.failure")
            .description("Number of failed order creations")
            .register(registry);
        // Percentiles are computed inside each instance and cannot be
        // aggregated across instances in Prometheus.
        this.orderCreateTimer = Timer.builder("order.create.duration")
            .description("Time taken to create an order")
            .publishPercentiles(0.5, 0.95, 0.99)
            .register(registry);
        // gauge(...) registers the gauge and returns the AtomicLong it was given
        this.pendingOrders = registry.gauge("order.pending.count", new AtomicLong(0));
    }

    public void recordCreate() { orderCreateCounter.increment(); }
    public void recordSuccess() { orderSuccessCounter.increment(); }
    public void recordFailure() { orderFailureCounter.increment(); }
    // Times the callable and returns its result; the duration is recorded
    // even when the callable throws.
    public <T> T recordTimer(java.util.concurrent.Callable<T> callable) throws Exception {
        return orderCreateTimer.recordCallable(callable);
    }
    public void setPendingOrders(long count) { pendingOrders.set(count); }
}

Using Metrics in Business Code

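import com.teaching.order.metrics.OrderMetrics;
import io.seata.spring.annotation.GlobalTransactional;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;

@Service
@RequiredArgsConstructor
@Slf4j
public class OrderService {
    private final OrderMetrics orderMetrics;

    // recordTimer declares a checked Exception, and the catch block rethrows it,
    // so createOrder has to declare it as well in order to compile.
    @GlobalTransactional(name = "create-order", rollbackFor = Exception.class)
    public void createOrder(OrderCreateDTO request) throws Exception {
        orderMetrics.recordCreate(); // count every attempt, successful or not
        try {
            orderMetrics.recordTimer(() -> {
                // business logic
                doCreateOrder(request);
                return null;
            });
            orderMetrics.recordSuccess();
        } catch (Exception e) {
            orderMetrics.recordFailure();
            log.error("Order creation failed", e);
            throw e; // rethrow so the global transaction is rolled back
        }
    }
}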

Grafana Dashboard Configuration

Open http://localhost:3000 and log in with admin/admin.

Navigate to Configuration → Data Sources (Connections → Data sources in recent Grafana releases) → Add data source → Prometheus.

Set URL to http://prometheus:9090 and save.

Import the official Spring Boot dashboard (ID 12900) and select the Prometheus data source.
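The data source steps above can also be automated with Grafana's file-based provisioning, so that the container comes up with Prometheus already connected. This is an optional sketch, not part of the original setup; the file path below and the extra volume mount it needs (./grafana/provisioning to /etc/grafana/provisioning on the grafana service) are assumptions.

```yaml
# grafana/provisioning/datasources/prometheus.yml (hypothetical path)
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                 # Grafana's backend calls Prometheus, not the browser
    url: http://prometheus:9090   # same URL as entered manually above
    isDefault: true
```

The URL uses the Compose service name because Grafana resolves it inside teaching-network; http://localhost:9090 would point at the Grafana container itself.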

Alert Rules (Prometheus)

# prometheus/alerts.yml
groups:
- name: service_alerts
  rules:
  - alert: ServiceDown
    expr: up{job=~"order-service|stock-service|point-service"} == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Service {{ $labels.job }} is down"
      description: "Service {{ $labels.job }} has not responded for over 1 minute"
  - alert: HighErrorRate
    # Aggregate by job so that {{ $labels.job }} in the summary is populated
    expr: |
      sum by (job) (rate(http_server_requests_seconds_count{status=~"5.."}[2m])) /
      sum by (job) (rate(http_server_requests_seconds_count[2m])) > 0.05
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.job }} error rate too high"
      description: "Error rate exceeds 5%"
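  - alert: SlowResponse
    # Needs the _bucket series: set
    # management.metrics.distribution.percentiles-histogram.http.server.requests=true
    expr: |
      histogram_quantile(0.99, sum(rate(http_server_requests_seconds_bucket[5m])) by (le, job)) > 2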
    for: 3m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.job }} response too slow"
      description: "P99 latency exceeds 2 seconds"
  - alert: HighMemoryUsage
    # Aggregate by job so that {{ $labels.job }} in the summary is populated
    expr: |
      (sum by (job) (jvm_memory_used_bytes{area="heap"}) / sum by (job) (jvm_memory_max_bytes{area="heap"})) > 0.85
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.job }} JVM memory high"
      description: "Heap usage over 85%"
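Prometheus only evaluates these rules if the file is listed in its main configuration and is visible inside the container. A sketch of the two additions this likely requires, assuming the container path /etc/prometheus/alerts.yml (the original Compose file mounts only prometheus.yml):

```yaml
# prometheus/prometheus.yml (addition)
rule_files:
  - /etc/prometheus/alerts.yml

# docker-compose.yml, prometheus service (extra entry in its volumes list):
#   volumes:
#     - ./prometheus/alerts.yml:/etc/prometheus/alerts.yml
```

After a restart, the rules and their current state can be inspected at http://localhost:9090/alerts. Since no Alertmanager is configured in this setup, firing alerts are visible in the Prometheus UI but are not sent anywhere.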

Verification of Observability

Start all services: docker-compose up -d.

Generate test traffic (100 POST requests to /api/order/create with a 0.1 s pause).

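Query Prometheus for QPS and error rate, e.g.:

sum(rate(http_server_requests_seconds_count{application="order-service"}[1m]))

The application label exists only if the services add it as a common tag (management.metrics.tags.application); otherwise filter on job="order-service" instead.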

Open Grafana at http://localhost:3000 and check QPS trends, latency distribution, and JVM metrics.
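The traffic-generation and query steps above can be scripted with the JDK's built-in HTTP client (Java 11+). This is a sketch under stated assumptions: the base URLs, the assumption that order-service listens on port 8081, and the JSON payload fields are all hypothetical and need to be adapted to the real OrderCreateDTO. The query goes through Prometheus's HTTP API endpoint /api/v1/query.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

final class TrafficGenerator {
    // Assumed base URLs; adjust to your environment.
    static final String ORDER_SERVICE = "http://localhost:8081";
    static final String PROMETHEUS = "http://localhost:9090";

    /** Builds one POST to /api/order/create with a hypothetical JSON payload. */
    static HttpRequest orderRequest(int userId) {
        // Field names are placeholders, not the real OrderCreateDTO schema.
        String body = "{\"productId\":1,\"count\":1,\"userId\":" + userId + "}";
        return HttpRequest.newBuilder(URI.create(ORDER_SERVICE + "/api/order/create"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    /** Builds the Prometheus HTTP API URI for an instant query. */
    static URI queryUri(String promql) {
        String encoded = URLEncoder.encode(promql, StandardCharsets.UTF_8);
        return URI.create(PROMETHEUS + "/api/v1/query?query=" + encoded);
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        int serverErrors = 0;
        for (int i = 0; i < 100; i++) {
            HttpResponse<Void> response =
                    client.send(orderRequest(i), HttpResponse.BodyHandlers.discarding());
            if (response.statusCode() >= 500) {
                serverErrors++;
            }
            Thread.sleep(100); // 0.1 s pause between requests
        }
        System.out.println("5xx responses: " + serverErrors);

        // Ask Prometheus for the request rate it has observed for order-service.
        String promql =
                "sum(rate(http_server_requests_seconds_count{job=\"order-service\"}[1m]))";
        HttpRequest query = HttpRequest.newBuilder(queryUri(promql)).GET().build();
        System.out.println(client.send(query, HttpResponse.BodyHandlers.ofString()).body());
    }
}
```

The rate query only returns data after Prometheus has scraped the service at least twice within the 1m window, so it is worth waiting a few scrape intervals after the traffic run before reading the result.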

Common Issues & Pitfalls

Pitfall 1: /actuator/prometheus returns 404

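Cause : the prometheus endpoint is not exposed over HTTP, or the micrometer-registry-prometheus dependency is missing, in which case the endpoint is never created.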

Fix :

management:
  endpoints:
    web:
      exposure:
        include: prometheus,metrics

Pitfall 2: Prometheus cannot scrape targets

Check http://localhost:9090/targets – ensure targets are UP.

Verify container network connectivity.

Confirm the metrics path ( /actuator/prometheus) is correct.

Pitfall 3: Custom metrics not appearing

Make sure MeterRegistry is injected.

Confirm the custom metric methods are invoked.

Wait for at least one scrape interval (commonly set to 15 s in prometheus.yml; Prometheus's built-in default is 1 minute).

Next Episode Preview

Spring Cloud Microservices in Practice – Revised Edition (Part 10): Full Docker‑Compose Deployment, covering one‑click service startup, orchestration optimisation, environment isolation, and production‑grade configuration.
