
Building a DTLE Monitoring System with Prometheus and Grafana (DTLE 3.21.07.0)

This tutorial walks through setting up a DTLE 3.21.07.0 monitoring environment by configuring DTLE and Nomad metrics, deploying Prometheus and Grafana via Docker, and creating common monitoring panels such as CPU, memory, bandwidth, latency, and TPS using PromQL.


Background

Although the DTLE documentation lists many monitoring items, users unfamiliar with Prometheus and Grafana often find the setup challenging. This article demonstrates how to build a complete DTLE monitoring system using DTLE 3.21.07.0.

1. Set Up DTLE Runtime Environment

A two‑node DTLE cluster is deployed (topology shown in the original image). When editing the DTLE configuration, two key points must be observed: enable publish_metrics and configure Nomad telemetry correctly. An example configuration for the node dtle‑src‑1 is provided below.

# In DTLE 3.21.07.0 Nomad was upgraded to 1.1.2; add the following to enable monitoring
# Older DTLE versions do not need this telemetry block
telemetry {
  prometheus_metrics = true
  collection_interval = "15s"
}

plugin "dtle" {
  config {
    data_dir = "/opt/dtle/var/lib/nomad"
    nats_bind = "10.186.63.20:8193"
    nats_advertise = "10.186.63.20:8193"
    consul = "10.186.63.76:8500"
    api_addr = "10.186.63.20:8190"   # compatibility API
    nomad_addr = "10.186.63.20:4646" # compatibility API needs to reach a Nomad server
    publish_metrics = true
    stats_collection_interval = 15
  }
}
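The destination node needs the same two settings. A symmetric sketch for dtle-dest-1 follows; the addresses are assumed from the scrape targets configured later in this article, so adjust them to your hosts:

telemetry {
  prometheus_metrics = true
  collection_interval = "15s"
}

plugin "dtle" {
  config {
    data_dir = "/opt/dtle/var/lib/nomad"
    nats_bind = "10.186.63.76:8193"
    nats_advertise = "10.186.63.76:8193"
    consul = "10.186.63.76:8500"
    api_addr = "10.186.63.76:8190"   # compatibility API
    nomad_addr = "10.186.63.76:4646" # compatibility API needs to reach a Nomad server
    publish_metrics = true
    stats_collection_interval = 15
  }
}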

Two jobs are added to simulate data transfer between two MySQL instances (illustrated in the original diagram).

2. Deploy Prometheus

A prometheus.yml file is prepared to scrape both Nomad and DTLE metrics, with appropriate instance labels (recommended to use the DTLE server hostname). The service is then launched in Docker.

shell> cat /path/to/prometheus.yml
global:
  scrape_interval: 15s   # default is 1m
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'nomad'
    scrape_interval: 15s
    metrics_path: '/v1/metrics'
    params:
      format: ['prometheus']
    static_configs:
      - targets: ['10.186.63.20:4646']
        labels:
          instance: nomad-src-1
      - targets: ['10.186.63.76:4646']
        labels:
          instance: nomad-dest-1

  - job_name: 'dtle'
    scrape_interval: 15s
    metrics_path: '/metrics'
    static_configs:
      - targets: ['10.186.63.20:8190']
        labels:
          instance: dtle-src-1
      - targets: ['10.186.63.76:8190']
        labels:
          instance: dtle-dest-1

Prometheus is started with:

shell> docker run -itd -p 9090:9090 --name=prometheus --hostname=prometheus --restart=always -v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus

Access http://${prometheus_server_ip}:9090/targets in a browser to verify that the targets are being scraped.

3. Deploy Grafana

Grafana is also run in Docker:

shell> docker run -d --name=grafana -p 3000:3000 grafana/grafana

After logging in with the default credentials (admin/admin), a Prometheus data source is added by providing its URL and testing the connection. Panels are then created; the article shows an example of a CPU-usage panel.
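As an alternative to clicking through the UI, the data source can be provisioned from a file that Grafana reads at startup from /etc/grafana/provisioning/datasources/. The sketch below assumes Prometheus is reachable at 10.186.63.20:9090; swap in your own address:

shell> cat /path/to/datasource.yml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://10.186.63.20:9090
    isDefault: true

shell> docker run -d --name=grafana -p 3000:3000 -v /path/to/datasource.yml:/etc/grafana/provisioning/datasources/datasource.yml grafana/grafana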

4. Common Monitoring Items

Links to the full list of Nomad and DTLE metrics are provided. The article includes a table of frequently used PromQL expressions for CPU usage, memory consumption, bandwidth (source and destination), data latency, TPS, and buffer sizes, each with the corresponding unit.
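As a starting point, a few expressions of the kind that table contains are sketched below. The process_* metrics are standard Go-client metrics that both Nomad and DTLE expose; the dtle_* names are assumptions and should be verified against the actual /metrics output of your DTLE version:

# CPU usage of the DTLE process, in percent
rate(process_cpu_seconds_total{job="dtle"}[1m]) * 100

# Resident memory of the DTLE process, in bytes
process_resident_memory_bytes{job="dtle"}

# Outbound bandwidth on the source side, in bytes/s (metric name assumed)
rate(dtle_network_out_bytes{instance="dtle-src-1"}[1m])

# Replication delay in seconds (metric name assumed)
dtle_delay_time{instance="dtle-dest-1"}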

5. Create Multiple Panels Simultaneously

Finally, several panels are combined on a Grafana dashboard to display all the selected metrics together, as illustrated by the concluding screenshot.
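If the finished dashboard needs to survive re-creation of the Grafana container, it can be exported as JSON and loaded through dashboard provisioning. This is a sketch; the provider name and path are arbitrary choices, and the JSON files are mounted into the container at that path:

shell> cat /path/to/dashboard-provider.yml
apiVersion: 1
providers:
  - name: 'DTLE'
    type: file
    options:
      path: /var/lib/grafana/dashboards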

Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
