How Grafana Mimir Transforms Cloud‑Native Monitoring and Alerting
This article explains how Grafana Mimir provides a scalable, highly‑available, multi‑tenant long‑term storage for Prometheus, details its architecture and core components such as compactor, distributor, ingester, querier, query‑frontend and store‑gateway, and shows step‑by‑step installation, status checking, and Alertmanager configuration for cloud‑native environments.
Cloud‑Native Alerting Landscape
In cloud‑native ecosystems, Kubernetes is widely adopted, and monitoring stacks typically combine Prometheus, exporters, Grafana and Alertmanager. Editing alert rules directly with vim is cumbersome, so a visual approach is preferable: Grafana Mimir integrates with Prometheus and Alertmanager so that alert rules can be created and stored through a UI.
What Is Mimir?
Mimir offers horizontally scalable, highly available, multi‑tenant long‑term storage for Prometheus metrics. Its architecture is illustrated below.
Mimir Architecture Highlights
Metrics are ingested by Prometheus or compatible remote‑write clients and stored in Mimir.
The cluster scales automatically; adding new instances increases capacity without manual rebalancing.
Grafana provides a UI to query data, create recording rules, and define alerts across tenants, all linked to Grafana dashboards.
Key Mimir Components
The following optional and required components form a Mimir deployment.
Optional: alertmanager, ruler, overrides‑exporter, query‑scheduler
Required: compactor, distributor, ingester, querier, query‑frontend, store‑gateway
Compactor (Stateless Data Compressor)
The compactor merges blocks to improve query performance and reduce storage costs. It compresses tenant blocks, updates bucket indexes for queriers, store‑gateway and rulers, and deletes blocks outside the retention period. Compression proceeds in two stages—vertical (time‑range merging) and horizontal (adjacent range merging).
Scaling
Vertical scaling is controlled by -compactor.compaction-concurrency; horizontal scaling uses -compactor.compactor-tenant-shard-size to shard tenants across compactor instances.
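Expressed in the YAML configuration, a sketch of these two knobs might look like the following (values are illustrative; the tenant shard size is a per‑tenant limit, so it lives under the limits block):

```yaml
compactor:
  compaction_concurrency: 1      # vertical scaling: concurrent compactions per instance
limits:
  compactor_tenant_shard_size: 3 # horizontal scaling: compactors that can own a tenant (0 = all)
```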
Distributor (Data Distributor)
The distributor receives time‑series data from Prometheus or Grafana Agent, validates it against Prometheus format rules, enforces label and metadata limits, and forwards batches to multiple ingesters with a configurable replication factor (default 3). It also implements per‑tenant request and ingestion rate limiting.
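On the write side, pointing Prometheus at the distributor is a one‑line remote_write entry; the address below assumes a Mimir instance listening on 127.0.0.1:8080, as in the installation example later:

```yaml
# prometheus.yml
remote_write:
  - url: http://127.0.0.1:8080/api/v1/push
```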
Ingester (Data Receiver)
The ingester stores incoming series in memory, batches them, and periodically uploads them to long‑term storage. It supports replication, write‑ahead logs, and zone‑aware replication to avoid data loss.
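A minimal sketch of zone‑aware ingester replication (the zone name is illustrative and must be set per instance):

```yaml
ingester:
  ring:
    replication_factor: 3
    zone_awareness_enabled: true
    instance_availability_zone: zone-a  # this instance's zone
```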
Querier (Query Engine)
The querier evaluates PromQL expressions, reads recent data from ingesters and historic blocks from the store‑gateway, and can operate with bucket index enabled (default) or disabled.
Query‑Frontend
The query‑frontend provides the same API as the querier, adds request queuing, query splitting (default 24‑hour intervals), and result caching (via Memcached) to improve performance and reduce OOM risk.
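Splitting and result caching are configured on the frontend; a sketch assuming a Memcached instance reachable at memcached:11211:

```yaml
frontend:
  split_queries_by_interval: 24h  # default; long-range queries are split into day-sized partials
  cache_results: true
  results_cache:
    backend: memcached
    memcached:
      addresses: dns+memcached:11211
```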
Store‑Gateway
The store‑gateway queries long‑term blocks on behalf of queriers and rulers, supporting bucket index download or bucket scanning, block sharding, replication, and caching (in‑memory or Memcached).
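The store‑gateway's index cache can likewise be backed by Memcached (address assumed, as above):

```yaml
blocks_storage:
  bucket_store:
    index_cache:
      backend: memcached
      memcached:
        addresses: dns+memcached:11211
```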
Alertmanager Integration
Mimir’s Alertmanager adds multi‑tenant support and horizontal scaling, deduplicates and groups alerts, and routes them to channels such as email, PagerDuty or OpsGenie.
Overrides‑Exporter
Exports tenant‑specific limit metrics so operators can monitor resource usage per tenant.
Query‑Scheduler (Optional)
Maintains a queue of pending queries and distributes workload among available queriers.
Ruler (Optional)
Evaluates recording and alerting rules defined in PromQL for each tenant, grouping them into namespaces.
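Rule groups are uploaded per tenant with mimirtool; the namespace, group and alert below are illustrative:

```yaml
# rules.yaml
namespace: demo
groups:
  - name: node-alerts
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: warning
```

Load it with mimirtool rules load ./rules.yaml --address http://127.0.0.1:8080 --id <tenant>.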
Installation
Download the Mimir binary from the official site or deploy it directly in a Kubernetes cluster. Example bare‑metal deployment (single‑tenant mode):
target: all # optional components must be added explicitly
replication_factor: 3 # set to 1 for single‑node testing

Prepare a configuration file (example snippet shown below).
alertmanager:
  external_url: http://127.0.0.1:8080/alertmanager
  sharding_ring:
    replication_factor: 2
ingester:
  ring:
    replication_factor: 1
multitenancy_enabled: false
ruler:
  alertmanager_url: http://127.0.0.1:8080/alertmanager
  external_url: http://127.0.0.1:8080/ruler
  query_frontend:
    address: 127.0.0.1:9095
    query_stats_enabled: true
  rule_path: ./ruler/
ruler_storage:
  filesystem:
    dir: ./rules-storage
store_gateway:
  sharding_ring:
    replication_factor: 1
target: all,alertmanager,ruler

Start the service:
/usr/local/mimir/mimir-darwin-amd64 --config.file /usr/local/mimir/mimir.yaml

Viewing Status
After startup, open a browser to view the Mimir UI dashboards (screenshots omitted for brevity).
Configuring Alertmanager
Prepare Configuration File
global:
  resolve_timeout: 5m
  http_config:
    follow_redirects: true
    enable_http2: true
  smtp_from: [email protected]
  smtp_hello: mimir
  smtp_smarthost: smtp.qq.com:587
  smtp_auth_username: [email protected]
  smtp_require_tls: true
route:
  receiver: email
  group_by:
    - alertname
  continue: false
  routes:
    - receiver: email
      group_by:
        - alertname
      matchers:
        - severity="info"
      mute_time_intervals:
        - 夜间
      continue: true
  group_wait: 10s
  group_interval: 5s
  repeat_interval: 6h
inhibit_rules:
  - source_match:
      severity: critical # a firing critical alert mutes matching warning alerts
    target_match:
      severity: warning
    equal:
      - alertname
      - instance
receivers:
  - name: email
    email_configs:
      - send_resolved: true
        to: [email protected]
        from: [email protected]
        hello: mimir
        smarthost: smtp.qq.com:587
        auth_username: [email protected]
        headers:
          From: [email protected]
          Subject: '{{ template "email.default.subject" . }}'
          To: [email protected]
        html: '{{ template "email.default.html" . }}'
        text: '{{ template "email.default.html" . }}'
        require_tls: true
templates:
  - email.default.html
mute_time_intervals:
  - name: 夜间
    time_intervals:
      - times:
          - start_time: "00:00"
            end_time: "08:45"
          - start_time: "21:30"
            end_time: "23:59"

Upload the file to Mimir:
$ mimirtool alertmanager load ./alertmanager.yaml --address http://127.0.0.1:8080 --id anonymous

Configure Grafana to use the Mimir Alertmanager and Prometheus data sources (screenshots omitted).
Enabling Multi‑Tenant Mode
Set multitenancy_enabled: true in the Mimir configuration.
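With multi‑tenancy enabled, writers must identify their tenant via the X‑Scope‑OrgID header; for example, in a Prometheus remote_write block (tenant name illustrative):

```yaml
remote_write:
  - url: http://127.0.0.1:8080/api/v1/push
    headers:
      X-Scope-OrgID: team-a
```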
Upload an Alertmanager configuration per tenant using mimirtool alertmanager load with a unique --id (typically the node name).
$ mimirtool alertmanager load ./alertmanager.yaml --address http://127.0.0.1:8080 --id instance_id

Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
