Unlock Scalable Cloud‑Native Alerting with Grafana Mimir: Architecture, Components, and Setup
This article explains how Grafana Mimir extends Prometheus and Alertmanager to provide a horizontally scalable, highly available, multi‑tenant monitoring solution for Kubernetes, covering its architecture, key components, compression mechanisms, deployment steps, and configuration of Alertmanager and multi‑tenant support.
Cloud‑Native Alerting Landscape
In cloud‑native environments, Kubernetes is widely used in production, and monitoring stacks built from Prometheus, exporters, Grafana, and Alertmanager are common. However, editing alert rules has traditionally meant hand‑editing configuration files in vim. Grafana Mimir combines Prometheus and Alertmanager to provide a visual, multi‑tenant alerting solution.
What Is Mimir?
Mimir offers horizontally scalable, highly available, multi‑tenant long‑term storage for Prometheus. Its architecture is illustrated below.
Metrics ingestion: Prometheus or compatible remote‑write clients send data to Mimir directly or via Grafana Agent.
Strong scalability: clusters grow by adding instances without manual sharding.
Grafana integration: users create alerts, rules and dashboards that query Mimir.
Mimir Components and Their Roles
| Type | Component Name |
| --- | --- |
| Optional | alertmanager, ruler, overrides-exporter, query-scheduler |
| Required | compactor, distributor, ingester, querier, query-frontend, store-gateway |
The following sections describe each component.
Compactor (stateless)
The compactor merges data blocks to improve query performance and reduce storage costs.
Merges a tenant's blocks into a single, larger optimized block, deduplicating samples and shrinking the index.
Keeps the per‑tenant bucket index up to date; queriers, store‑gateways, and rulers read this index to discover blocks in the bucket.
Deletes blocks that fall outside the configured retention period.
How It Works
Compaction runs at a fixed interval for each tenant. Vertical compaction first merges overlapping blocks written within the same time range (2 hours by default), deduplicating samples. Horizontal compaction then merges blocks with adjacent time ranges into larger blocks, reducing the total block count (and index overhead) without significantly changing the total data size.
Scaling
Compaction concurrency is controlled by -compactor.compaction-concurrency. Tenant sharding is configured with -compactor.compactor-tenant-shard-size.
Compression Algorithm
Mimir uses a split‑and‑merge algorithm that overcomes TSDB index limits and avoids unbounded growth for large tenants. The process consists of a split stage (optional) and a merge stage.
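As a rough illustration of the two stages, here is a toy Python model (not Mimir's actual Go implementation; `shard_of`, `split_stage`, and `merge_stage` are hypothetical names, and "blocks" are modeled as plain lists of series IDs). The split stage shards each source block's series so no single output index grows unbounded; the merge stage then combines split blocks that belong to the same shard:

```python
import hashlib

def shard_of(series_id: str, shard_count: int) -> int:
    """Assign a series to a shard by hashing its ID (stand-in for Mimir's sharding)."""
    digest = hashlib.sha256(series_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % shard_count

def split_stage(block: list[str], shard_count: int) -> dict[int, list[str]]:
    """Split one source block into up to `shard_count` smaller blocks."""
    shards: dict[int, list[str]] = {}
    for series in block:
        shards.setdefault(shard_of(series, shard_count), []).append(series)
    return shards

def merge_stage(split_blocks: list[dict[int, list[str]]]) -> dict[int, list[str]]:
    """Merge split blocks belonging to the same shard, deduplicating series."""
    merged: dict[int, set[str]] = {}
    for shards in split_blocks:
        for shard, series_list in shards.items():
            merged.setdefault(shard, set()).update(series_list)
    return {shard: sorted(s) for shard, s in merged.items()}

# Two source blocks sharing one series; after split+merge each shard holds
# each series exactly once.
blocks = [["cpu{host=a}", "mem{host=a}"], ["cpu{host=a}", "cpu{host=b}"]]
result = merge_stage([split_stage(b, shard_count=2) for b in blocks])
```

Because the shard assignment is deterministic, a series always lands in the same final block, so the merge stage can deduplicate safely.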
Deletion Process
After successful compaction, original blocks are soft‑deleted (marked) and later hard‑deleted after a configurable delay, ensuring queries see the new compacted blocks before the old ones disappear.
Distributor (stateless)
The distributor receives time‑series from Prometheus or Grafana Agent, validates them, applies tenant limits, batches them, and forwards them to ingesters with configurable replication (default 3).
Validation
Metric metadata and labels follow the Prometheus exposition format.
Metadata length, label count, label name/value length, and sample timestamps are checked against limits such as -validation.max-metadata-length and -validation.max-label-names-per-series.
Rate Limiting
Two limits per tenant: request rate (max requests per second) and ingestion rate (max samples per second). Exceeding limits returns HTTP 429.
High‑Availability Tracker
When Prometheus HA pairs are configured, the distributor deduplicates incoming series to avoid double‑counting.
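The essential idea, electing one replica per (tenant, cluster) pair and dropping samples from the other until it fails over, can be sketched as follows (a toy Python model with a hypothetical `HATracker` class, not Mimir's implementation):

```python
class HATracker:
    """Accept samples only from the elected replica of each HA pair."""

    def __init__(self, failover_timeout: float) -> None:
        self.failover_timeout = failover_timeout
        # (tenant, cluster) -> (elected replica, last time it was seen)
        self.elected: dict[tuple[str, str], tuple[str, float]] = {}

    def accept(self, tenant: str, cluster: str, replica: str, now: float) -> bool:
        key = (tenant, cluster)
        current = self.elected.get(key)
        # Elect this replica if none is elected, it is already elected,
        # or the elected replica has been silent past the failover timeout.
        if current is None or current[0] == replica or now - current[1] > self.failover_timeout:
            self.elected[key] = (replica, now)
            return True
        return False  # duplicate sample from the standby replica: dropped

# Replica "a" is elected first; "b" is deduplicated until "a" goes silent
# longer than the failover timeout, at which point "b" takes over.
tracker = HATracker(failover_timeout=30.0)
accepted = [
    tracker.accept("tenant", "cluster", replica, t)
    for replica, t in [("a", 0.0), ("b", 1.0), ("a", 2.0), ("b", 40.0)]
]
```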
Sharding & Replication
Series are sharded and replicated across ingesters using a consistent hash ring. The replication factor is set via -ingester.ring.replication-factor (default 3).
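How a consistent hash ring picks the replica set for a series can be illustrated with a small Python sketch (hypothetical `Ring` class for illustration only; Mimir's real ring also tracks instance health, zones, and token ownership):

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    """Map a string onto the ring's 64-bit token space."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

class Ring:
    """Each ingester owns many tokens; a series is written to the first
    `replication_factor` distinct ingesters found walking clockwise
    from the series' hash."""

    def __init__(self, ingesters: list[str], tokens_per_ingester: int = 64) -> None:
        self.ring = sorted(
            (_hash(f"{ing}/{i}"), ing)
            for ing in ingesters
            for i in range(tokens_per_ingester)
        )

    def replicas(self, series_key: str, replication_factor: int = 3) -> list[str]:
        start = bisect.bisect(self.ring, (_hash(series_key), ""))
        chosen: list[str] = []
        for offset in range(len(self.ring)):
            _, ing = self.ring[(start + offset) % len(self.ring)]
            if ing not in chosen:
                chosen.append(ing)
                if len(chosen) == replication_factor:
                    break
        return chosen

ring = Ring([f"ingester-{n}" for n in range(1, 6)])
owners = ring.replicas('up{instance="node-1"}')  # three distinct ingesters
```

Because the mapping is deterministic, the same series always goes to the same replica set, and adding an instance only moves the series adjacent to its new tokens.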
Ingester (stateful)
Ingesters hold incoming series in memory and periodically flush them to long‑term storage (every 2 hours by default). Batching writes in memory reduces write amplification against object storage; ingesters also support replication, a write‑ahead log (WAL) for crash recovery, and zone‑aware replication.
Querier (stateless)
Queriers evaluate PromQL expressions by reading recent data from ingesters and historic blocks from the store‑gateway. They maintain an up‑to‑date view of bucket metadata, either via bucket index download or bucket scanning.
Query‑Frontend (stateless)
The query‑frontend exposes the same API as queriers and adds request queuing, query splitting (into 24‑hour intervals by default), and result caching (Memcached). Setting -query-frontend.align-queries-with-step=true additionally aligns query start and end times with the step parameter to improve cache hit rates, at the cost of slightly less accurate results.
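The splitting step, cutting one long range query at day boundaries into partial queries that can be cached and executed in parallel, can be sketched like this (an illustrative Python model; `split_query_range` is a hypothetical name):

```python
def split_query_range(
    start_ms: int,
    end_ms: int,
    interval_ms: int = 24 * 3600 * 1000,  # default split interval: 24 hours
) -> list[tuple[int, int]]:
    """Split [start_ms, end_ms) into interval-aligned partial query ranges."""
    partials: list[tuple[int, int]] = []
    cursor = start_ms
    while cursor < end_ms:
        boundary = (cursor // interval_ms + 1) * interval_ms  # next day boundary
        partials.append((cursor, min(boundary, end_ms)))
        cursor = min(boundary, end_ms)
    return partials

# A 2.5-day query starting at noon is cut into three day-aligned pieces;
# the middle, fully-aligned piece is the one most likely to hit the cache.
DAY_MS = 24 * 3600 * 1000
partials = split_query_range(DAY_MS // 2, 2 * DAY_MS + DAY_MS // 2)
```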
Store‑Gateway (stateful)
The store‑gateway queries long‑term storage blocks for both queriers and rulers, using bucket index or scanning to keep its view current. It supports in‑memory and Memcached caching for metadata and block data.
Alertmanager (optional)
Mimir Alertmanager adds multi‑tenant support and horizontal scaling to Prometheus Alertmanager, deduplicating and routing alerts to email, PagerDuty, OpsGenie, etc.
Overrides‑Exporter (optional)
Exports per‑tenant limit metrics so operators can monitor resource usage.
Query‑Scheduler (optional)
Queues queries and distributes workload among available queriers.
Ruler (optional)
Evaluates recording and alerting rules defined in PromQL for each tenant.
Installation
Download the Mimir binary from the official site or deploy it in a Kubernetes cluster. The example below shows a non‑multi‑tenant configuration.
```yaml
alertmanager:
  external_url: http://127.0.0.1:8080/alertmanager
  sharding_ring:
    replication_factor: 2
ingester:
  ring:
    replication_factor: 1
multitenancy_enabled: false
ruler:
  alertmanager_url: http://127.0.0.1:8080/alertmanager
  external_url: http://127.0.0.1:8080/ruler
  query_frontend:
    address: 127.0.0.1:9095
  query_stats_enabled: true
  rule_path: ./ruler/
ruler_storage:
  filesystem:
    dir: ./rules-storage
store_gateway:
  sharding_ring:
    replication_factor: 1
target: all,alertmanager,ruler
```

Start the service:

```shell
/usr/local/mimir/mimir-darwin-amd64 --config.file /usr/local/mimir/mimir.yaml
```

Viewing Status
Open the homepage in a browser to see service health.
Check running status, readiness, node list, and multi‑tenant view via the provided UI screenshots.
Configuring Alertmanager
Prepare Configuration File
```yaml
global:
  resolve_timeout: 5m
  http_config:
    follow_redirects: true
    enable_http2: true
  smtp_from: [email protected]
  smtp_hello: mimir
  smtp_smarthost: smtp.qq.com:587
  smtp_auth_username: [email protected]
  smtp_require_tls: true
route:
  receiver: email
  group_by:
    - alertname
  continue: false
  routes:
    - receiver: email
      group_by:
        - alertname
      matchers:
        - severity="info"
      mute_time_intervals:
        - 夜间
      continue: true
  group_wait: 10s
  group_interval: 5s
  repeat_interval: 6h
inhibit_rules:
  - source_match:
      severity: warning
    target_match:
      severity: warning
    equal:
      - alertname
      - instance
receivers:
  - name: email
    email_configs:
      - send_resolved: true
        to: [email protected]
        from: [email protected]
        hello: mimir
        smarthost: smtp.qq.com:587
        auth_username: [email protected]
        headers:
          From: [email protected]
          Subject: '{{ template "email.default.subject" . }}'
          To: [email protected]
        html: '{{ template "email.default.html" . }}'
        text: '{{ template "email.default.html" . }}'
        require_tls: true
templates:
  - email.default.html
mute_time_intervals:
  - name: 夜间  # "night" interval referenced by the route above
    time_intervals:
      - times:
          - start_time: "00:00"
            end_time: "08:45"
          - start_time: "21:30"
            end_time: "23:59"
```

Upload Alertmanager Config to Mimir
```shell
mimirtool alertmanager load ./alertmanager.yaml --address http://127.0.0.1:8080 --id anonymous
```

Configure Grafana Alertmanager and Prometheus Data Sources
Follow the UI screenshots to add the Alertmanager endpoint and Prometheus (Mimir) data source.
Add Alert Rules
Use Grafana’s UI to create alerting rules that reference the uploaded Alertmanager configuration.
Configuring Multi‑Tenant Mode
Set multitenancy_enabled: true in the Mimir config file.
Prepare an Alertmanager configuration for each tenant; the tenant ID (instance_id below) can be, for example, the node name.
Load each tenant's configuration with:

```shell
mimirtool alertmanager load ./alertmanager.yaml --address http://127.0.0.1:8080 --id instance_id
```
MaGe Linux Operations
Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.