Introduction, Architecture, Deployment and Usage of Grafana Loki Log Aggregation System
This article introduces Grafana Loki, an open‑source, horizontally scalable, highly available log aggregation system optimized for Kubernetes and Prometheus, covering its core concepts, architecture, component roles, deployment steps, configuration examples, and practical usage within Grafana.
Preface
When designing a container‑cloud logging solution, the heavyweight nature of ELK/EFK and the limited need for complex Elasticsearch queries led to the selection of Grafana's open‑source Loki system.
The article also notes that familiarity with the mature EFK solution remains valuable.
Overview
Loki is a horizontally scalable, highly available, multi‑tenant log aggregation system from Grafana Labs.
It is cost‑effective and easy to operate because it indexes only log stream metadata (labels) rather than full log content, making it especially suited for Prometheus and Kubernetes users.
Inspired by Prometheus, Loki’s tagline is "Like Prometheus, but for logs."
Project address: https://github.com/grafana/loki/
Key features compared with other log systems include:
No full‑text indexing; stores compressed unstructured logs and indexes only metadata, reducing cost.
Uses the same label‑based indexing as Prometheus, enabling efficient grouping and alertmanager integration.
Optimized for Kubernetes pod logs; pod labels are automatically indexed.
Native Grafana support, avoiding the need to switch between Kibana and Grafana.
Architecture
Diagram omitted (refer to original images).
Component Description
Key components:
Promtail – the collector, analogous to Filebeat.
Loki – the server side, analogous to Elasticsearch.
Loki processes run in four roles:
Querier – query engine.
Ingesters – log storage.
Query‑frontend – front‑end query handler.
Distributor – write dispatcher.
The role can be set via the -target flag on the Loki binary.
Read Path
Querier receives HTTP/1 requests.
Querier forwards the query to all ingesters to retrieve in‑memory data.
Ingesters return matching data, if any.
If no ingester returns data, the querier loads data from the back‑end store.
The querier deduplicates and streams the final dataset back over HTTP/1.
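The read path can be exercised by hand over Loki's HTTP API. A minimal sketch, assuming Loki listens on localhost:3100 (an assumption; the block only builds and prints the request URL, so it runs without a live server — `/loki/api/v1/query_range` is Loki's standard range-query endpoint):

```shell
# Build a range query against the querier's HTTP API.
LOKI_ADDR="http://localhost:3100"        # assumed address
LOGQL='{job="message"} |= "kubelet"'

# URL-encode the LogQL expression so it is safe in a query string.
ENCODED=$(python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1]))' "$LOGQL")
URL="$LOKI_ADDR/loki/api/v1/query_range?query=$ENCODED&limit=10"

# Print the request; once Loki is up, execute it with:
#   curl -s "$URL"
echo "$URL"
```

Running the printed `curl` against a live instance returns matching log lines as JSON, whether they were served from ingester memory or the back-end store.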
Write Path
Diagram omitted.
Distributor receives an HTTP/1 request to store stream data.
Each stream is hashed using a ring hash.
Distributor forwards each stream to the appropriate ingester and its replicas.
Each instance creates or appends a block for the stream; blocks are unique per tenant and label set.
Distributor responds with a success code over HTTP/1.
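The write path can likewise be driven manually. A hedged sketch that builds a push payload for Loki's `/loki/api/v1/push` endpoint — the JSON shape (streams carrying a label map plus `[timestamp, line]` pairs) is Loki's documented push format; the address and label values are assumptions:

```shell
# Build a push payload for the distributor's HTTP endpoint.
LOKI_ADDR="http://localhost:3100"   # assumed address
TS="$(date +%s)000000000"           # Loki expects a nanosecond epoch, as a string

PAYLOAD=$(cat <<EOF
{"streams": [{"stream": {"job": "manual-test"}, "values": [["$TS", "hello from the write path"]]}]}
EOF
)

# Print the payload; send it for real with:
#   curl -s -H 'Content-Type: application/json' -X POST "$LOKI_ADDR/loki/api/v1/push" -d "$PAYLOAD"
echo "$PAYLOAD"
```

On success the distributor answers with an empty 204 response, after forwarding the stream to the ingesters on the hash ring.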
Deployment
Local Mode Installation
Download Promtail and Loki:
wget https://github.com/grafana/loki/releases/download/v2.2.1/loki-linux-amd64.zip
wget https://github.com/grafana/loki/releases/download/v2.2.1/promtail-linux-amd64.zip
Install Promtail:
$ mkdir -pv /opt/app/{promtail,loki}
$ unzip promtail-linux-amd64.zip && mv promtail-linux-amd64 /opt/app/promtail/promtail
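The unit file below points at /opt/app/promtail/promtail.yaml, which this excerpt does not show. A minimal sketch of such a config, assuming default ports and a Loki instance on localhost (adjust clients.url to your server):

```yaml
# /opt/app/promtail/promtail.yaml — minimal sketch; addresses and paths are assumptions
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /opt/app/promtail/positions.yaml   # where read offsets are persisted

clients:
  - url: http://localhost:3100/loki/api/v1/push  # Loki's push endpoint

scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: message
          __path__: /var/log/messages
```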
# promtail systemd unit
$ cat <<EOF > /etc/systemd/system/promtail.service
[Unit]
Description=promtail server
Wants=network-online.target
After=network-online.target
[Service]
ExecStart=/opt/app/promtail/promtail -config.file=/opt/app/promtail/promtail.yaml
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=promtail
[Install]
WantedBy=default.target
EOF
systemctl daemon-reload
systemctl restart promtail
systemctl status promtail
Install Loki:
$ mkdir /opt/app/{promtail,loki} -pv
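The service below points at /opt/app/loki/loki.yaml, which this excerpt does not show. A minimal single-binary sketch for Loki 2.2 using local filesystem storage (paths and periods are assumptions):

```yaml
# /opt/app/loki/loki.yaml — minimal single-binary sketch; values are assumptions
auth_enabled: false

server:
  http_listen_port: 3100

ingester:
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
  chunk_idle_period: 5m

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

storage_config:
  boltdb_shipper:
    active_index_directory: /opt/app/loki/index
    cache_location: /opt/app/loki/cache
    shared_store: filesystem
  filesystem:
    directory: /opt/app/loki/chunks
```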
# Loki systemd unit
$ cat <<EOF > /etc/systemd/system/loki.service
[Unit]
Description=loki server
Wants=network-online.target
After=network-online.target
[Service]
ExecStart=/opt/app/loki/loki -config.file=/opt/app/loki/loki.yaml
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=loki
[Install]
WantedBy=default.target
EOF
systemctl daemon-reload
systemctl restart loki
systemctl status loki
Usage
Configure Loki datasource in Grafana
In Grafana, add a new datasource of type Loki, set the URL to http://loki:3100, and save.
After saving, switch to the Explore section to access Loki logs.
Click “Log labels” to view collected log labels and filter queries accordingly.
Example query to view /var/log/messages logs (adjust time zone if needed).
Query in Grafana Explore
rate({job="message"} |= "kubelet" [1m])
This computes a per-second rate (a QPS-style metric) of log lines in job="message" containing "kubelet", over a 1-minute window.
Index‑only Labels
Loki indexes only labels, not full log content, which reduces index size dramatically compared to Elasticsearch.
Static Label Matching Example
scrape_configs:
- job_name: system
  pipeline_stages:
  static_configs:
  - targets:
    - localhost
    labels:
      job: message
      __path__: /var/log/messages
This creates a fixed label job="message" for logs under /var/log/messages.
Dynamic Labels and High Cardinality
Dynamic label values (e.g., IP address) can cause high cardinality, leading to many streams and potential performance issues.
Example regex stage to extract action and status_code from Apache access logs:
regex:
  expression: "^(?P<ip>\S+) (?P<identd>\S+) (?P<user>\S+) \[(?P<timestamp>[\w:/]+\s[+\-]\d{4})\] \"(?P<action>\S+)\s?(?P<path>\S+)?\s?(?P<protocol>\S+)?\" (?P<status_code>\d{3}|-) (?P<size>\d+|-)\s?\"?(?P<referer>[^\"]*)\"?\s?\"?(?P<useragent>[^\"]*)?\"?$"
labels:
  action:
  status_code:
Each unique combination of extracted labels creates a separate stream and chunk.
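Loki's regex stage uses Go's named-group syntax, which Python's re module also accepts for this pattern, so the extraction can be sanity-checked locally. A sketch using a trimmed version of the regex (referer and user-agent dropped) against a made-up log line:

```shell
# Check that the (trimmed) Apache regex extracts action and status_code.
LINE='11.11.11.11 - frank [25/Jan/2000:14:00:01 -0500] "GET /1986.js HTTP/1.1" 200 932'

RESULT=$(python3 - "$LINE" <<'PY'
import re, sys
# Trimmed form of the pipeline regex above (referer/user-agent omitted).
pattern = (r'^(?P<ip>\S+) (?P<identd>\S+) (?P<user>\S+) '
           r'\[(?P<timestamp>[\w:/]+\s[+\-]\d{4})\] '
           r'"(?P<action>\S+)\s?(?P<path>\S+)?\s?(?P<protocol>\S+)?" '
           r'(?P<status_code>\d{3}|-) (?P<size>\d+|-)$')
m = re.match(pattern, sys.argv[1])
print(m.group('action'), m.group('status_code'))
PY
)
echo "$RESULT"   # GET 200
```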
High‑Cardinality Problem
Using a high‑cardinality label such as IP can generate thousands of streams, which may overwhelm Loki.
Full‑Text Index Issue
Full‑text indexes can be as large as the log data itself, requiring significant memory and making scaling difficult. Loki’s index is typically an order of magnitude smaller.
Query Acceleration without Labels
{job="apache"} |= "11.11.11.11"
Shard‑Based Query Execution
Loki splits queries into smaller shards, opens matching chunks per stream, and searches them in parallel.
Shard size and parallelism are configurable based on resources.
Deploying many query‑frontends can process large volumes of logs quickly.
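Conceptually, each shard worker applies the line filter to its chunks much like a distributed grep. A toy local sketch of what one worker does (the sample lines are made up):

```shell
# One shard worker's job in miniature: scan decompressed chunk lines
# and keep those matching the |= filter.
MATCHES=$(printf '%s\n' \
  '11.11.11.11 - "GET /index.html" 200' \
  '22.22.22.22 - "GET /login" 302' \
  '11.11.11.11 - "POST /api" 500' \
  | grep '11.11.11.11')
echo "$MATCHES"
```

Loki runs many such scans concurrently, one per shard, and merges the results.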
Index Mode Comparison
Elasticsearch maintains a large index constantly, consuming memory.
Loki builds temporary shards during query time, reducing constant overhead.
Best Practices
When log volume is low, add fewer labels to reduce chunk loading.
Add labels only when needed, e.g., when chunk_target_size=1MB and log volume justifies it.
Ensure logs are ingested in time‑order; Loki rejects old data for performance.