Master Loki: Scalable Log Aggregation for Kubernetes and Prometheus
This guide introduces Loki, the open-source, horizontally scalable log aggregation system optimized for Prometheus and Kubernetes, covering its core concepts, architecture, components, deployment steps, Grafana integration, label-based indexing, and best practices for handling dynamic and high-cardinality labels.
Preface
When designing the company's container‑cloud log solution, we found mainstream ELK/EFK stacks too heavy and many Elasticsearch search features unnecessary, so we chose Grafana's open‑source Loki log system.
Below we introduce the basic concepts and architecture of Loki. EFK remains a mature solution that is still worth knowing.
Overview
Loki is the latest open‑source project from Grafana Labs, a horizontally scalable, highly available, multi‑tenant log aggregation system.
It is economical and easy to operate because it does not index log contents; instead it indexes each log stream with a set of labels, optimized for Prometheus and Kubernetes users.
The project is inspired by Prometheus: “Like Prometheus, but for logs.”
Project address: https://github.com/grafana/loki/
Compared with other log aggregation systems, Loki has the following features:
Does not perform full‑text indexing; stores compressed unstructured logs and only indexes metadata, making operation simpler and cheaper.
Uses the same label‑based indexing and grouping as Prometheus, improving scalability and allowing integration with Alertmanager.
Especially suitable for storing Kubernetes pod logs; pod labels are automatically indexed.
Native support in Grafana, avoiding switching between Kibana and Grafana.
Architecture
Components
Explanation:
Promtail acts as the collector, similar to Filebeat.
Loki serves as the backend, similar to Elasticsearch.
Loki processes consist of four roles:
Querier
Ingester
Query‑frontend
Distributor
The role can be selected via the -target parameter of the Loki binary.
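As a small illustration of the -target parameter, the sketch below prints (without executing) one command line per role. The binary and config paths are assumptions borrowed from the install layout used later in this guide, and the role names are the four listed above. When -target is omitted, Loki runs all roles in a single process, which is what the local installation below relies on.

```shell
# Dry run: print one Loki command line per role instead of starting anything.
# Paths are assumptions taken from this guide's install layout.
LOKI_BIN=/opt/app/loki/loki
LOKI_CONF=/opt/app/loki/loki.yaml

loki_command() {
    # $1 = role name passed to -target
    printf '%s -config.file=%s -target=%s\n' "$LOKI_BIN" "$LOKI_CONF" "$1"
}

for role in distributor ingester querier query-frontend; do
    loki_command "$role"
done
```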
Read path
Querier receives HTTP/1 data requests.
Querier forwards the query to all ingesters to read in‑memory data.
Ingester returns matching data (if any).
If no ingester returns data, the querier lazily loads the data from the backing store and runs the query against it.
Querier deduplicates and returns the final dataset over the HTTP/1 connection.
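The deduplication step matters because, with a replication factor above 1, several ingesters can return the same entry. The following is only a toy sketch of that merge: the two "responses" are made-up data in a simplified "<timestamp> <line>" format, and sort plus uniq stand in for the querier's real merge logic.

```shell
# Hypothetical responses from two ingester replicas: "<timestamp> <log line>".
ingester_a='1 GET /index.html
2 GET /about.html'
ingester_b='2 GET /about.html
3 POST /login'

merge_dedupe() {
    # Concatenate all responses, order them by numeric timestamp, and drop
    # entries that are exact duplicates (same timestamp and same line).
    printf '%s\n' "$@" | sort -k1,1n | uniq
}

merge_dedupe "$ingester_a" "$ingester_b"
```

Entry 2 is present in both responses but appears only once in the merged result.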
Write path
Write flow:
Distributor receives an HTTP/1 request to store stream data.
Each stream is hashed using the hash ring.
Distributor sends each stream to the appropriate ingester and its replicas (based on the configured replication factor).
Each ingester creates a chunk for the stream data or appends to an existing chunk; a chunk is unique per tenant and label set.
Distributor responds with a success code over HTTP/1.
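The placement step in the write flow above can be pictured with a deliberately simplified sketch. This is not Loki's implementation: Loki's ring assigns tokens to ingesters, whereas the code below just hashes the tenant and label set with cksum and takes the result modulo the ingester count. The ring size, replication factor, tenant name, and function names are all assumptions for illustration. The only point is that the same stream always lands on the same ingesters.

```shell
INGESTERS=3            # hypothetical ring size
REPLICATION_FACTOR=2   # hypothetical; the config in this guide uses 1

first_ingester() {
    # $1 = tenant ID, $2 = label set of the stream
    hash=$(printf '%s%s' "$1" "$2" | cksum | cut -d ' ' -f 1)
    echo $(( hash % INGESTERS ))
}

replica_set() {
    # Print the first ingester plus the next REPLICATION_FACTOR-1 neighbours.
    first=$(first_ingester "$1" "$2")
    i=0
    while [ "$i" -lt "$REPLICATION_FACTOR" ]; do
        echo $(( (first + i) % INGESTERS ))
        i=$(( i + 1 ))
    done
}

replica_set tenant-a '{job="varlogs", host="yourhost"}'
```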
Deployment
Local mode installation
Download Promtail and Loki:
wget https://github.com/grafana/loki/releases/download/v2.2.1/loki-linux-amd64.zip
wget https://github.com/grafana/loki/releases/download/v2.2.1/promtail-linux-amd64.zip
Install Promtail
# create directories
mkdir -p /opt/app/{promtail,loki}
# promtail configuration file
cat <<EOF > /opt/app/promtail/promtail.yaml
server:
  http_listen_port: 9080
  grpc_listen_port: 0
positions:
  filename: /var/log/positions.yaml # writable by promtail
client:
  url: http://localhost:3100/loki/api/v1/push
scrape_configs:
  - job_name: system
    pipeline_stages:
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          host: yourhost
          __path__: /var/log/*.log
EOF
# unzip and install
unzip promtail-linux-amd64.zip
mv promtail-linux-amd64 /opt/app/promtail/promtail
# systemd service
cat <<EOF > /etc/systemd/system/promtail.service
[Unit]
Description=promtail server
Wants=network-online.target
After=network-online.target
[Service]
ExecStart=/opt/app/promtail/promtail -config.file=/opt/app/promtail/promtail.yaml
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=promtail
[Install]
WantedBy=default.target
EOF
systemctl daemon-reload
systemctl restart promtail
systemctl status promtail
Install Loki
# create directories
mkdir -p /opt/app/{promtail,loki}
# Loki configuration file
cat <<EOF > /opt/app/loki/loki.yaml
auth_enabled: false
server:
  http_listen_port: 3100
  grpc_listen_port: 9096
ingester:
  wal:
    enabled: true
    dir: /opt/app/loki/wal
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_idle_period: 1h
  max_chunk_age: 1h
  chunk_target_size: 1048576
  chunk_retain_period: 30s
  max_transfer_retries: 0
schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
storage_config:
  boltdb_shipper:
    active_index_directory: /opt/app/loki/boltdb-shipper-active
    cache_location: /opt/app/loki/boltdb-shipper-cache
    cache_ttl: 24h
    shared_store: filesystem
  filesystem:
    directory: /opt/app/loki/chunks
compactor:
  working_directory: /opt/app/loki/boltdb-shipper-compactor
  shared_store: filesystem
limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 168h
chunk_store_config:
  max_look_back_period: 0s
table_manager:
  retention_deletes_enabled: false
  retention_period: 0s
ruler:
  storage:
    type: local
    local:
      directory: /opt/app/loki/rules
  rule_path: /opt/app/loki/rules-temp
  alertmanager_url: http://localhost:9093
  ring:
    kvstore:
      store: inmemory
  enable_api: true
EOF
# unzip and install
unzip loki-linux-amd64.zip
mv loki-linux-amd64 /opt/app/loki/loki
# systemd service
cat <<EOF > /etc/systemd/system/loki.service
[Unit]
Description=loki server
Wants=network-online.target
After=network-online.target
[Service]
ExecStart=/opt/app/loki/loki -config.file=/opt/app/loki/loki.yaml
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=loki
[Install]
WantedBy=default.target
EOF
systemctl daemon-reload
systemctl restart loki
systemctl status loki
Usage
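Before wiring up Grafana, it can help to confirm that Loki accepts writes. The sketch below builds a minimal JSON body for the push endpoint that Promtail is configured to use (/loki/api/v1/push). The helper name, the job value smoke-test, and the log line are made up for the example, and the helper does no JSON escaping, so the test line should not contain quotes or backslashes.

```shell
build_push_payload() {
    # $1 = job label value, $2 = timestamp in nanoseconds, $3 = log line
    # No JSON escaping is performed; keep $3 free of quotes and backslashes.
    printf '{"streams":[{"stream":{"job":"%s"},"values":[["%s","%s"]]}]}\n' "$1" "$2" "$3"
}

# Current time in seconds, padded to nanoseconds.
payload=$(build_push_payload smoke-test "$(date +%s)000000000" "hello loki")
echo "$payload"

# To actually send it to the local instance (assumes Loki listens on 3100):
# curl -s -H "Content-Type: application/json" -X POST \
#      -d "$payload" http://localhost:3100/loki/api/v1/push
```

If the request succeeds, the line should then be visible in Grafana Explore under the selector {job="smoke-test"}.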
Configure Loki datasource in Grafana
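In Grafana, add a new data source of type Loki and set the URL to the address of your Loki instance, e.g. http://loki:3100 (or http://localhost:3100 if Grafana runs on the same host as the local installation above), then save.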
Open the Explore section to query logs, e.g.:
rate({job="message"} |= "kubelet" [1m])
Only label indexing
Loki indexes only labels, not log contents. Example of static label matching with a Promtail configuration:
scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: message
          __path__: /var/log/messages
Query with a label selector, e.g. {job="message"} for the configuration above. Multiple jobs can be matched with a regex, e.g. {job=~"apache|syslog"}.
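One detail worth knowing about the regex matcher is that it is fully anchored: job=~"apache|syslog" selects only streams whose job label is exactly apache or exactly syslog. The sketch below imitates this with grep -E -x, which also requires a whole-line match. The function name and the sample label values are made up for the example.

```shell
# Imitate the selector {job=~"apache|syslog"} against a list of job values.
# grep -x requires the whole line to match, mirroring the anchored regex.
match_jobs() {
    # $1 = regex; job label values are read from stdin, one per line
    grep -E -x -- "$1" || true
}

printf '%s\n' apache syslog syslog-ng message | match_jobs 'apache|syslog'
```

Only apache and syslog are printed; syslog-ng is not selected because the match is anchored at both ends.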
Dynamic labels and high cardinality
Dynamic labels have values that are not fixed; high-cardinality labels have many possible values. Each distinct combination of label values creates its own stream, so such labels can produce a very large number of streams and hurt Loki's performance.
Example of extracting action and status_code from Apache access logs using a regex stage in Promtail, then promoting them to labels with a labels stage:
- regex:
    expression: "^(?P<ip>\\S+) (?P<identd>\\S+) (?P<user>\\S+) \\[(?P<timestamp>[\\w:/]+\\s[+\\-]\\d{4})\\] \"(?P<action>\\S+)\\s?(?P<path>\\S+)?\\s?(?P<protocol>\\S+)?\" (?P<status_code>\\d{3}|-) (?P<size>\\d+|-)\\s?\"?(?P<referer>[^\"]*)\"?\\s?\"?(?P<useragent>[^\"]*)?\"?$"
- labels:
    action:
    status_code:
Each combination of action and status_code creates a separate stream.
High‑cardinality issue
Using a label such as ip can generate thousands of streams, which may overwhelm Loki.
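To make the difference concrete, the following sketch counts how many streams a small, made-up access-log sample would produce under two labeling choices. The sample lines and function names are assumptions for illustration; awk field positions stand in for the values a Promtail regex stage would extract.

```shell
# Hypothetical access-log sample: six requests from six client IPs.
sample_log='11.11.11.11 - frank [25/Jan/2000:14:00:01 -0500] "GET /1986.js HTTP/1.1" 200 932
11.11.11.12 - frank [25/Jan/2000:14:00:02 -0500] "POST /1986.js HTTP/1.1" 200 932
11.11.11.13 - frank [25/Jan/2000:14:00:03 -0500] "GET /1986.js HTTP/1.1" 400 932
11.11.11.14 - frank [25/Jan/2000:14:00:04 -0500] "POST /1986.js HTTP/1.1" 400 932
11.11.11.15 - frank [25/Jan/2000:14:00:05 -0500] "GET /1986.js HTTP/1.1" 200 932
11.11.11.16 - frank [25/Jan/2000:14:00:06 -0500] "GET /1986.js HTTP/1.1" 200 932'

count_unique() {
    # Reads label combinations on stdin, prints how many distinct ones exist.
    sort -u | wc -l | tr -d ' '
}

streams_by_action_status() {
    # Field 6 is the quoted method ("GET), field 9 the status code.
    printf '%s\n' "$sample_log" | awk '{ print $6, $9 }' | count_unique
}

streams_by_ip() {
    # Field 1 is the client IP.
    printf '%s\n' "$sample_log" | awk '{ print $1 }' | count_unique
}

echo "action + status_code: $(streams_by_action_status) streams"
echo "ip:                   $(streams_by_ip) streams"
```

Here action plus status_code yields 4 streams and can never exceed the number of method and status combinations, while ip yields 6 and keeps growing with every new client.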
Full‑text indexing problem
Full-text indexes are often as large as the log data itself, or larger, and must be kept loaded (ideally in memory) to serve queries, which makes them expensive and hard to scale. Loki's index is typically an order of magnitude smaller than the ingested logs.
Query acceleration without label fields
Example filter expression: {job="apache"} |= "11.11.11.11".
Sharding during query
Loki splits the query into smaller shards, opens every chunk of the streams that match the label selector, and searches those chunks for the IP.
Shard size and parallelism are configurable.
Deploy more queriers (fed by the query-frontend) to search large volumes quickly.
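The brute-force idea behind {job="apache"} |= "11.11.11.11" can be imitated on a tiny scale with grep and xargs. This is only an analogy under stated assumptions: the three files are made-up stand-ins for chunks, grep -F plays the role of the |= filter, and xargs -P plays the role of parallel query workers.

```shell
# Build three hypothetical "chunks" in a temporary directory.
chunk_dir=$(mktemp -d)
trap 'rm -rf "$chunk_dir"' EXIT
printf '11.11.11.11 GET /a\n10.0.0.1 GET /b\n' > "$chunk_dir/chunk-1"
printf '10.0.0.2 GET /c\n'                    > "$chunk_dir/chunk-2"
printf '11.11.11.11 POST /d\n'                > "$chunk_dir/chunk-3"

search_chunks() {
    # $1 = literal string to find, $2 = number of parallel workers
    # -F: fixed string (dots are not wildcards), -h: omit file names.
    # Each worker scans exactly one chunk; "|| true" ignores chunks with no hit.
    printf '%s\n' "$chunk_dir"/chunk-* \
        | xargs -n 1 -P "$2" grep -F -h -- "$1" || true
}

search_chunks 11.11.11.11 3
```

Raising the worker count shortens the wall-clock time of the scan without any index on the IP, which is the trade-off Loki makes.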
Index mode comparison
Elasticsearch must maintain a large index at all times, whether or not any queries are running.
Loki spins up parallel query shards only at query time, so there is little constant overhead.
Best practices
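When log volume is low, use fewer labels: every extra label value splits the logs into more streams and therefore into more, smaller chunks.
Add labels only when needed; e.g., with chunk_target_size set to 1 MB (compressed, roughly 5 to 10 MB of uncompressed logs), consider adding a label to split a stream only if that stream writes this much within max_chunk_age.
Logs must be ingested in ascending time order within each stream; Loki rejects out-of-order (older) entries for performance reasons.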
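The ordering rule can be pictured with a short sketch that filters a single stream the way an ingester would, under simplifying assumptions: entries are "<timestamp> <line>" pairs, the function name is made up, and anything older than the newest accepted timestamp is dropped with a note on stderr.

```shell
accepted_entries() {
    # Reads "<timestamp> <line>" entries of ONE stream on stdin and prints
    # only those whose timestamp is not older than the newest accepted one.
    awk '($1 + 0) >= newest { newest = $1 + 0; print; next }
         { print "rejected (out of order): " $0 > "/dev/stderr" }'
}

printf '%s\n' '100 first' '105 second' '103 late arrival' '106 third' \
    | accepted_entries
```

The entry stamped 103 arrives after 105 and is rejected; the other three pass through. Because the check is per stream, shipping a late-arriving source under its own label set avoids the rejection.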
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
