How Loki Cuts Log Storage Costs While Integrating Deeply with Prometheus

This article explains Loki's origins, data model, LogQL query language, low‑cost storage design, and the full read‑write architecture—including Distributor, Ingester, Querier, and QueryFrontend—showing how it solves the shortcomings of traditional Elasticsearch‑based logging solutions and integrates tightly with Prometheus monitoring.

JD Cloud Developers

Dec 17, 2020

Introduction

Loki is an emerging log solution that has gained attention because it addresses the high cost and complexity of traditional Elasticsearch (ES)‑based logging stacks. JD Cloud replaces ES with Loki for its cloud‑wing log service, aiming for deep integration with monitoring and very low storage overhead.

Why Loki Was Needed

During incident investigation, monitoring systems (e.g., Prometheus) detect abnormal metrics, but the metrics lack detailed context. Operators must switch to a separate log system, often ES+Kibana, which has different concepts, query syntax, and UI, increasing learning cost and slowing root‑cause analysis.

Full‑text indexing in ES also inflates storage because every log line is indexed, consuming both disk space and CPU during writes, which is wasteful for write‑heavy, read‑light log workloads.

Loki’s Goal

Loki aims to provide a log system that integrates tightly with monitoring while keeping costs minimal.

Data Model

Inspired by Prometheus, each log entry consists of labels, a timestamp, and content. Entries sharing identical labels belong to the same log stream:

{
  "stream": {"label1":"value1","label2":"value2"},
  "values": [
    ["<timestamp nanoseconds>","log content"],
    ["<timestamp nanoseconds>","log content"]
  ]
}

Labels describe the cluster, service, host, application, and so on, and are used for both ingestion and querying. Loki also supports multi‑tenancy: streams are scoped per tenant, so identical label sets under different tenants form separate streams.
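As a minimal sketch of how entries map to streams, the snippet below groups entries by tenant and label set. The function names (`stream_key`, `group_into_streams`) and the sample tenants and labels are hypothetical, chosen for illustration only; they are not Loki's internal API.

```python
from collections import defaultdict


def stream_key(tenant_id, labels):
    """Identify a stream by tenant plus its full, order-independent label set."""
    return (tenant_id, tuple(sorted(labels.items())))


def group_into_streams(entries):
    """Group (tenant_id, labels, timestamp_ns, line) tuples into streams."""
    streams = defaultdict(list)
    for tenant_id, labels, ts_ns, line in entries:
        streams[stream_key(tenant_id, labels)].append((ts_ns, line))
    return dict(streams)


entries = [
    ("tenant-a", {"app": "api", "host": "h1"}, 1, "started"),
    ("tenant-a", {"host": "h1", "app": "api"}, 2, "ready"),    # same labels, other order
    ("tenant-a", {"app": "api", "host": "h2"}, 3, "started"),  # different label value
    ("tenant-b", {"app": "api", "host": "h1"}, 4, "started"),  # different tenant
]
streams = group_into_streams(entries)
print(len(streams))  # three distinct streams
```

The first two entries land in the same stream because their label sets are identical; changing a single label value or the tenant produces a new stream.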

LogQL Query Language

Loki uses LogQL, a PromQL‑like syntax that is simple and familiar to the community. Example:

{file="debug.log"} |= "err"

This works like find + grep: the selector chooses streams, and the filter matches log lines.
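The find + grep analogy can be sketched in a few lines. This is a toy model with hypothetical helper names (`select_streams`, `filter_lines`) that handles only equality matchers and substring filters; it is not how Loki evaluates LogQL internally.

```python
def select_streams(streams, matchers):
    """The "find" step: keep streams whose labels equal every matcher."""
    return [
        s for s in streams
        if all(s["labels"].get(k) == v for k, v in matchers.items())
    ]


def filter_lines(selected, needle):
    """The "grep" step: keep lines that contain the substring."""
    return [line for s in selected for line in s["lines"] if needle in line]


streams = [
    {"labels": {"file": "debug.log"}, "lines": ["err: disk full", "ok"]},
    {"labels": {"file": "access.log"}, "lines": ["err: 500"]},
]
# Equivalent in spirit to: {file="debug.log"} |= "err"
matches = filter_lines(select_streams(streams, {"file": "debug.log"}), "err")
print(matches)  # ['err: disk full']
```

Note that the line in access.log also contains "err" but is never scanned, because its stream was excluded by the selector first.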

Grafana Integration

Grafana includes a native Loki plugin, allowing side‑by‑side exploration of metrics and logs in a single UI, eliminating the need to switch between systems.

Low Storage Cost Design

Only label metadata is indexed; the raw log content is compressed and stored without any index in a chunk store, which can be an object store (e.g., S3, GCS) or a database (e.g., Cassandra, BigTable). Because no full‑text index is built, the storage footprint can be roughly an order of magnitude smaller than with full‑text indexing.

Overall Architecture

Loki separates its read and write paths and is composed of several modules:

Clients (Promtail, Fluent Bit, Fluentd, Rsyslog): collect logs and forward them to Loki.

Distributor: entry point for writes; validates, hashes, and forwards logs to the appropriate Ingester, ensuring that logs of the same stream go to the same Ingester.

Ingester: buffers logs in memory, writes them to the underlying storage, and serves queries for recent data.

Querier: reads data from storage, applies LogQL filters, and returns results.

QueryFrontend: splits large queries into sub‑queries, schedules them across multiple Querier instances, and aggregates the responses.

Storage adapters: back‑end integrations for S3, Cassandra, BigTable, DynamoDB, etc.

Distributor Details

The Distributor uses consistent hashing combined with a replication factor to decide which Ingester(s) receive a log. Each Ingester registers a set of random 32‑bit tokens in a hash ring; the hash of a log’s labels and tenant ID determines the target token.
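The ring lookup can be illustrated with a small sketch. The `Ring` class, the use of SHA‑256 truncated to 32 bits, the token count per Ingester, and the clockwise walk that skips repeated owners are all assumptions made for this example; the article does not specify these details.

```python
import bisect
import hashlib
import random


def hash32(key):
    """Map a string to a 32-bit integer (the hash function is an assumption)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")


class Ring:
    def __init__(self, ingesters, tokens_per_ingester=8, seed=0):
        rng = random.Random(seed)  # seeded so the example is repeatable
        self.tokens = sorted(
            (rng.getrandbits(32), name)
            for name in ingesters
            for _ in range(tokens_per_ingester)
        )
        self.points = [t for t, _ in self.tokens]

    def lookup(self, tenant_id, labels, replication_factor=3):
        """Return distinct ingesters, walking clockwise from the stream's hash."""
        key = tenant_id + "/" + ",".join(f"{k}={v}" for k, v in sorted(labels.items()))
        start = bisect.bisect_left(self.points, hash32(key))
        owners = []
        for i in range(len(self.tokens)):
            name = self.tokens[(start + i) % len(self.tokens)][1]
            if name not in owners:
                owners.append(name)
            if len(owners) == replication_factor:
                break
        return owners


ring = Ring(["ingester-1", "ingester-2", "ingester-3", "ingester-4"])
print(ring.lookup("tenant-a", {"app": "api", "host": "h1"}))
```

Because the key is built from the tenant ID and the sorted labels, every entry of a given stream hashes to the same position and therefore reaches the same set of Ingesters.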

Ingester Details

Ingester validates timestamps (must be monotonically increasing per stream) and stores logs in a hierarchical in‑memory structure: Instances → Streams → Chunks → Blocks → Entries. Chunks are appended sequentially; when a chunk’s size or age exceeds thresholds, a new chunk is created.

Chunk lifecycle states:

Writing

Waiting flush

Retain

Destroy

Chunks are eventually flushed to object storage as compressed blocks (gzip, snappy, lz4).
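The append path described above can be sketched as follows. The `Stream` class is hypothetical: it uses an entry count as a stand‑in for the size and age thresholds, allows equal timestamps (an assumption), and compresses with gzip from the Python standard library rather than modelling Loki's block format.

```python
import gzip


class Stream:
    def __init__(self, max_chunk_entries=3):
        self.max_chunk_entries = max_chunk_entries  # stand-in for size/age thresholds
        self.chunks = [[]]    # the last chunk is the one being written
        self.last_ts = None

    def append(self, ts_ns, line):
        """Append in order; reject entries older than the newest one seen."""
        if self.last_ts is not None and ts_ns < self.last_ts:
            raise ValueError("out-of-order entry")
        if len(self.chunks[-1]) >= self.max_chunk_entries:
            self.chunks.append([])  # cut a new chunk; the old one awaits flush
        self.chunks[-1].append((ts_ns, line))
        self.last_ts = ts_ns

    def flush_full_chunks(self):
        """Compress every chunk except the one still being written."""
        full, self.chunks = self.chunks[:-1], self.chunks[-1:]
        return [
            gzip.compress("\n".join(f"{ts} {line}" for ts, line in c).encode())
            for c in full
        ]


stream = Stream(max_chunk_entries=2)
for ts, line in enumerate(["a", "b", "c", "d", "e"], start=1):
    stream.append(ts, line)
blobs = stream.flush_full_chunks()
print(len(blobs), len(stream.chunks))
```

Writes only ever touch the tail chunk, which is what keeps the write path sequential and cheap.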

Storage Adapter

The adapter abstracts read/write operations for various back‑ends, presenting a uniform interface to the rest of the system.
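One way to picture such an adapter is an abstract interface with interchangeable back‑ends. The `ChunkStore` interface, its two methods, and the `InMemoryStore` stand‑in are hypothetical and are not Loki's actual Go interfaces.

```python
from abc import ABC, abstractmethod


class ChunkStore(ABC):
    """Hypothetical uniform interface; not Loki's actual Go interface."""

    @abstractmethod
    def put(self, chunk_id: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, chunk_id: str) -> bytes: ...


class InMemoryStore(ChunkStore):
    """Stand-in back-end; an S3 or Cassandra adapter would expose the same methods."""

    def __init__(self):
        self._data = {}

    def put(self, chunk_id, data):
        self._data[chunk_id] = data

    def get(self, chunk_id):
        return self._data[chunk_id]


def flush(store: ChunkStore, chunk_id: str, data: bytes) -> None:
    """Caller code depends only on the interface, not on the back-end."""
    store.put(chunk_id, data)


store = InMemoryStore()
flush(store, "chunk-1", b"compressed bytes")
print(store.get("chunk-1"))
```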

Indexing

Loki indexes only label data, mapping label → log stream → chunk. The index is stored in tables with a composite hash key (tenant ID + bucket ID + label name) and a range key for the label value hash. Three index types support different query patterns (by tenant, by tenant + label, by stream).
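The label → stream → chunk mapping can be sketched with two in‑memory tables. This is a simplification under stated assumptions: the `LabelIndex` class is hypothetical, label values are used directly instead of their hashes, and only equality lookups are modelled.

```python
from collections import defaultdict


class LabelIndex:
    def __init__(self):
        # (tenant, bucket, label name) -> {label value -> set of stream IDs}
        self.label_to_streams = defaultdict(lambda: defaultdict(set))
        # (tenant, bucket, stream ID) -> list of chunk IDs
        self.stream_to_chunks = defaultdict(list)

    def add_chunk(self, tenant, bucket, stream_id, labels, chunk_id):
        for name, value in labels.items():
            self.label_to_streams[(tenant, bucket, name)][value].add(stream_id)
        self.stream_to_chunks[(tenant, bucket, stream_id)].append(chunk_id)

    def chunks_for(self, tenant, bucket, matchers):
        """Intersect the stream sets of all matchers, then collect chunk IDs."""
        stream_sets = [
            self.label_to_streams[(tenant, bucket, n)].get(v, set())
            for n, v in matchers.items()
        ]
        streams = set.intersection(*stream_sets) if stream_sets else set()
        return sorted(
            c for s in streams for c in self.stream_to_chunks[(tenant, bucket, s)]
        )


index = LabelIndex()
index.add_chunk("t1", "b1", "s1", {"app": "api", "host": "h1"}, "c1")
index.add_chunk("t1", "b1", "s2", {"app": "api", "host": "h2"}, "c2")
index.add_chunk("t1", "b1", "s1", {"app": "api", "host": "h1"}, "c3")
print(index.chunks_for("t1", "b1", {"app": "api", "host": "h1"}))  # ['c1', 'c3']
```

The index size grows with the number of distinct label combinations, not with the volume of log text, which is where the storage savings come from.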

Query Processing

LogQL queries are parsed, then the system retrieves the list of matching log‑stream IDs using the label index. It then fetches the corresponding chunk IDs, creates iterators (batch, stream, chunk, block, block‑bytes) to read log lines in order, and finally applies any content filters.
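The final stage, reading several chunks in timestamp order and then filtering, can be sketched with `heapq.merge` from the Python standard library. The function names are hypothetical, and the single iterator level here collapses the batch, stream, chunk, and block layers the article lists.

```python
import heapq


def chunk_iterator(chunk):
    """Yield (timestamp_ns, line) pairs from one chunk, already time-ordered."""
    yield from chunk


def query(chunks, needle):
    """Merge per-chunk iterators by timestamp, then apply the content filter."""
    merged = heapq.merge(*(chunk_iterator(c) for c in chunks))
    return [(ts, line) for ts, line in merged if needle in line]


chunks = [
    [(1, "a err"), (4, "b ok")],
    [(2, "c err"), (3, "d err")],
]
print(query(chunks, "err"))  # [(1, 'a err'), (2, 'c err'), (3, 'd err')]
```

The merge is lazy, so lines are pulled from each chunk only as needed; the content filter runs last because no index exists for the log text itself.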

For large time ranges, QueryFrontend splits the request into fixed‑size sub‑queries (e.g., 15‑minute windows), feeds them into a buffered queue, and runs them concurrently across multiple Querier instances.
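Splitting by time is simple interval arithmetic. The sketch below uses a hypothetical `split_by_interval` helper working in seconds; whether real windows are aligned to clock boundaries is not stated in the article, so no alignment is attempted here.

```python
def split_by_interval(start_s, end_s, interval_s=15 * 60):
    """Split [start_s, end_s) into consecutive windows of at most interval_s seconds."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    windows = []
    cursor = start_s
    while cursor < end_s:
        windows.append((cursor, min(cursor + interval_s, end_s)))
        cursor += interval_s
    return windows


print(split_by_interval(0, 3600))
# [(0, 900), (900, 1800), (1800, 2700), (2700, 3600)]
```

A one‑hour query thus becomes four independent 15‑minute sub‑queries, and the last window is shortened when the range is not a multiple of the interval.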

Concurrency Model

Each Querier maintains a gRPC bidirectional stream with QueryFrontend, pulling sub‑queries, executing them, and returning results. The Runner component controls the number of concurrent sub‑queries to avoid overwhelming the system.
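The effect of capping concurrent sub‑queries can be sketched with a thread pool. This swaps the gRPC pull model for Python's `ThreadPoolExecutor`, so it illustrates only the bounded‑concurrency idea; `run_subqueries` and `fake_execute` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subqueries(windows, execute, max_concurrency=4):
    """Run execute(window) for each window with at most max_concurrency in flight.

    Results come back in window order, so they can be concatenated directly.
    """
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(execute, windows))


def fake_execute(window):
    # Placeholder for a Querier call; returns one synthetic line per window.
    start, end = window
    return [f"line in [{start}, {end})"]


results = run_subqueries([(0, 900), (900, 1800), (1800, 2700)], fake_execute, 2)
merged = [line for part in results for line in part]
print(merged)
```

With `max_concurrency=2`, the third window waits until one of the first two finishes, which mirrors how a concurrency limit keeps a large query from monopolizing the Queriers.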

Conclusion

Loki’s design (sequential append‑only writes, label‑only indexing, compressed chunks in object storage, and a modular read‑write pipeline) delivers a cost‑effective log solution that shares Prometheus’s label model and simplifies observability workflows.
