How Loki Cuts Log Storage Costs While Integrating Deeply with Prometheus
This article explains Loki's origins, data model, LogQL query language, low‑cost storage design, and the full read‑write architecture—including Distributor, Ingester, Querier, and QueryFrontend—showing how it solves the shortcomings of traditional Elasticsearch‑based logging solutions and integrates tightly with Prometheus monitoring.
Introduction
Loki is an emerging log solution gaining attention because it addresses the high cost and complexity of traditional Elasticsearch‑based logging stacks. JD Cloud replaces ES with Loki for its cloud‑wing log service, aiming for deep integration with monitoring and extremely low storage overhead.
Why Loki Was Needed
During incident investigation, monitoring systems (e.g., Prometheus) detect abnormal metrics, but the metrics lack detailed context. Operators must switch to a separate log system, often ES+Kibana, which has different concepts, query syntax, and UI, increasing learning cost and slowing root‑cause analysis.
Full‑text indexing in ES also inflates storage because every log line is indexed, consuming both disk space and CPU during writes, which is wasteful for write‑heavy, read‑light log workloads.
Loki’s Goal
Loki aims to provide a log system that integrates tightly with monitoring while keeping costs minimal.
Data Model
Inspired by Prometheus, each log entry consists of labels, a timestamp, and content. Entries sharing identical labels belong to the same log stream:
{
"stream": {"label1":"value1","label2":"value2"},
"values": [
["<timestamp nanoseconds>","log content"],
["<timestamp nanoseconds>","log content"]
]
}
Labels describe cluster, service, host, application, and similar dimensions, and are used for both ingestion and querying. Loki also supports multi‑tenant environments: streams are scoped per tenant, so identical label sets under different tenants are separate streams.
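To make the stream concept concrete, here is a minimal Python sketch that groups raw entries into streams by their label set and emits the JSON shape shown above. The function name `group_into_streams`, the sample labels, and the outer `streams` wrapper are illustrative assumptions, not taken from the original article.

```python
from collections import defaultdict


def group_into_streams(entries):
    """Group (labels, timestamp_ns, line) tuples into streams keyed by label set."""
    streams = defaultdict(list)
    for labels, ts_ns, line in entries:
        # Identical label sets map to the same stream, regardless of key order.
        key = tuple(sorted(labels.items()))
        streams[key].append([str(ts_ns), line])
    return [
        # Within a stream, values are ordered by timestamp.
        {"stream": dict(key), "values": sorted(values, key=lambda v: int(v[0]))}
        for key, values in streams.items()
    ]


# Hypothetical sample data: two entries share labels, one does not.
entries = [
    ({"app": "api", "host": "h1"}, 1_700_000_000_000_000_002, "second line"),
    ({"host": "h1", "app": "api"}, 1_700_000_000_000_000_001, "first line"),
    ({"app": "db", "host": "h2"}, 1_700_000_000_000_000_003, "db line"),
]
payload = {"streams": group_into_streams(entries)}
```

The first two entries land in one stream because their labels are identical once key order is ignored; the third forms its own stream.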
LogQL Query Language
Loki uses LogQL, a PromQL‑like syntax that is simple and familiar to the community. Example:
{file="debug.log"} |= "err"
This works like find + grep: the selector chooses streams, and the filter matches log lines.
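The find + grep analogy can be sketched in a few lines of Python. This is not how Loki evaluates LogQL; it only mimics the two stages for an equality selector and a substring filter. The helper names `select_streams` and `filter_lines` and the sample data are hypothetical.

```python
def select_streams(streams, matchers):
    """The `find` step: keep streams whose labels equal every matcher."""
    return [
        s for s in streams
        if all(s["stream"].get(name) == value for name, value in matchers.items())
    ]


def filter_lines(streams, needle):
    """The `grep` step: keep lines containing `needle`, like the `|=` filter."""
    return [line for s in streams for _, line in s["values"] if needle in line]


# Hypothetical in-memory streams in the data-model shape.
streams = [
    {"stream": {"file": "debug.log"}, "values": [["1", "error: disk full"], ["2", "ok"]]},
    {"stream": {"file": "access.log"}, "values": [["3", "err 500"]]},
]
hits = filter_lines(select_streams(streams, {"file": "debug.log"}), "err")
```

Only lines from the selected stream are scanned, so the `err 500` line in `access.log` is never inspected.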
Grafana Integration
Grafana includes a native Loki plugin, allowing side‑by‑side exploration of metrics and logs in a single UI, eliminating the need to switch between systems.
Low Storage Cost Design
Only label metadata is indexed; the raw log content is compressed and written, without any content index, to a chunk store: object storage such as S3 or GCS, or a database such as Cassandra or BigTable. Compared with full‑text indexing, this can reduce storage by roughly an order of magnitude.
Overall Architecture
Loki separates its read path from its write path and is composed of several modules:
Clients (Promtail, Fluent‑bit, Fluentd, Rsyslog) collect logs and forward them.
Distributor: entry point for writes; validates, hashes, and forwards logs to the appropriate Ingester, ensuring that logs of the same stream go to the same Ingester.
Ingester: buffers logs in memory, writes them to the underlying storage, and serves queries for recent data that has not yet been flushed.
Querier: reads data from storage, applies LogQL filters, and returns results.
QueryFrontend: splits large queries into sub‑queries, schedules them across multiple Querier instances, and aggregates the responses.
Storage adapters for S3, Cassandra, BigTable, DynamoDB, etc.
Distributor Details
Distributor uses consistent hashing combined with a replication factor to decide which Ingester(s) receive a log. Each Ingester registers a set of random 32‑bit tokens in a hash ring; the hash of a log’s labels and tenant ID selects a position on the ring, and the Ingesters owning the tokens at and after that position receive the log.
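The ring lookup can be sketched as follows. This is a simplified illustration, not Loki's implementation: SHA-256 stands in for whatever hash Loki uses, and the `Ring` class, the `tokens_per_ingester` parameter, and the ingester names are assumptions made for the example.

```python
import bisect
import hashlib
import random


def token_for(tenant_id, labels):
    """Hash tenant ID plus sorted labels down to a 32-bit token."""
    key = tenant_id + "|" + ",".join(f"{k}={v}" for k, v in sorted(labels.items()))
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")


class Ring:
    def __init__(self, ingesters, tokens_per_ingester=64, seed=0):
        rng = random.Random(seed)
        # Each ingester registers several random 32-bit tokens on the ring.
        self.tokens = sorted(
            (rng.getrandbits(32), name)
            for name in ingesters
            for _ in range(tokens_per_ingester)
        )
        self.keys = [token for token, _ in self.tokens]

    def lookup(self, token, replication_factor=3):
        """Walk clockwise from the token, collecting distinct ingesters."""
        owners = []
        start = bisect.bisect_left(self.keys, token)
        for i in range(len(self.tokens)):
            _, name = self.tokens[(start + i) % len(self.tokens)]
            if name not in owners:
                owners.append(name)
                if len(owners) == replication_factor:
                    break
        return owners


ring = Ring(["ingester-1", "ingester-2", "ingester-3", "ingester-4"])
owners = ring.lookup(token_for("tenant-a", {"app": "api", "host": "h1"}))
```

Because the token depends only on the tenant and the label set, every entry of a given stream resolves to the same owners for as long as ring membership stays unchanged.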
Ingester Details
Ingester validates timestamps (must be monotonically increasing per stream) and stores logs in a hierarchical in‑memory structure: Instances → Streams → Chunks → Blocks → Entries. Chunks are appended sequentially; when a chunk’s size or age exceeds thresholds, a new chunk is created.
Chunk lifecycle states:
Writing
Waiting flush
Retain
Destroy
Chunks are eventually flushed to object storage as compressed blocks (gzip, snappy, lz4).
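The append, cut, and flush behaviour described above might look roughly like this in Python. It is a sketch under stated assumptions: the class name `Stream`, the threshold values, measuring chunk age from entry timestamps rather than wall-clock time, accepting equal timestamps, and using gzip for every chunk are all choices made for the example. The Retain and Destroy states are omitted.

```python
import gzip


class Stream:
    def __init__(self, max_chunk_bytes=64, max_chunk_age_ns=10_000):
        self.max_chunk_bytes = max_chunk_bytes
        self.max_chunk_age_ns = max_chunk_age_ns
        self.chunks = [[]]      # the last chunk is the one in the Writing state
        self.last_ts = None

    def append(self, ts_ns, line):
        # Reject entries older than the last accepted one (assumption: equal is allowed).
        if self.last_ts is not None and ts_ns < self.last_ts:
            raise ValueError("out-of-order entry rejected")
        head = self.chunks[-1]
        if head:
            size = sum(len(text) for _, text in head)
            age = ts_ns - head[0][0]
            # Cut a new chunk when the size or age threshold would be exceeded.
            if size + len(line) > self.max_chunk_bytes or age > self.max_chunk_age_ns:
                head = []
                self.chunks.append(head)
        head.append((ts_ns, line))
        self.last_ts = ts_ns

    def flush(self):
        """Compress every closed chunk (Waiting flush) and return the blobs."""
        closed, self.chunks = self.chunks[:-1], self.chunks[-1:]
        return [
            gzip.compress("\n".join(text for _, text in chunk).encode())
            for chunk in closed
        ]
```

Appending is always sequential, so a chunk never needs to be rewritten once a newer chunk has been cut behind it.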
Storage Adapter
The adapter abstracts read/write operations for various back‑ends, presenting a uniform interface to the rest of the system.
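As an illustration of the adapter idea, the sketch below defines a hypothetical `ChunkStore` interface with an in-memory implementation. The method names `put_chunk` and `get_chunk` are assumptions for this example and do not mirror Loki's actual storage client API; a real adapter would wrap S3, Cassandra, and so on behind the same two calls.

```python
from abc import ABC, abstractmethod


class ChunkStore(ABC):
    """Hypothetical uniform interface the rest of the system codes against."""

    @abstractmethod
    def put_chunk(self, chunk_id: str, data: bytes) -> None: ...

    @abstractmethod
    def get_chunk(self, chunk_id: str) -> bytes: ...


class InMemoryChunkStore(ChunkStore):
    """Stand-in back-end used here instead of a real object store."""

    def __init__(self):
        self._blobs = {}

    def put_chunk(self, chunk_id, data):
        self._blobs[chunk_id] = bytes(data)

    def get_chunk(self, chunk_id):
        return self._blobs[chunk_id]


def round_trip(store: ChunkStore, chunk_id: str, data: bytes) -> bytes:
    # Callers depend only on the interface, never on the concrete back-end.
    store.put_chunk(chunk_id, data)
    return store.get_chunk(chunk_id)
```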
Indexing
Loki indexes only label data, mapping label → log stream → chunk. The index is stored in tables with a composite hash key (tenant ID + bucket ID + label name) and a range key for the label value hash. Three index types support different query patterns (by tenant, by tenant + label, by stream).
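A toy version of the label → stream → chunk mapping is sketched below. It keeps only two of the lookup tables, uses a one-day bucket, and truncates SHA-256 for hashing; those choices, along with the `LabelIndex` class and its method names, are assumptions for illustration rather than Loki's real schema.

```python
import hashlib
from collections import defaultdict

BUCKET_NS = 24 * 3600 * 10**9  # assumed: one index bucket per day


def short_hash(text):
    return hashlib.sha256(text.encode()).hexdigest()[:8]


class LabelIndex:
    def __init__(self):
        # hash key "tenant:bucket:label_name" -> range key hash(value) -> stream IDs
        self.label_rows = defaultdict(lambda: defaultdict(set))
        # (tenant, bucket, stream ID) -> chunk IDs
        self.stream_chunks = defaultdict(set)

    def add_chunk(self, tenant, labels, ts_ns, chunk_id):
        bucket = ts_ns // BUCKET_NS
        stream_id = short_hash(",".join(f"{k}={v}" for k, v in sorted(labels.items())))
        for name, value in labels.items():
            self.label_rows[f"{tenant}:{bucket}:{name}"][short_hash(value)].add(stream_id)
        self.stream_chunks[(tenant, bucket, stream_id)].add(chunk_id)
        return stream_id

    def lookup(self, tenant, ts_ns, matchers):
        """Resolve equality matchers to chunk IDs: label -> streams -> chunks."""
        bucket = ts_ns // BUCKET_NS
        sets = [
            self.label_rows[f"{tenant}:{bucket}:{name}"][short_hash(value)]
            for name, value in matchers.items()
        ]
        stream_ids = set.intersection(*sets) if sets else set()
        return sorted(
            chunk
            for sid in stream_ids
            for chunk in self.stream_chunks[(tenant, bucket, sid)]
        )
```

Log content never appears in the index; only label names, value hashes, stream IDs, and chunk IDs do, which is what keeps the index small.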
Query Processing
LogQL queries are parsed, then the system retrieves the list of matching log‑stream IDs using the label index. It then fetches the corresponding chunk IDs, creates iterators (batch, stream, chunk, block, block‑bytes) to read log lines in order, and finally applies any content filters.
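The final stage, reading several chunks in timestamp order and then filtering, can be approximated with a merge of sorted iterators. The sketch collapses the batch, stream, chunk, and block iterator layers into a single `heapq.merge`; the `query` function and the sample chunks are hypothetical.

```python
import heapq


def query(chunks, needle):
    """Merge time-sorted chunks lazily, then apply the content filter."""
    iterators = [iter(chunk) for chunk in chunks]
    # heapq.merge assumes each input is already sorted by timestamp.
    for ts, line in heapq.merge(*iterators, key=lambda entry: entry[0]):
        if needle in line:
            yield ts, line


# Hypothetical chunks, each a time-ordered list of (timestamp, line).
chunk_a = [(1, "a err"), (4, "ok")]
chunk_b = [(2, "err b"), (3, "fine")]
matches = list(query([chunk_a, chunk_b], "err"))
```

Since the merge is lazy, lines are filtered as they are read instead of after everything has been loaded.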
For large time ranges, QueryFrontend splits the request into fixed‑size sub‑queries (e.g., 15‑minute windows), feeds them into a buffered queue, and runs them concurrently across multiple Querier instances.
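Splitting a time range into fixed windows is simple arithmetic. The sketch below uses the 15-minute window from the example above; the function name `split_range` and the use of seconds as the unit are assumptions, and it does not align windows to clock boundaries.

```python
def split_range(start_s, end_s, step_s=15 * 60):
    """Split [start_s, end_s) into consecutive windows of at most step_s seconds."""
    windows = []
    current = start_s
    while current < end_s:
        upper = min(current + step_s, end_s)
        windows.append((current, upper))
        current = upper
    return windows


# A one-hour query becomes four 15-minute sub-queries.
one_hour = split_range(0, 3600)
```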
Concurrency Model
Each Querier maintains a gRPC bidirectional stream with QueryFrontend, pulling sub‑queries, executing them, and returning results. The Runner component controls the number of concurrent sub‑queries to avoid overwhelming the system.
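The pull model can be imitated in-process with a queue and a fixed number of worker threads. This is only an analogy: a `queue.Queue` replaces the gRPC stream, threads replace Querier processes, and the `workers` parameter plays the role of the concurrency limit. All names here are hypothetical.

```python
import queue
import threading


def run_subqueries(subqueries, execute, workers=4):
    """Let a bounded pool of workers pull sub-queries until the queue is empty."""
    pending = queue.Queue()
    for index, subquery in enumerate(subqueries):
        pending.put((index, subquery))
    results = [None] * len(subqueries)

    def worker():
        while True:
            try:
                index, subquery = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to pull
            # Each slot is written by exactly one worker, so no lock is needed.
            results[index] = execute(subquery)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Example: "execute" each window by returning its length in seconds.
durations = run_subqueries([(0, 900), (900, 1800), (1800, 2000)], lambda w: w[1] - w[0], workers=2)
```

Results are stored by index, so the aggregated response keeps the original sub-query order even though execution order is not deterministic.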
Conclusion
Loki’s design—sequential write‑append, label‑only indexing, compressed object‑storage chunks, and a modular read‑write pipeline—delivers a cost‑effective, Prometheus‑compatible log solution that simplifies observability workflows.
JD Cloud Developers
JD Cloud Developers (Developer of JD Technology) is a JD Technology Group platform offering technical sharing and communication for AI, cloud computing, IoT and related developers. It publishes JD product technical information, industry content, and tech event news. Embrace technology and partner with developers to envision the future.
