
Mastering Apache Druid: Architecture, Real-Time Ingestion, and Query Optimization

Apache Druid is a distributed, column‑store OLAP engine designed for massive real‑time data ingestion and sub‑second queries; this article explains its LSM‑tree‑inspired architecture, DataSource and Segment structures, memory‑based querying, practical deployment steps, common pitfalls, and optimization techniques for high‑throughput analytics.

Xingsheng Youxuan Technology Community

1. Overview

Apache Druid originated at an advertising data analytics company. It is a distributed database system built for massive real‑time analytics: large‑scale, low‑latency data ingestion combined with fast OLAP queries, in scenarios such as click‑stream analysis, server performance monitoring, application metrics, digital marketing, and real‑time transaction analysis. Druid is time‑series oriented, offers low‑latency ingestion with pre‑aggregation by time granularity, and can query both historical and real‑time data with sub‑second latency.

2. Druid Architecture Analysis

To achieve real‑time data processing and ad‑hoc queries at scale, Druid balances efficient ingestion and fast query performance through a unique architecture based on DataSource and Segment structures and several specialized nodes.

2.1 Architecture Overview

Druid stores data in columnar format and follows a Lambda architecture that separates real‑time and batch processing. The main nodes are:

Realtime Node: ingests real‑time data and creates Segment files.

Historical Node: loads generated Segment files and serves queries over them.

Broker Node: receives external queries, merges partial results from Realtime and Historical nodes, and returns them.

Coordinator Node: balances Segment distribution among Historical nodes and manages the data lifecycle.

Metastore: stores metadata such as Segment information, typically in MySQL or PostgreSQL.

Coordination Service: provides consistency coordination, usually ZooKeeper.

DeepStorage: persistent storage for Segment files (local disk for a single node, HDFS for clusters).

Data flows from realtime ingestion to DeepStorage, then Historical nodes download segments and serve queries.

2.2 LSM‑tree‑like Index Structure

Druid adopts the Log‑Structured Merge‑Tree (LSM‑tree) idea: incoming rows are first recorded in a commit log (WAL) and buffered in an in‑memory structure (in Druid, the in‑memory incremental index, analogous to an LSM memtable). When size or time thresholds are met, the buffer is flushed to disk as an immutable file (the LSM SSTable, which in Druid ultimately becomes part of a Segment). This design favors high‑speed writes while bitmap indexes keep reads fast, though read performance can suffer if a query must scan many small persisted files.

2.3 DataSource and Segment

A DataSource is analogous to a relational table. It contains a timestamp column, dimension columns (string type for filtering), and metric columns (numeric for aggregation). Data can be rolled up by dimension values or by time granularity, reducing storage and speeding up queries. Segments are the physical storage units; they are partitioned by time (segmentGranularity) and stored column‑wise with bitmap indexes for rapid filtering.
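As a hedged illustration of these pieces, a dataSchema for such a DataSource might look like the following sketch (the dataSource name, columns, and granularities are hypothetical, and the exact spec layout varies by Druid version):

```json
{
  "dataSchema": {
    "dataSource": "orders",
    "timestampSpec": {"column": "order_time", "format": "auto"},
    "dimensionsSpec": {"dimensions": ["city", "pay_status"]},
    "metricsSpec": [
      {"type": "count", "name": "row_count"},
      {"type": "doubleSum", "name": "amount_sum", "fieldName": "amount"}
    ],
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "DAY",
      "queryGranularity": "HOUR",
      "rollup": true
    }
  }
}
```

With rollup enabled, rows sharing the same hour‑truncated timestamp and dimension values collapse into one stored row, which is what shrinks storage and speeds up queries.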

2.4 In‑Memory Querying

Historical nodes load required Segments into memory before serving queries. Cache mechanisms (external like Memcached or local node memory) further accelerate query response.
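To make the query path concrete, a minimal native timeseries query can be POSTed as JSON to the Broker's /druid/v2 endpoint; this sketch assumes the hypothetical "orders" DataSource and metric names used above:

```json
{
  "queryType": "timeseries",
  "dataSource": "orders",
  "intervals": ["2024-01-01/2024-01-02"],
  "granularity": "hour",
  "aggregations": [
    {"type": "longSum", "name": "orders", "fieldName": "row_count"}
  ]
}
```

The Broker fans this out to the Historical and Realtime nodes holding Segments for the interval, then merges their partial results.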

3. Druid Application Practice

In its early stages, Xingsheng Youxuan used Druid for OLAP over real‑time order data. The pipeline was MySQL binlog → Canal → Kafka → Druid (via the Kafka indexing service). The following curl command submits a JSON supervisor spec to start a streaming ingestion task:

curl -XPOST -H 'Content-Type:application/json' -d @stream_data.json http://localhost:8090/druid/indexer/v1/supervisor
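The contents of stream_data.json are not shown in the original; a Kafka supervisor spec of this kind typically pairs a dataSchema (as sketched earlier) with an ioConfig roughly like the following (topic name, broker address, and task settings are placeholders, not the company's actual values):

```json
{
  "type": "kafka",
  "ioConfig": {
    "topic": "order_binlog",
    "consumerProperties": {"bootstrap.servers": "kafka01:9092"},
    "taskCount": 1,
    "replicas": 1,
    "taskDuration": "PT1H"
  }
}
```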

Key operational tips and solutions include:

Adjust targetPartitionSize and partitionsSpec to avoid generating too many small Segments (ideal size 300 MB–700 MB).

Run compact tasks during low‑traffic periods to merge Segments.
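A compaction task of this kind can be sketched as follows (dataSource, interval, and target row count are illustrative, and the exact task layout differs across Druid versions):

```json
{
  "type": "compact",
  "dataSource": "orders",
  "interval": "2024-01-01/2024-01-02",
  "tuningConfig": {
    "type": "index_parallel",
    "partitionsSpec": {"type": "dynamic", "maxRowsPerSegment": 5000000}
  }
}
```

Submitted to the Overlord's task endpoint, this rewrites the interval's many small Segments into fewer, larger ones.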

For higher‑precision distinct counts, use the thetaSketch aggregator; note that sketches are still approximate, so truly exact distinct counts require a nested group‑by count.

Round floating‑point results from approximate calculations using a JavaScript post‑aggregator.
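As a hedged sketch of that rounding trick, here a javascript post‑aggregator rounds a computed average (all dataSource and field names are hypothetical; the same post‑aggregator shape applies to other floating‑point results):

```json
{
  "queryType": "timeseries",
  "dataSource": "orders",
  "intervals": ["2024-01-01/2024-01-02"],
  "granularity": "day",
  "aggregations": [
    {"type": "doubleSum", "name": "amount_sum", "fieldName": "amount"},
    {"type": "count", "name": "rows"}
  ],
  "postAggregations": [
    {
      "type": "javascript",
      "name": "avg_amount_rounded",
      "fieldNames": ["amount_sum", "rows"],
      "function": "function(sum, rows) { return Math.round(sum / rows); }"
    }
  ]
}
```

JavaScript features must be enabled in the cluster configuration (druid.javascript.enabled) for this post‑aggregator type to work.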

Store boolean dimensions as strings to avoid NULL values.

Configure QueryGranularity and SegmentGranularity appropriately; typically SegmentGranularity ≥ QueryGranularity.

Set intermediatePersistPeriod and maxRowsInMemory to balance memory usage and persistence frequency.
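Those last two knobs live in the supervisor's tuningConfig; a minimal sketch (the values shown are illustrative, not recommendations):

```json
{
  "tuningConfig": {
    "type": "kafka",
    "maxRowsInMemory": 100000,
    "intermediatePersistPeriod": "PT10M"
  }
}
```

Larger maxRowsInMemory means fewer intermediate persists but more heap pressure; a shorter intermediatePersistPeriod bounds memory at the cost of more frequent disk writes.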

4. Summary

Druid’s architecture, inspired by LSM‑tree and read‑write separation, combined with DataSource and Segment structures, delivers high‑throughput real‑time ingestion and sub‑second OLAP queries for time‑series workloads. However, operating a full Druid cluster requires deep knowledge of its components (Broker, Historical, MiddleManager, Coordinator), careful JVM and thread‑pool tuning, and management of external dependencies such as DeepStorage, ZooKeeper, and metadata stores, leading to non‑trivial operational overhead.

Tags: real-time analytics, distributed database, OLAP, Apache Druid, data ingestion
Written by Xingsheng Youxuan Technology Community (Xingsheng Youxuan Technology Official Account)