Understanding Druid Metadata Management and Architecture
Apache Druid manages metadata through a layered, distributed system: the Overlord coordinates ingestion tasks, MiddleManagers launch Peons that create segments, the Coordinator manages segment distribution while Historical nodes store and serve segment data, and Brokers route queries. MySQL, Zookeeper, process memory, and local files keep this metadata synchronized for fault-tolerant, high-performance OLAP analytics.
Druid is a high‑performance data store designed for slice‑and‑dice and OLAP analysis on large data sets. It supports both offline and real‑time queries, making it a common choice for GUI analytics, business monitoring, and real‑time data warehouses.
The system features a multi‑process, distributed architecture where each component type can be configured and scaled independently, providing maximum flexibility for the cluster.
Metadata Concepts
Segment : The basic unit of data management in Druid. A datasource contains multiple segments, each representing data for a specific time interval. The segment payload (JSON) records the segment's dimensions, metrics, and deep-storage location.
Key segment attributes include:
Interval – start and end time of the data.
DataSource – the datasource to which the segment belongs.
Version – higher version segments supersede lower ones for the same interval.
Payload – dimension, metric, and deep‑storage information.
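To make these attributes concrete, here is an abbreviated sketch of a segment metadata record; the datasource name, bucket, and paths are hypothetical:

```json
{
  "dataSource": "wikipedia",
  "interval": "2024-01-01T00:00:00.000Z/2024-01-02T00:00:00.000Z",
  "version": "2024-01-02T10:00:00.000Z",
  "loadSpec": {
    "type": "s3_zip",
    "bucket": "druid-deep-storage",
    "key": "wikipedia/2024-01-01T00:00:00.000Z_2024-01-02T00:00:00.000Z/2024-01-02T10:00:00.000Z/0/index.zip"
  },
  "dimensions": "page,language,user",
  "metrics": "count,added,deleted",
  "size": 123456789
}
```

If two segments cover the same interval, the one with the lexicographically higher `version` wins, which is how Druid implements atomic replacement of data.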
Datasource : Analogous to a table in a relational database. Its schema is dynamically derived from the union of all available segments. The schema is not stored in the metadata tables but is assembled at query time from segment metadata.
Rule : Defines segment retention policies (Load or Drop). Rules can be Forever, Interval, or Period based, and are evaluated in order. Example rule configuration:
```json
[
  {
    "period": "P7D",
    "includeFuture": true,
    "tieredReplicants": {
      "_default_tier": 1,
      "vStream": 1
    },
    "type": "loadByPeriod"
  },
  {
    "type": "dropForever"
  }
]
```

Task : Handles data ingestion (real‑time from Kafka or batch via Hadoop). A task generates one or more segments and includes dataSchema, tuningConfig, ioConfig, context, and datasource definitions.
Supervisor : Manages real‑time tasks. Each streaming datasource has exactly one supervisor, which the Overlord process creates and uses to spawn and monitor its ingestion tasks.
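A hedged sketch of what a Kafka supervisor spec can look like, showing the dataSchema / ioConfig / tuningConfig structure a task inherits; the datasource name, topic, and broker address are placeholders:

```json
{
  "type": "kafka",
  "spec": {
    "dataSchema": {
      "dataSource": "vStream",
      "timestampSpec": { "column": "timestamp", "format": "iso" },
      "dimensionsSpec": { "dimensions": ["userId", "eventType"] },
      "metricsSpec": [ { "type": "count", "name": "count" } ],
      "granularitySpec": { "segmentGranularity": "HOUR", "queryGranularity": "MINUTE" }
    },
    "ioConfig": {
      "topic": "vstream-events",
      "consumerProperties": { "bootstrap.servers": "kafka-broker:9092" },
      "taskCount": 1
    },
    "tuningConfig": { "type": "kafka", "maxRowsPerSegment": 5000000 }
  }
}
```

Submitting a spec like this to the Overlord creates the supervisor, which then launches the actual ingestion tasks.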
Overall Architecture
The cluster can be visualized as a company:
Leadership layer (Overlord) receives ingestion requests and dispatches tasks.
Production layer (MiddleManager → Peon) creates segments.
Warehouse layer (Coordinator, Historical) stores and serves segments.
Sales layer (Broker) routes queries to the appropriate nodes.
Key components and their roles:
Overlord – task coordination, assignment, and status tracking.
MiddleManager – launches Peon processes for real‑time ingestion.
Coordinator – manages segment metadata, loading, and dropping.
Historical – loads and serves historical segments.
Broker – query entry point, builds timelines from segment metadata.
Metadata Storage Media
Druid stores metadata in multiple media to balance durability and performance:
MySQL (metadata database) – persistent storage of segment, task, and supervisor records.
Zookeeper – real‑time metadata transport, service discovery, and leader election.
Memory – caches metadata for fast access, synchronized from MySQL and Zookeeper.
Local files – used by Historical nodes for quick recovery after restart.
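For reference, the MySQL metadata store and Zookeeper connection are typically wired up in `common.runtime.properties`; the hostnames and credentials below are placeholders:

```properties
# Metadata storage (MySQL) - requires the mysql-metadata-storage extension
druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://mysql-host:3306/druid
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=changeme

# Zookeeper, used for service discovery and leader election
druid.zk.service.host=zk-host:2181
```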
Synchronization mechanisms:
Overlord syncs real‑time task info from Zookeeper and periodically pulls active tasks from MySQL.
Coordinator periodically loads used segments from MySQL and watches Zookeeper for load‑queue updates.
Historical watches Zookeeper load‑queue, loads segments from deep storage, and registers them in Zookeeper.
MiddleManager watches Zookeeper task nodes, launches Peon processes, and reports segment info.
Broker watches Zookeeper announcements and segment nodes to build query timelines.
Business Logic
Data ingestion workflow:
1. The Overlord receives a task and persists it to MySQL.
2. The Overlord assigns the task via Zookeeper.
3. A MiddleManager picks up the task and launches a Peon.
4. The Peon builds segments and writes their metadata to MySQL.
5. The Coordinator pulls the new segment metadata from MySQL and writes load instructions to Zookeeper.
6. A Historical sees the load instruction, loads the segment from deep storage, and registers it in Zookeeper for query serving.
Query workflow: Broker receives a client query, uses datasource and interval to locate relevant segments via timelines, fetches real‑time data from Peon and historical data from Historical, merges results, and returns them to the client.
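To make the query path concrete, here is a minimal native query a client might POST to the Broker's query endpoint; the datasource, interval, and metric names are hypothetical:

```json
{
  "queryType": "timeseries",
  "dataSource": "vStream",
  "intervals": ["2024-01-01/2024-01-02"],
  "granularity": "hour",
  "aggregations": [ { "type": "longSum", "name": "events", "fieldName": "count" } ]
}
```

The Broker uses `dataSource` and `intervals` to select the relevant segments from its timeline, fans the query out to the owning Peons and Historicals, and merges the partial results.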
In summary, Druid's multi‑process, distributed design, combined with diverse metadata storage (memory, MySQL, Zookeeper, local files), provides high flexibility, fault tolerance, and efficient data ingestion and query processing.
vivo Internet Technology
Sharing practical vivo Internet technology insights and salon events, plus the latest industry news and hot conferences.