
Engineering Practices and Performance Optimizations of Apache Druid for Real‑Time OLAP at Shopee

Shopee’s engineering team scaled a 100‑node Apache Druid cluster for real‑time OLAP by redesigning the Coordinator load‑balancing algorithm, adding incremental metadata pulls, introducing a segment‑merged result cache, and building exact‑count and flexible sliding‑window operators, while planning cloud‑native deployment.

Shopee Tech Team

Apache Druid is a high‑performance, open‑source, real‑time analytics database designed for low‑latency, interactive queries over large, typically time‑series, datasets. This article shares the engineering practices and performance‑tuning techniques applied to support Shopee’s core real‑time OLAP analytics workloads.

1. Druid Cluster in Shopee

A single large‑scale cluster of over 100 physical nodes serves as the downstream store for various core business data pipelines. Batch and streaming jobs write segments into Druid, and downstream services perform real‑time OLAP queries.

2. Technical Optimizations

2.1 Coordinator load‑balancing algorithm

Problem: As the number of dataSources grew, the number of segments increased to millions, causing the Coordinator’s metadata management to become a bottleneck. A serial sub‑task that balances segment loading on historical nodes took more than 10 minutes, leading to segment hand‑off timeouts and query jitter.

Analysis: The reservoir‑sampling implementation sampled only one element per call from a 5‑million‑segment list, requiring 2 000 passes per balancing cycle.

Solution: Implement a batch‑sampling reservoir algorithm that traverses the segment list once and selects 2 000 elements, reducing the sub‑task execution time to ~300 ms.
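The single‑pass, k‑element variant can be sketched as follows (a minimal Python illustration of the idea, not the Coordinator's actual Java code):

```python
import random

def reservoir_sample(items, k):
    """Select k items uniformly at random in a single pass (Algorithm R).

    This replaces k independent one-element samples (k full traversals
    of the segment list) with a single traversal, which is the essence
    of the Coordinator balancing fix described above.
    """
    reservoir = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Every later item survives with probability k / (i + 1).
            j = random.randrange(i + 1)
            if j < k:
                reservoir[j] = item
    return reservoir

# e.g. pick 2 000 candidate segments out of 5 million in one pass:
# candidates = reservoir_sample(all_segments, 2000)
```

Because the list is now traversed once instead of 2 000 times, the cost drops from O(k·n) to O(n), matching the reported reduction from minutes to hundreds of milliseconds.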

Benchmark result – The batch‑sampling algorithm outperforms the original approach (see image in the original article).

2.2 Incremental metadata management

Problem: The Coordinator periodically pulls the entire segment table (≈5 million rows) from MySQL, which becomes slow and resource‑intensive as the cluster grows.

Solution: Introduce an incremental pull that only fetches newly added segments within a recent time window, merging them into the existing in‑memory snapshot. Full pulls are still performed at a longer interval for consistency.

Full‑pull SQL:

SELECT payload FROM druid_segments WHERE used=true;

Incremental‑pull SQL:

-- To ensure efficient execution, an index is created on the new filter condition
SELECT payload FROM druid_segments WHERE used=true AND created_date > :created_date;

Configuration:

# Pull metadata added in the last 5 minutes
druid.manager.segments.pollLatestPeriod=PT5M
# Full pull every 15 minutes
druid.manager.segments.fullyPollDuration=PT15M

Result: Metadata pull and deserialization latency dropped dramatically, reducing pressure on the metadata DB and improving data visibility for users.
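The polling scheme can be sketched in Python (class and method names are illustrative, not Druid's): cheap incremental pulls merge newly created rows into the in‑memory snapshot, while a periodic full pull rebuilds it for consistency.

```python
from datetime import datetime

class SegmentPoller:
    """Illustrative model of the incremental metadata pull.

    `table` stands in for the druid_segments table: a list of
    (segment_id, payload, created_date) rows.
    """

    def __init__(self, table):
        self.table = table
        self.snapshot = {}               # segment_id -> payload
        self.watermark = datetime.min    # highest created_date seen

    def full_poll(self):
        # SELECT payload FROM druid_segments WHERE used=true
        self.snapshot = {sid: payload for sid, payload, _ in self.table}
        if self.table:
            self.watermark = max(cd for _, _, cd in self.table)

    def incremental_poll(self):
        # ... AND created_date > :created_date (the indexed filter column)
        delta = [(sid, payload, cd) for sid, payload, cd in self.table
                 if cd > self.watermark]
        for sid, payload, cd in delta:
            self.snapshot[sid] = payload          # merge into snapshot
            self.watermark = max(self.watermark, cd)
        return len(delta)
```

The full poll is still needed at the longer interval because the delta query only sees insertions; rows that were deleted or flipped to `used=false` are reconciled by the full rebuild.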

2.3 Broker result‑cache optimization

Problem: Existing cache mechanisms (segment‑level intermediate cache and result cache) have clear limitations. The result cache cannot be used with the group‑by v2 engine, and segment‑level cache adds serialization overhead on the Broker.

Solution: Implement a new "segment merged result cache" on the Broker that stores the merged intermediate results of historical segments. This cache works for all four usage scenarios (group‑by v2, historical only, hybrid, large‑scale segment results).

Configuration:

druid.broker.cache.useSegmentMergedResultCache=true
druid.broker.cache.populateSegmentMergedResultCache=true

Benchmark shows roughly 50 % reduction in query latency after enabling the new cache.
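The idea can be modeled in a few lines of Python (an illustrative sketch, not the actual Broker code): the cache key covers both the query and the set of historical segments it touched, and the value is the already merged intermediate result, so a repeat query skips both the per‑segment fetch and the per‑segment merge.

```python
import hashlib
import json

class SegmentMergedResultCache:
    """Toy model of a Broker-side cache of merged historical results.

    The key must include the segment set: if the set of historical
    segments backing an interval changes (compaction, hand-off), the
    old entry no longer matches and is simply not hit.
    """

    def __init__(self):
        self.store = {}

    def _key(self, query, segment_ids):
        blob = json.dumps({"q": query, "segs": sorted(segment_ids)},
                          sort_keys=True).encode()
        return hashlib.sha256(blob).hexdigest()

    def get(self, query, segment_ids):
        # Hit: the merged historical result can be combined directly
        # with real-time results, with no per-segment work.
        return self.store.get(self._key(query, segment_ids))

    def put(self, query, segment_ids, merged_result):
        self.store[self._key(query, segment_ids)] = merged_result
```

Caching the post‑merge result is what makes the approach compatible with group‑by v2 and avoids the per‑segment serialization overhead of the existing segment‑level cache.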

3. Custom Feature Development

3.1 Integer‑exact deduplication operator

Business need: precise order‑count and UV metrics, where the existing approximate deduplication operators introduce error. The team implemented a bitmap‑based exact‑count aggregator on top of RoaringBitmap, supporting 32‑bit and 64‑bit integer fields.

Aggregator JSON API (use Bitmap32ExactCountBuild against the raw integer column at ingestion, and Bitmap32ExactCountMerge against the pre‑aggregated metric at query time):

{
  "type": "Bitmap32ExactCountBuild",
  "name": "exactCountMetric",
  "fieldName": "userId"
}

SQL usage:

SELECT "dim", Bitmap32_EXACT_COUNT("exactCountMetric") FROM "ds_name" WHERE "__time" >= CURRENT_TIMESTAMP - INTERVAL '1' DAY GROUP BY "dim"

Limitations: High memory consumption for large data volumes; future work includes better compression and memory‑estimation techniques.
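The operator's build/merge/count semantics can be modeled in a few lines of Python (a plain integer serves as the bitset here; the real aggregator uses RoaringBitmap for compression):

```python
class Bitmap32ExactCount:
    """Toy model of the bitmap-based exact-count aggregator.

    Build sets one bit per observed integer ID, merge ORs two bitmaps,
    and the final result is the number of set bits -- an exact distinct
    count, unlike HLL-style approximations.
    """

    def __init__(self):
        self.bits = 0

    def add(self, value):
        # "Build" phase: ingest a raw integer ID.
        self.bits |= 1 << value

    def merge(self, other):
        # "Merge" phase: combine partial aggregates from segments.
        self.bits |= other.bits

    def count(self):
        # Exact cardinality = population count of the bitmap.
        return bin(self.bits).count("1")
```

The memory limitation noted above falls out of this model directly: the bitmap's footprint grows with the range and density of the IDs, which is why compression and memory estimation are listed as future work.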

3.2 Flexible sliding‑window functions

Problem: Druid only supports fixed‑size windows; many use‑cases require trailing aggregates (e.g., 7‑day UV).

Solution: Added a generic "default" averager that can apply a sliding window to any native aggregator, including hyperUnique (HLL) and other complex types.

Averager JSON:

{
  "aggregations": [
    {
      "type": "hyperUnique",
      "name": "deltaDayUniqueUsers",
      "fieldName": "uniq_user"
    }
  ],
  "averagers": [
    {
      "name": "trailing7DayUniqueUsers",
      "fieldName": "deltaDayUniqueUsers",
      "type": "default",
      "buckets": 7
    }
  ]
}

SQL usage:

SELECT TIME_FLOOR(__time, 'PT1H'), dim, MA_TRAILING_AGGREGATE_DEFAULT(DS_HLL(user), 7) FROM ds_name WHERE __time >= '2021-06-27T00:00:00.000Z' AND __time < '2021-06-28T00:00:00.000Z' GROUP BY 1, 2
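The combine step behind the "default" averager can be sketched generically (a Python illustration; sets stand in for HLL sketches, with union as their merge function and set size as their finalizer):

```python
from collections import deque

def trailing_aggregate(bucketed_values, window, merge, finalize):
    """Apply a trailing window of `window` buckets to any aggregator.

    `bucketed_values` holds one intermediate aggregate per time bucket
    (e.g. one HLL sketch per day). For each bucket, the last `window`
    intermediates are combined with the aggregator's own merge function
    and then finalized -- so any mergeable type works, not just sums.
    """
    out = []
    buf = deque(maxlen=window)  # keeps only the trailing `window` buckets
    for value in bucketed_values:
        buf.append(value)
        acc = None
        for item in buf:
            acc = item if acc is None else merge(acc, item)
        out.append(finalize(acc))
    return out

# Trailing 2-day unique users over 3 daily "sketches" (sets here):
# trailing_aggregate([{1, 2}, {2, 3}, {4}], 2, lambda a, b: a | b, len)
```

Delegating to the aggregator's merge function is what lets the same averager drive a 7‑day UV over hyperUnique sketches, where simply averaging per‑bucket finalized counts would double‑count returning users.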

Both the bitmap operator and the sliding‑window averager have been contributed to the Apache Druid community via pull requests.

4. Future Architecture Evolution

The team is exploring cloud‑native deployment patterns for Druid to further improve stability, cost efficiency, and scalability.

References – Links to Apache Druid documentation, related pull requests, and issue discussions are provided in the original article.
