
Practical Experience with Druid SQL and Security at Meituan: Challenges, Improvements, and Best Practices

This article presents Meituan's production experience with Apache Druid. It covers the platform's current usage; the usability, security, and stability challenges encountered; the principles and architecture of Druid SQL; improvements to schema inference, logging, and query safety; and the custom security extensions built for fine-grained access control and SSO integration.

DataFunTalk

Meituan has been using Apache Druid since 2016, operating two clusters with over 70 nodes, more than 500 tables, 100 TB of storage, and handling over 17 million queries per day, achieving sub‑second latency for the majority of workloads.

The deployment faced three main challenges: usability (the steep learning curve of JSON-based native queries), security (no authentication or authorization in early versions), and stability (handling schema changes and troubleshooting in a multi-tenant environment).

Druid SQL, introduced in version 0.10, provides a translation layer from standard SQL to native JSON queries, offering HTTP and JDBC interfaces, automatic query type selection, and support for common patterns such as approximate TopN, semi‑joins, and nested GroupBy.
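As a rough illustration of the HTTP interface, the sketch below builds (without sending) a POST request for the Broker's SQL endpoint, /druid/v2/sql/. The Broker address, the "wikipedia" datasource, and the "page" column are assumptions made for the example, not details from the talk. The sqlQueryId context key is only honored by Druid releases that support it.

```python
import json
import urllib.request
import uuid

# Assumption: a Broker listening at this hypothetical address.
BROKER_URL = "http://localhost:8082/druid/v2/sql/"


def build_sql_request(sql: str, broker_url: str = BROKER_URL) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST that carries a Druid SQL query."""
    payload = {
        "query": sql,
        # A caller-supplied id makes the request easier to find in request logs.
        "context": {"sqlQueryId": str(uuid.uuid4())},
    }
    return urllib.request.Request(
        broker_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical datasource "wikipedia" with a "page" dimension.
request = build_sql_request(
    "SELECT page, COUNT(*) AS edits FROM wikipedia "
    "WHERE __time >= TIMESTAMP '2019-01-01 00:00:00' "
    "GROUP BY page ORDER BY edits DESC LIMIT 10"
)

# To actually run the query against a live Broker:
#   rows = json.loads(urllib.request.urlopen(request).read())
```

A single-dimension GROUP BY ordered by an aggregate with a LIMIT, as in this query, is the kind of pattern the SQL layer can plan as an approximate TopN rather than a GroupBy.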

The SQL layer runs inside the Broker: Calcite handles parsing and logical optimization, and a Server module handles incoming requests. Meituan's improvements include optimizing schema inference by limiting the segment window, adding detailed request logging keyed by a unique sqlQueryId, and requiring time-range filters on queries to prevent full-table scans.
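The time-range requirement can be pictured as a pre-check that rejects SQL with no predicate on Druid's __time column. The sketch below is a deliberately naive, regex-based illustration. The source does not describe how Meituan implemented the check, and a production version would more plausibly inspect the parsed Calcite plan than the raw text. The names has_time_filter and require_time_filter are hypothetical.

```python
import re

# Matches a comparison on the __time column, e.g. "__time >= ..." or
# "__time BETWEEN ...". Double quotes around the column name are tolerated.
_TIME_PREDICATE = re.compile(
    r'"?__time"?\s*(>=|<=|>|<|=|BETWEEN\b)', re.IGNORECASE
)


def has_time_filter(sql: str) -> bool:
    """Return True if the text after the first WHERE compares __time to something.

    Naive by design: it does not parse SQL, so subqueries, comments, and
    string literals containing "__time" can fool it.
    """
    where = re.search(r"\bWHERE\b(.*)", sql, re.IGNORECASE | re.DOTALL)
    if where is None:
        return False
    return _TIME_PREDICATE.search(where.group(1)) is not None


def require_time_filter(sql: str) -> str:
    """Raise ValueError for queries that would scan every segment."""
    if not has_time_filter(sql):
        raise ValueError("query rejected: add a time-range filter on __time")
    return sql
```

Rejecting such queries before planning means an unbounded scan never reaches the Historical nodes, which is the point of the safeguard described above.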

Security enhancements in versions 0.11/0.12 added TLS, extensible authentication (Basic, Kerberos) and role‑based access control. Meituan extended this with DB‑level access control, automated DB‑to‑DataSource mapping, and a custom SSO filter to support both authenticated and unauthenticated access.
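The DB-level model can be sketched as two lookups: resolve a datasource to its DB, then check whether any of the user's roles grants the requested action on that DB. Everything below (the DB names, datasource names, the Role class, and is_allowed) is hypothetical and only illustrates the idea. It is not Meituan's implementation or Druid's authorizer API.

```python
from dataclasses import dataclass, field

# Hypothetical DB -> datasource mapping. The source says this mapping is
# automated; here it is hard-coded for illustration.
DB_TO_DATASOURCES = {
    "ads": {"ads_click", "ads_impression"},
    "delivery": {"delivery_orders"},
}


@dataclass
class Role:
    name: str
    # Maps a DB name to the set of actions ("READ", "WRITE") granted on it.
    grants: dict = field(default_factory=dict)


def db_of(datasource: str):
    """Return the DB that owns a datasource, or None if it is unmapped."""
    for db, sources in DB_TO_DATASOURCES.items():
        if datasource in sources:
            return db
    return None


def is_allowed(roles, datasource: str, action: str) -> bool:
    """Allow the action if any role grants it on the datasource's DB."""
    db = db_of(datasource)
    if db is None:
        # Unmapped datasources are denied by default in this sketch.
        return False
    return any(action in role.grants.get(db, set()) for role in roles)


analyst = Role("ads_analyst", {"ads": {"READ"}})
print(is_allowed([analyst], "ads_click", "READ"))        # True
print(is_allowed([analyst], "delivery_orders", "READ"))  # False
```

Granting at the DB level keeps the number of permission entries proportional to the number of teams rather than the number of tables, which matters with several hundred datasources.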

Deployment guidelines recommend using Druid 0.13+, enabling basic‑security with an allow‑all fallback, initializing the permission DB, and gradually tightening permissions while following a specific node upgrade order.

The conclusion emphasizes that Druid SQL is a lightweight translation layer with minimal performance impact, that security features are mature but require careful configuration, and that large‑scale schema inference and broker startup times remain areas to monitor.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
