How Suning Integrated ClickHouse into a Full‑Link Monitoring Platform for Real‑Time OLAP Insights
This article explains how Suning's big‑data team incorporated ClickHouse into their end‑to‑end monitoring ecosystem, detailing the architecture, trace‑ID propagation, slow‑query tracking, MergeTree health checks, replica delay analysis, and the role of Chproxy in delivering comprehensive observability for high‑performance OLAP workloads.
Introduction
ClickHouse is a high‑performance OLAP engine that excels in single‑table analysis and colocate joins, thanks to its columnar storage, LSM‑Tree engine, vectorized execution, asynchronous merge and mutation mechanisms, and MPP+SMP concurrency.
In Suning’s big‑data environment, ClickHouse is widely used for metrics and tag analysis, addressing high‑cardinality data, precise deduplication, interactive queries, wide‑table analysis, time‑series storage, and real‑time materialized views.
1. Suning Big Data Full‑Link Monitoring Platform
1.1 Platform Overview
The platform generates a unique serialId for each HTTP request, a traceId for each interaction with the metric management system, and an olapId for each OLAP call, forming a tree‑shaped request chain that enables complete traceability.
Report‑design backend creates a unique traceId and hierarchical spanId for each front‑end request, forwarding them to the metric system.
The metric system calculates a query‑weight priority, generates a new spanId, and forwards the IDs to the OLAP layer.
Spark RDD links queryId from Druid, PostgreSQL, and ClickHouse with Spark StageID and JobID.
OLAP receives the request, creates its own spanId, and propagates the IDs downstream.
The OLAP engine correlates traceId with execution plans, allowing timing analysis of each stage (e.g., linking bigQueryId to Druid, mapping PostgreSQL and ClickHouse via traceId).
1.2 Integrating ClickHouse into the Platform
ClickHouse monitoring covers nodes, shards, parent/child query relationships, per‑node latency, memory usage, CPU consumption, row counts, MergeTree status, query protocol (TCP/HTTP), and thread participation.
To propagate the traceId to ClickHouse’s query_id, a custom query_id is set and later mapped to the system‑generated ID, enabling end‑to‑end traceability.
ClickHouse-client --port 1***5 --time --format=Null --query="select count() from aggr_member" --query_id="suning20200706"
echo 'select count() from aggr***member' | curl 'http://localhost:8**3/?query_id=suning2020&query=' --data-binary @-2. ClickHouse Slow‑Query Monitoring
2.1 Real‑Time Slow‑Query Monitoring
The dashboard visualizes per‑query parent/child relationships, node‑level scripts, execution time, rows read, memory peaks, thread count, user, profile events, and settings.
Top‑N charts display slow queries, memory usage, CPU usage, and MergeTree latency, with alerting for thresholds.
2.2 Historical Slow‑Query Monitoring
Enable query_log by adding <log_queries>1</log_queries> to /etc/clickhouse-server/users.xml. Each query creates one or two records depending on success or error:
Success: event types 1 and 2.
Runtime error: event types 1 and 4.
Pre‑execution error: event type 3.
Records store parent/child query_id relationships, node scripts, start time, duration, rows, result size, memory usage, thread count, stack trace, and error details.
3. MergeTree Monitoring
Data are stored in Part files; new inserts create new parts, and a background async merge consolidates small parts. Each part generates an index file containing primary‑key values.
Monitoring focuses on abnormal MergeTree behavior and triggers alerts based on exception metrics.
4. Slow‑Query Root‑Cause Analysis
Check whether data reside in page cache (memory access is far faster than disk).
High‑cardinality aggregations, sorting, and JOIN order (small table on the right) can degrade performance; limit partition columns and pre‑sort data.
CPU >70% or memory pressure causes timeouts; disable virtual memory swapping. ClickHouse’s concurrency limits may need tightening.
For buyer‑analysis scenarios, hashing member IDs and applying RoaringBitmap across shards yields several‑fold speedups. Using ClickHouse CTE ( WITH) also achieves real‑time, efficient queries for new‑buyer calculations.
5. Cluster State Monitoring
Track query success, error, and failure counts per cluster/node, with threshold‑based alerts.
Monitor connection count, CPU usage, memory usage, file opens, partition usage, and maximum partition usage per node.
6. Replicas Delay Monitoring
Four replica‑selection algorithms are supported:
Random : choose the replica with the fewest errors.
In order : follow a configured order.
First or random : pick the first replica; if unavailable, select randomly.
Nearest hostname : every 5 minutes compute error counts and select the least‑error replica.
The maximum allowed replica delay is set via max_replica_delay_for_distributed_queries; the HTTP resource /replicas-delay returns 200 OK when delay is within limits.
7. Chproxy Monitoring
Chproxy is ClickHouse’s HTTP proxy and load balancer. Suning manages Chproxy via ClickHouse‑Manger (start/stop, rolling upgrades, monitoring) and synchronizes configuration through ZooKeeper.
Multiple Chproxy instances enable horizontal scaling; they parse SQL, perform intelligent load balancing, and expose metrics such as request queue size, remote client connections, total queries, cancelled/denied requests, cache hit rate, and health status, all observable in the full‑link monitoring platform.
8. Benefits and Outlook
The full‑link monitoring platform deepens observability across data‑application, SparkSQL parsing, OLAP routing, and acceleration layers, providing end‑to‑end latency visibility and proactive alerts that simplify troubleshooting and operational maintenance.
Future work includes integrating additional OLAP engines (Doris, Dremio, RocksDB) and further refining per‑engine execution‑stage timing to enhance the platform’s diagnostic capabilities.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
