
Monitoring TiDB with Zabbix: Using HTTP Agent, Preprocessing, and Triggers

This guide explains how to collect TiDB metrics through its HTTP monitoring API, preprocess the data into JSON, create master and dependent items in Zabbix, and configure triggers modeled on TiDB's Prometheus alert rules.

Aikesheng Open Source Community

Nov 19, 2021


To monitor TiDB with Zabbix, use an HTTP agent item to call TiDB's monitoring API and then preprocess the returned data. The required preprocessing steps (Prometheus pattern and Prometheus to JSON) were added in Zabbix 4.2; the examples here use Zabbix 5.0.5.

TiDB Monitoring API

Before starting, read the TiDB monitoring API documentation (https://docs.pingcap.com/zh/tidb/v5.1/tidb-monitoring-api).

Example request:

curl http://127.0.0.1:10080/metrics > /tmp/tidb_metrics

One of the alert rules from the TiDB docs is:

increase(tidb_session_schema_lease_error_total{type="outdated"}[15m]) > 0

The metric name tidb_session_schema_lease_error_total can be found in the exported metrics file; its format includes a HELP line, a TYPE line, and the metric value.
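To make the format concrete, the following Python sketch parses a small exposition sample into records with name, value, and labels fields, mirroring the fields used in the sample data in the appendix. The HELP text and the sample value are illustrative assumptions rather than output from a real TiDB instance, and the parser is deliberately simplified: it does not handle timestamps or escaped characters inside label values.

```python
import re

# Illustrative sample; the HELP text and the value 0 are assumptions.
SAMPLE = """\
# HELP tidb_session_schema_lease_error_total Counter of schema lease errors.
# TYPE tidb_session_schema_lease_error_total counter
tidb_session_schema_lease_error_total{type="outdated"} 0
"""

# metric_name{label="value",...} value
LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Parse simple Prometheus exposition lines into a list of dicts."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comment lines
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot handle
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        records.append(
            {"name": match.group("name"), "value": match.group("value"), "labels": labels}
        )
    return records


records = parse_metrics(SAMPLE)
```

Here records holds one entry for the outdated series; the HELP and TYPE lines are skipped because they carry metadata rather than a sample.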

Creating Items

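An item is a single monitored metric. First create a master item of type HTTP agent that calls the TiDB /metrics endpoint and retrieves all metrics as plain text.

Then create dependent items that each extract a single metric from the master item. Once the Prometheus text has been converted to JSON with the Prometheus to JSON preprocessing step, a JSONPath step selects the value. Example JSONPath expression: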

$[?(@.name=="tidb_session_schema_lease_error_total" && @.labels.type == "outdated")].value.first()

Because this metric is a Counter, add a “Change per second” preprocessing step to get its per-second growth rate; for Gauge metrics this step is unnecessary.
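The arithmetic behind a per-second change can be sketched in Python as follows. This is an illustration of the idea, (value - previous value) / (time - previous time), not Zabbix's implementation; the function name, the sample readings, and the choice to return None when the counter goes backwards (for example after a restart) are assumptions made for the example.

```python
def change_per_second(prev, curr):
    """Return the per-second growth between two (timestamp, value) samples.

    Returns None when no usable rate exists, e.g. the counter was reset
    or the timestamps are not increasing.
    """
    prev_time, prev_value = prev
    curr_time, curr_value = curr
    elapsed = curr_time - prev_time
    if elapsed <= 0 or curr_value < prev_value:
        return None
    return (curr_value - prev_value) / elapsed


# Hypothetical counter readings taken 60 seconds apart.
samples = [(0, 10), (60, 10), (120, 16)]
rates = [change_per_second(a, b) for a, b in zip(samples, samples[1:])]
```

With these readings the first interval shows no growth and the second shows 6 errors over 60 seconds, which is why a counter needs this step before a threshold on it is meaningful.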

Creating Triggers

A trigger defines the condition under which an item's value raises an alert. Translated from the TiDB alert rule above into Zabbix trigger syntax, the expression becomes:

{TiDB by HTTP:tidb.session_schema_lease_error.outdate.rate.max(15m)}>0

This fires when the maximum per‑second increase of the metric over a 15‑minute window exceeds zero, indicating an error.
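The logic of that expression can be sketched in Python. This only illustrates the windowed-maximum check; the function name, parameters, and sample timestamps are hypothetical and do not reflect how Zabbix evaluates triggers internally.

```python
def trigger_fires(samples, now, window_seconds=15 * 60, threshold=0.0):
    """Return True if the maximum rate inside the window exceeds the threshold.

    samples is a list of (timestamp_seconds, rate) pairs.
    """
    recent = [rate for ts, rate in samples if now - window_seconds <= ts <= now]
    return bool(recent) and max(recent) > threshold


now = 3600
# One error burst 10 minutes ago, zero growth since then.
recent_burst = [(now - 600, 0.1), (now - 300, 0.0), (now, 0.0)]
# The same burst, but 20 minutes ago, outside the 15-minute window.
old_burst = [(now - 1200, 0.1), (now - 300, 0.0), (now, 0.0)]

fires_recent = trigger_fires(recent_burst, now)
fires_old = trigger_fires(old_burst, now)
```

Taking the maximum rather than the last value keeps a short burst of errors visible for the whole window, which roughly matches the intent of increase(...[15m]) > 0 in the original rule.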

Appendix – JSONPath Examples

Sample data:

[{"name":"tidb_server_handle_query_duration_seconds_sum","value":"100","labels":{"sql_type":"Begin"}},{"name":"tidb_server_handle_query_duration_seconds_sum","value":"50","labels":{"sql_type":"Commit"}}]

JSONPath to sum all values:

$[?(@.name=="tidb_server_handle_query_duration_seconds_sum")].value.sum()

JSONPath to get the first value for a specific label:

$[?(@.name=="tidb_server_handle_query_duration_seconds_sum" && @.labels.sql_type=="Commit")].value.first()

JSONPath to sum values across multiple label values:

$[?(@.name=="tidb_server_handle_query_duration_seconds_sum" && @.labels.sql_type =~ "Begin|Commit")].value.sum()
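To check what these selections return, the same three queries can be reproduced in plain Python against the sample data. This is a sketch that mimics the filters by hand rather than running a JSONPath engine; the helper name values is hypothetical, and the label key sql_type is taken from the sample data.

```python
import json
import re

SAMPLE = json.loads(
    '[{"name":"tidb_server_handle_query_duration_seconds_sum","value":"100",'
    '"labels":{"sql_type":"Begin"}},'
    '{"name":"tidb_server_handle_query_duration_seconds_sum","value":"50",'
    '"labels":{"sql_type":"Commit"}}]'
)
NAME = "tidb_server_handle_query_duration_seconds_sum"


def values(records, name, predicate=lambda labels: True):
    """Return numeric values of records matching the name and label predicate."""
    return [
        float(r["value"])
        for r in records
        if r["name"] == name and predicate(r["labels"])
    ]


# Sum of all values for the metric.
total = sum(values(SAMPLE, NAME))

# First value where sql_type equals "Commit".
first_commit = values(SAMPLE, NAME, lambda l: l.get("sql_type") == "Commit")[0]

# Sum of values where sql_type matches the regular expression Begin|Commit.
begin_or_commit = sum(
    values(SAMPLE, NAME, lambda l: re.search("Begin|Commit", l.get("sql_type", "")) is not None)
)
```

Note that the values arrive as strings in the JSON, so they are converted to numbers before summing.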

Additional Tips

Memory usage trigger example:

{TiDB by HTTP:tidb.heap_bytes.min(5m)}>{$TIDB.HEAP.USAGE.MAX.WARN}

In Prometheus, the 99th percentile response time can be alerted on with:

histogram_quantile(0.99, sum(rate(tidb_server_handle_query_duration_seconds_bucket[1m])) BY (le, instance)) > 1

Zabbix cannot process histograms directly, so an alternative is to compute the average response time using calculated items.
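One way to arrive at an average response time is to divide the growth of the duration sum by the growth of the request count over the same interval. The Python sketch below shows only that arithmetic; it assumes the histogram exposes a _count series alongside tidb_server_handle_query_duration_seconds_sum (as Prometheus histograms normally do), and the function name and readings are hypothetical.

```python
def average_duration(sum_prev, sum_curr, count_prev, count_curr):
    """Average seconds per query between two readings of the _sum and _count series.

    Returns None when no queries were handled in the interval
    (or a counter was reset), since the average is undefined then.
    """
    handled = count_curr - count_prev
    if handled <= 0:
        return None
    return (sum_curr - sum_prev) / handled


# Hypothetical readings: 200 queries took 30 seconds in total.
avg = average_duration(sum_prev=100.0, sum_curr=130.0, count_prev=1000, count_curr=1200)
```

An average hides tail latency that a 99th percentile would reveal, so a threshold on it is a coarser signal than the Prometheus rule.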
