
Log Alarm Optimization and Grafana Chart Integration Guide

This guide details how to configure Alibaba Cloud Log Service alarms (setting one-day tokens, working around 1024-byte truncation, removing the record limit with analysis statements, and adding a 10-second query offset for timeliness) and shows how to visualize the data in Grafana using SQL queries for multi-line and pie charts, with timestamp conversion and time-series filling.

37 Interactive Technology Team

This guide explains how to optimize log alarms in Alibaba Cloud Log Service (SLS) and how to integrate the data with Grafana for visualisation.

1. Log Alarm Configuration

Configure the authorization token with a one-day expiration, as recommended in the official documentation: https://help.aliyun.com/document_detail/346631.html#section-4wz-6i6-dtg

Log messages longer than 1024 bytes are truncated, which makes it difficult to diagnose alarms. Reference for field length configuration: https://help.aliyun.com/document_detail/209203.html

Temporary Solution

Because the most useful error information is usually at the end of the log, the current workaround is to have the alarm query return only the trailing 1024 characters of the log content:

__tag__:__path__:/www/logs/abplaygameapi.37gapi.com/error* | select substr(content, length(content)-1023, 1024)

substr positions start at 1, so the last 1024 characters begin at position length(content)-1023.
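The position arithmetic behind a trailing-substring query is easy to get wrong by one, so here is a minimal Python sketch of it. It assumes a 1-based substr(s, start, length) as in standard SQL; the helper names sql_substr and tail_start are hypothetical and only model positive start positions, not the behaviour of SLS for zero or negative values.

```python
def sql_substr(s: str, start: int, length: int) -> str:
    """Emulate a 1-based SQL substr(s, start, length) for positive start only."""
    if start < 1:
        raise ValueError("this sketch only models positive, 1-based start positions")
    return s[start - 1 : start - 1 + length]


def tail_start(content: str, limit: int = 1024) -> int:
    """1-based start of the last `limit` characters, clamped to 1 for short strings."""
    return max(len(content) - limit + 1, 1)


# A 2003-character log line whose interesting part is at the very end.
log_line = "x" * 2000 + "END"
tail = sql_substr(log_line, tail_start(log_line), 1024)
print(len(tail), tail.endswith("END"))
```

The clamp to 1 in tail_start is a design choice of this sketch: it keeps short log lines intact instead of producing a zero or negative start position.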

2. Alarm Query Limits

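Without an analysis statement, an alarm query returns at most 100 records, so a trigger condition that depends on more than 100 matching records may not be evaluated correctly and the alarm may fail. Adding an analysis statement (the SQL part after the | separator) removes this limitation (see screenshots in the original document).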

3. Alarm Timeliness

SLS queries may lag slightly behind log ingestion, so when setting the query time window, add an extra 10 seconds to ensure recent logs are captured. For a 2-minute window, query the last 2 minutes and 10 seconds.

Reference: https://help.aliyun.com/document_detail/209188.html
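The window arithmetic described above can be sketched in Python as follows. The function name alarm_query_window and its parameters are hypothetical; in practice the window is set in the SLS alarm configuration rather than computed in code.

```python
from datetime import datetime, timedelta, timezone


def alarm_query_window(now, window=timedelta(minutes=2), offset=timedelta(seconds=10)):
    """Return (start, end) covering the nominal window plus the extra offset."""
    return now - (window + offset), now


# Fixed reference time so the example is reproducible.
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
start, end = alarm_query_window(now)
print(start.isoformat(), end.isoformat())
```

With the defaults, the queried range spans 130 seconds and ends at the evaluation time.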

4. Grafana Chart Integration

4.1 Multi‑Line Chart

Write a SQL query that groups by the remove_host field and counts page views (PV) per minute. Example:

remove_host:gstore.* and cost > 10 | select remove_host, num, to_unixtime(t) as t from (
    select remove_host, COUNT(*) as num, time_series(__time__, '1m', '%Y-%m-%d %H:%i:%s', '0') as t
    from nginx-access-log
    group by t,remove_host order by t limit 10000
)

Functions used:

to_unixtime: converts a datetime to a Unix timestamp (required by Grafana).

time_series: fills missing time points to create a continuous series. In the query above it produces 1-minute steps ('1m') and fills empty steps with '0'.

Do not add time constraints (WHERE clauses) in the SQL; let Grafana control the time range.
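To make the effect of the two functions concrete, here is a small Python sketch of the same idea: filling missing 1-minute buckets with 0 and converting datetime strings to Unix timestamps. It is an illustration only, not how SLS implements them. The helpers fill_minutes and to_unixtime are hypothetical stand-ins, the sample counts are invented, and the timestamps are assumed to be UTC.

```python
from datetime import datetime, timezone


def to_unixtime(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a datetime string (assumed UTC) into Unix seconds.

    Python's %M is the minute field, the counterpart of %i in the query's format string.
    """
    return int(datetime.strptime(text, fmt).replace(tzinfo=timezone.utc).timestamp())


def fill_minutes(counts, start_ts, end_ts, step=60, fill=0):
    """Return [(unix_ts, count)] for every step in [start_ts, end_ts), filling gaps."""
    return [(ts, counts.get(ts, fill)) for ts in range(start_ts, end_ts, step)]


# Sparse input: no requests were logged during the 12:01 minute.
sparse = {
    to_unixtime("2024-01-01 12:00:00"): 5,
    to_unixtime("2024-01-01 12:02:00"): 3,
}
start = to_unixtime("2024-01-01 12:00:00")
series = fill_minutes(sparse, start, start + 180)
print(series)
```

Without the filled 12:01 point, a line chart would draw a straight segment from 5 to 3 and hide the minute with no traffic.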

4.2 Data Display Configuration

In Grafana, set the query to the SQL above. Configure the Y-column with the field representing the count (e.g., #:#quantity) and the X-column with the timestamp field (t in the query above).

4.3 Pie Chart

Write a SQL query to calculate each host's share of requests in nginx-access-log:

remove_host:gstore.* and cost > 10 | select remove_host, COUNT(*) as num group by remove_host order by num desc

For a pie chart, only the numeric value (num) is needed; no time axis is required.

Configure the Y‑column with the numeric field and leave the X‑column empty or set it to the same numeric field.
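The pie chart turns the per-host counts into proportions. The Python sketch below shows that calculation on invented sample data; the host names and the pie_shares helper are hypothetical, and Grafana performs the equivalent step itself when rendering.

```python
def pie_shares(counts):
    """Convert per-host counts into percentage shares, largest first."""
    total = sum(counts.values())
    if total == 0:
        return {host: 0.0 for host in counts}
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return {host: 100.0 * num / total for host, num in ordered}


# Invented counts, shaped like the (remove_host, num) rows the query returns.
shares = pie_shares({
    "gstore.b.example": 300,
    "gstore.a.example": 600,
    "gstore.c.example": 100,
})
for host, pct in shares.items():
    print(f"{host}: {pct:.1f}%")
```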
