Implementing Unified Monitoring Dashboards and Rich‑Text Alerts with Grafana FlowCharting and ImageRender at Meitu
This article explains Meitu's monitoring architecture and presents two practical, low‑effort implementations: a Grafana FlowCharting unified dashboard and a GrafanaImageRender + WeChat Work rich‑text alert solution. It details the step‑by‑step procedures, required tools, and sample code to help SRE teams adopt them quickly.
1. Abstract
This article introduces Meitu's business background and monitoring system, then presents two practical implementations: a Grafana FlowCharting‑based unified monitoring dashboard and a GrafanaImageRender + WeChat Work rich‑text alert solution, both of which are easy to adopt with minimal code.
2. Overall Monitoring Architecture
2.1 Business Background
Meitu operates both consumer (ToC) and enterprise (ToB) services across photo beautification, makeup, skin management, short video, and live streaming. Popular apps such as MeituPic and BeautyPlus are supported by extensive backend services.
Ensuring stability of these services requires a comprehensive monitoring system that covers both client‑side and server‑side metrics.
2.2 Monitoring System
The system is divided into client‑side monitoring (APM, network quality, crashes, and so on) and server‑side monitoring (load balancers, business services, dependencies, and infrastructure). Since Meitu is fully cloud‑native, infrastructure monitoring focuses on cloud resources such as ECS, containers, and DNS.
Key tools used include Zabbix, OpenFalcon, OpenTSDB, ELKB, InfluxDB TICK, Prometheus, Spark, Storm, Kafka, Flink, Netdata, Sentry, SkyWalking, as well as custom components such as NoticeSystem and MTAlert.
3. Monitoring Dashboard with FlowCharting
3.1 FlowCharting Overview
FlowCharting is a Grafana plugin built on draw.io that allows drawing complex diagrams and binding them to dynamic metrics, enabling a single view of end‑to‑end service health.
3.2 Step‑by‑Step Implementation
a. Draw the diagram
Use draw.io (online or desktop) to create a business architecture diagram, then copy the generated XML code.
b. Import into Grafana
Paste the draw.io source content into the FlowCharting panel’s “Source Content” field.
c. Bind data sources
Add queries for each metric and associate them with diagram elements.
d. Configure display rules
Define rules for colors, thresholds, tooltips, and value mappings.
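The idea behind a threshold rule can be sketched in a few lines of Python 3. This is only an illustration of the concept, not FlowCharting's internal logic; the threshold values, color names, and the function name `color_for` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical thresholds: (lower bound, color), sorted ascending by bound.
THRESHOLDS: List[Tuple[float, str]] = [
    (0.0, "green"),
    (80.0, "orange"),
    (95.0, "red"),
]


def color_for(value: float, thresholds: List[Tuple[float, str]] = THRESHOLDS) -> str:
    """Return the color of the highest threshold that the value reaches."""
    color = thresholds[0][1]  # fall back to the first color
    for bound, name in thresholds:
        if value >= bound:
            color = name
    return color
```

For example, with these assumed thresholds a CPU usage of 85 would color the shape orange, and 99 would color it red.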
e. Link elements
Map diagram IDs or labels to metrics, optionally adding links to detailed pages.
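To find the IDs and labels available for mapping, it can help to list the shapes in the draw.io XML. The following is a minimal Python 3 sketch, assuming the diagram was exported as uncompressed XML in which each shape is an `mxCell` element with `vertex="1"`; the sample diagram and the function name `list_shapes` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample of an uncompressed draw.io diagram.
SAMPLE_XML = """
<mxGraphModel>
  <root>
    <mxCell id="0"/>
    <mxCell id="1" parent="0"/>
    <mxCell id="lb-1" value="Load Balancer" vertex="1" parent="1"/>
    <mxCell id="api-1" value="API Service" vertex="1" parent="1"/>
    <mxCell id="edge-1" edge="1" source="lb-1" target="api-1" parent="1"/>
  </root>
</mxGraphModel>
"""


def list_shapes(xml_text):
    """Return {cell id: label} for every shape (vertex) in the diagram."""
    root = ET.fromstring(xml_text.strip())
    return {
        cell.get("id"): cell.get("value", "")
        for cell in root.iter("mxCell")
        if cell.get("vertex") == "1"  # skip edges and the two root cells
    }


if __name__ == "__main__":
    for cell_id, label in list_shapes(SAMPLE_XML).items():
        print(cell_id, "->", label)
```

Giving shapes readable IDs or labels in draw.io makes the later metric mapping easier to maintain than relying on auto-generated IDs.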
f. Complete the dashboard
Repeat the above for all required elements to obtain a live monitoring dashboard.
4. Rich‑Text Alert with GrafanaImageRender and WeChat Work
4.1 Concept
When an alert fires, the system renders the relevant Grafana panel as a PNG image via the ImageRender plugin and sends it together with a text message to a WeChat Work robot.
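Because the image should show the period around the alert, the render URL usually needs an explicit time window. The following Python 3 sketch rewrites a panel URL to cover the last N minutes, assuming Grafana's `from` and `to` query parameters in epoch milliseconds. The host, dashboard path, and the function name `with_recent_range` are hypothetical and not taken from Meitu's script.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_recent_range(panel_url, minutes, now_ms=None):
    """Return panel_url with from/to set to the last `minutes` minutes."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    parts = urlsplit(panel_url)
    # Keep every existing parameter except an old time window.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ("from", "to")
    ]
    query.append(("from", str(now_ms - minutes * 60 * 1000)))
    query.append(("to", str(now_ms)))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Hypothetical render URL used only for demonstration.
    url = "http://grafana.example/render/d-solo/abc/demo?orgId=1&panelId=2"
    print(with_recent_range(url, 30))
```

Passing `now_ms` explicitly keeps the function deterministic, which is convenient for testing.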
4.2 Steps
a. Create a robot and obtain webhook
Configure a WeChat Work group robot and note its webhook URL.
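Before wiring the webhook into alerts, a plain text message is a simple way to check that the robot works. The Python 3 sketch below only builds the request, assuming the robot's `text` message type; the webhook URL and key are placeholders, and the function name `build_text_request` is hypothetical.

```python
import json
import urllib.request


def build_text_request(webhook_url, content):
    """Build (but do not send) a POST request carrying a text message."""
    payload = {"msgtype": "text", "text": {"content": content}}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder URL: substitute the webhook URL noted for your own robot.
    req = build_text_request("https://example.invalid/webhook?key=PLACEHOLDER", "test message")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req, timeout=5)
```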
b. Get Grafana panel image URL
Use Grafana API keys to request a rendered image URL, e.g. via curl:
curl -H "Accept: application/json" \
  -H "Authorization: grafana_api_key" \
  -d '{"imageUrl":"http://grafana-panel-url/render/d-solo/panelxxxxx"}' \
  "http://grafana-images-server/grafana-images"
The response contains the public PNG URL.
c. Configure alert policy in MTAlert
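In the alert policy's Grafana link field, store a single formatted string that joins the Grafana panel link, the webhook URL, the title, and the description with the separator &&&&&&. When the alert fires, MTAlert invokes an extension script that receives this string.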
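A minimal Python 3 sketch of how such a string could be split and validated is shown below. It assumes five `&&&&&&`-separated fields, matching the length check in the article's script, which reads only the first four; the fifth field is not described in the article, so it is kept here under the hypothetical name `extra`. The class and function names are also hypothetical.

```python
from typing import NamedTuple

SEPARATOR = "&&&&&&"


class AlertConfig(NamedTuple):
    panel_url: str
    webhook_url: str
    title: str
    description: str
    extra: str  # fifth field; its meaning is not described in the article


def parse_alert_config(raw):
    """Split the formatted string and fail loudly if the field count is wrong."""
    parts = raw.split(SEPARATOR)
    if len(parts) != 5:
        raise ValueError("expected 5 fields, got %d" % len(parts))
    return AlertConfig(*parts)


if __name__ == "__main__":
    # Made-up values used only for demonstration.
    raw = SEPARATOR.join(
        ["http://grafana.example/d-solo/abc", "https://example.invalid/hook", "CPU high", "usage above 90%", ""]
    )
    print(parse_alert_config(raw))
```

Raising an exception on a malformed string makes configuration mistakes visible early instead of producing an alert with missing fields.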
d. Extension script (Python)
The script parses the formatted string, generates a new panel URL with the current time range, calls ImageRender to obtain the PNG URL, and posts a news‑type message to the robot:
# -*- coding: utf-8 -*-
# Python 2 script. GeneratePanelPngurl, GetOrgId, GetOrgToken and
# GenerateNewPanelUrl are assumed to be defined elsewhere in the script (not shown).
import json
import re
import urllib2

# 3. Send alert
def SendPanelMsg(token, new_panel_url, webhook_url, title, description):
    # Render the panel and obtain the public PNG URL
    public_url = GeneratePanelPngurl(token, new_panel_url)
    data = {
        "msgtype": "news",
        "news": {
            "articles": [
                {
                    "title": title,
                    "description": description,
                    "url": public_url,
                    "picurl": public_url
                }
            ]
        }
    }
    # POST the news message to the robot webhook
    url = webhook_url
    postdata = json.dumps(data)
    request = urllib2.Request(url, postdata)
    request.add_header('Content-Type', 'application/json')
    try:
        response = urllib2.urlopen(request)
        return 'post Success'
    except Exception as ex:
        return 'post Exception'

def getExtendData(data):
    # Parse the alert info; fields are separated by &&&&&&
    msg_info = json.loads(data)
    msg_data = msg_info['data']
    # '报表地址' ("report URL") is the key that holds the formatted string
    msg = msg_data['报表地址'.decode('utf-8')]
    msgArr = re.split('&&&&&&', msg)
    if len(msgArr) != 5:
        print 'mtalert -> business alert config -> grafana link is misconfigured! Expected format: ...'
        return 'true'
    panel_url = msgArr[0]
    webhook_url = msgArr[1]
    title = msgArr[2]
    description = msgArr[3]
    # Get the token and generate a new URL for the current time range
    org_id = GetOrgId(panel_url)
    token = GetOrgToken(org_id)
    panel_time_range = 30
    new_panel_url = GenerateNewPanelUrl(panel_url, panel_time_range)
    # Send the message
    SendPanelMsg(token, new_panel_url, webhook_url, title, description)
    return 'false'
5. Future Outlook
Monitoring should be seen as a means to an end: it supports stability and cost reduction, and should eventually integrate with DataOps/AIOps for intelligent analysis.
6. Conclusion
The article described Meitu’s monitoring architecture, the evolution toward a unified FlowCharting dashboard, and a rich‑text alert workflow using GrafanaImageRender and WeChat Work, providing a low‑cost end‑to‑end monitoring solution.
High Availability Architecture
Official account for High Availability Architecture.