Implementing Unified Monitoring Dashboards and Rich‑Text Alerts with Grafana FlowCharting and ImageRender at Meitu

This article explains Meitu's monitoring architecture and presents two practical, low-effort implementations: a unified dashboard built with Grafana FlowCharting, and a rich-text alert solution that combines GrafanaImageRender with WeChat Work. It walks through the procedures, required tools, and sample code so that SRE teams can adopt them quickly.

High Availability Architecture

1. Abstract

This article introduces Meitu's business background and monitoring system, then presents two practical implementations: a Grafana FlowCharting‑based unified monitoring dashboard and a GrafanaImageRender + WeChat Work rich‑text alert solution, both of which are easy to adopt with minimal code.

2. Overall Monitoring Architecture

2.1 Business Background

Meitu operates both consumer (ToC) and enterprise (ToB) services spanning photo beautification, makeup, skin management, short video, and live streaming. Popular apps such as MeituPic and BeautyPlus are supported by extensive backend services.

Ensuring stability of these services requires a comprehensive monitoring system that covers both client‑side and server‑side metrics.

2.2 Monitoring System

The monitoring system is divided into a client side (APM, network quality, crashes, etc.) and a server side (load balancers, business services, dependencies, infrastructure). Because Meitu is fully cloud-native, infrastructure monitoring focuses on cloud resources such as ECS, containers, and DNS.

Key tools used include Zabbix, OpenFalcon, OpenTSDB, ELKB, InfluxDB TICK, Prometheus, Spark, Storm, Kafka, Flink, Netdata, Sentry, SkyWalking, as well as custom components such as NoticeSystem and MTAlert.

3. Monitoring Dashboard with FlowCharting

3.1 FlowCharting Overview

FlowCharting is a Grafana plugin built on draw.io that allows drawing complex diagrams and binding them to dynamic metrics, enabling a single view of end‑to‑end service health.

3.2 Step‑by‑Step Implementation

a. Draw the diagram

Use draw.io (online or desktop) to create a business architecture diagram, then copy the generated XML code.

b. Import into Grafana

Paste the draw.io source content into the FlowCharting panel’s “Source Content” field.

c. Bind data sources

Add queries for each metric and associate them with diagram elements.

d. Configure display rules

Define rules for colors, thresholds, tooltips, and value mappings.
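
As a rough illustration of what a threshold rule does, the sketch below maps a metric value to a color band. It is not FlowCharting's implementation; the function name, the thresholds, and the colors are assumptions chosen only for this example.

```python
from bisect import bisect_right

# Hypothetical thresholds and colors, chosen only for illustration.
THRESHOLDS = [200.0, 500.0]            # e.g. latency in ms: warning, critical
COLORS = ["green", "orange", "red"]    # one more color than thresholds


def color_for(value, thresholds=THRESHOLDS, colors=COLORS):
    """Return the color of the band that `value` falls into."""
    # bisect_right counts how many thresholds are <= value,
    # which is the index of the matching color band.
    return colors[bisect_right(thresholds, value)]


if __name__ == "__main__":
    for latency in (120.0, 200.0, 750.0):
        print(latency, color_for(latency))
```

A value equal to a threshold falls into the higher band here; whether the boundary should be inclusive is a design choice to settle per metric.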

e. Link elements

Map diagram IDs or labels to metrics, optionally adding links to detailed pages.
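
To find the IDs and labels available for mapping, it can help to list them from the exported diagram. The sketch below assumes the diagram was exported as uncompressed draw.io XML and that labels live in the `value` attribute of `mxCell` elements; the sample XML and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample of uncompressed draw.io XML.
SAMPLE_XML = """
<mxGraphModel>
  <root>
    <mxCell id="0"/>
    <mxCell id="1" parent="0"/>
    <mxCell id="lb-1" value="Load Balancer" vertex="1" parent="1"/>
    <mxCell id="api-1" value="API Service" vertex="1" parent="1"/>
    <mxCell id="edge-1" edge="1" source="lb-1" target="api-1" parent="1"/>
  </root>
</mxGraphModel>
"""


def list_shapes(xml_text):
    """Return (id, label) pairs for every vertex cell in the diagram."""
    root = ET.fromstring(xml_text.strip())
    return [
        (cell.get("id"), cell.get("value", ""))
        for cell in root.iter("mxCell")
        if cell.get("vertex") == "1"   # skip layers and edges
    ]


if __name__ == "__main__":
    for cell_id, label in list_shapes(SAMPLE_XML):
        print(cell_id, label)
```

Giving shapes meaningful IDs or labels in draw.io before exporting makes the later mapping to metrics easier to read and maintain.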

f. Complete the dashboard

Repeat the above for all required elements to obtain a live monitoring dashboard.

4. Rich‑Text Alert with GrafanaImageRender and WeChat Work

4.1 Concept

When an alert fires, the system renders the relevant Grafana panel as a PNG image via the ImageRender plugin and sends it together with a text message to a WeChat Work robot.

4.2 Steps

a. Create a robot and obtain webhook

Configure a WeChat Work group robot and note its webhook URL.
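
Before wiring the robot into the alert flow, a plain text message is a quick way to check the webhook. The following is a minimal sketch, assuming the standard group robot endpoint and its `text` message type; the key in the URL is a placeholder and the function names are hypothetical.

```python
import json
import urllib.request

# Placeholder webhook; replace the key with the one issued for your robot.
WEBHOOK_URL = "https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=YOUR_KEY"


def build_text_request(webhook_url, content):
    """Build (but do not send) a POST request carrying a text message."""
    payload = {"msgtype": "text", "text": {"content": content}}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=5):
    """Send the request and return the decoded JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(send(build_text_request(WEBHOOK_URL, "webhook test from monitoring")))
```

Separating request construction from sending keeps the payload easy to inspect without touching the network.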

b. Get Grafana panel image URL

Send the panel's render URL, together with a Grafana API key, to the image service to obtain a rendered image URL, for example with curl:

curl -H "Accept: application/json" \
    -H "Authorization: grafana_api_key" \
    -d '{"imageUrl":"http://grafana-panel-url/render/d-solo/panelxxxxx"}' \
    "http://grafana-images-server/grafana-images"

The response contains the public PNG URL.

c. Configure alert policy in MTAlert

Store a formatted string containing the Grafana link, webhook, title and description; MTAlert will invoke an extension script on alert.

d. Extension script (Python)

The script parses the formatted string, generates a new panel URL with the current time range, calls ImageRender to obtain the PNG URL, and posts a news‑type message to the robot:

import json
import urllib.request

# The helpers GeneratePanelPngurl, GetOrgId, GetOrgToken and
# GenerateNewPanelUrl belong to parts of the script that are not shown here.

# 3. Send alert
def SendPanelMsg(token, new_panel_url, webhook_url, title, description):
    # Render the panel and obtain a publicly reachable PNG URL
    public_url = GeneratePanelPngurl(token, new_panel_url)
    # "news" message: a card with title, description, link and picture
    data = {
        "msgtype": "news",
        "news": {
            "articles": [
                {
                    "title": title,
                    "description": description,
                    "url": public_url,
                    "picurl": public_url
                }
            ]
        }
    }
    # POST the JSON payload to the robot webhook
    postdata = json.dumps(data).encode('utf-8')
    request = urllib.request.Request(webhook_url, postdata)
    request.add_header('Content-Type', 'application/json')
    try:
        with urllib.request.urlopen(request):
            return 'post Success'
    except Exception:
        return 'post Exception'

def getExtendData(data):
    # Parse the alert payload; the configured string is split on '&&&&&&'
    msg_info = json.loads(data)
    msg_data = msg_info['data']
    msg = msg_data['报表地址']  # field name in the alert data ("report address")
    msgArr = msg.split('&&&&&&')
    if len(msgArr) != 5:
        print('mtalert -> business alert config -> grafana link is misconfigured! The format should be: ...')
        return 'true'
    # Only the first four of the five fields are used here
    panel_url, webhook_url, title, description = msgArr[:4]
    # Get the org token and rebuild the panel URL with a fresh time range
    org_id = GetOrgId(panel_url)
    token = GetOrgToken(org_id)
    panel_time_range = 30
    new_panel_url = GenerateNewPanelUrl(panel_url, panel_time_range)
    # Send the rich-text alert
    SendPanelMsg(token, new_panel_url, webhook_url, title, description)
    return 'false'
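
The script calls GetOrgId and GenerateNewPanelUrl without showing them. The sketch below is one possible way such helpers could work, not the author's code: it assumes the panel URL carries Grafana's `orgId`, `from` and `to` query parameters, that `from`/`to` are epoch milliseconds, and that the time range is given in minutes. The snake_case function names and the example URL are hypothetical.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def generate_new_panel_url(panel_url, minutes, now=None):
    """Return panel_url with from/to replaced by the last `minutes` minutes."""
    now_ms = int((time.time() if now is None else now) * 1000)
    parts = urlsplit(panel_url)
    # Drop any existing time range, keep every other query parameter.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k not in ("from", "to")]
    query += [("from", str(now_ms - minutes * 60 * 1000)), ("to", str(now_ms))]
    return urlunsplit(parts._replace(query=urlencode(query)))


def get_org_id(panel_url):
    """Return the orgId query parameter as a string, or None if absent."""
    return dict(parse_qsl(urlsplit(panel_url).query)).get("orgId")


if __name__ == "__main__":
    # Hypothetical render URL, used only to exercise the helpers.
    url = ("http://grafana.example.com/render/d-solo/abc123/overview"
           "?orgId=2&panelId=4&from=1&to=2")
    print(get_org_id(url))
    print(generate_new_panel_url(url, 30))
```

Rewriting the time range at alert time means the rendered image shows the window leading up to the alert rather than whatever range was saved in the configured link.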

5. Future Outlook

Monitoring should be seen as a means to an end: it serves stability and cost reduction, and should eventually integrate with DataOps/AIOps for intelligent analysis.

6. Conclusion

The article described Meitu’s monitoring architecture, the evolution toward a unified FlowCharting dashboard, and a rich‑text alert workflow using GrafanaImageRender and WeChat Work, providing a low‑cost end‑to‑end monitoring solution.

Written by

High Availability Architecture

Official account for High Availability Architecture.
