How to Build a Flexible API Monitoring Exporter with Gin-Vue-Admin and Prometheus

This article walks through extending a simple Prometheus Exporter into a full-featured API monitoring solution using Gin-Vue-Admin, detailing backend task scheduling, database schema, multi-protocol checks (HTTP, TCP, DNS, ICMP), dynamic cron management, and frontend integration for managing and visualizing health metrics.

Ops Development Stories

Previously a simple Prometheus Exporter was created. The new version adds several features:

Interface management via a frontend page with data stored in a database

Response validation in addition to status‑code checks

Frontend display of interface availability percentage

Flexible configuration of probe items, including adjustable frequency, result validation, and enable/disable control

The implementation follows a straightforward flow:

User creates a probe task, which is saved to the database

The backend registers a scheduled job for the new task

A backend goroutine watches for updates or deletions and adjusts the scheduled jobs accordingly

The probe task generates Prometheus metrics for monitoring and alerting

Backend Implementation

Define the database model for a probe task:

<code>type DialApi struct {
    global.GVA_MODEL
    Name           string `json:"name" form:"name" gorm:"column:name;default:'';comment:interface name;size:32;"`
    // Type is matched against "HTTP", "TCP", "DNS", and "ICMP" by the scheduler.
    Type           string `json:"type" form:"type" gorm:"column:type;default:'';comment:probe type HTTP TCP ICMP DNS;size:8;"`
    HttpMethod     string `json:"httpMethod" form:"httpMethod" gorm:"column:http_method;default:GET;comment:HTTP request method;size:8;"`
    Url            string `json:"url" form:"url" gorm:"column:url;comment:probe address;size:255;" binding:"required"`
    RequestBody    string `json:"requestBody" form:"requestBody" gorm:"column:request_body;comment:request body;size:255;"`
    Enabled        *bool  `json:"enabled" form:"enabled" gorm:"column:enabled;default:false;comment:enabled or not;" binding:"required"`
    Application    string `json:"application" form:"application" gorm:"column:application;comment:owning application;size:32;"`
    // ExceptResponse holds the expected response text (identifier kept as in the original).
    ExceptResponse string `json:"exceptResponse" form:"exceptResponse" gorm:"column:except_response;comment:expected response;size:32;"`
    HttpStatus     int    `json:"httpStatus" form:"httpStatus" gorm:"column:http_status;type:smallint(5);default:200;comment:expected status code;size:16;"`
    Cron           string `json:"cron" form:"cron" gorm:"column:cron;comment:cron expression;size:20;"`
    SuccessRate    string `json:"successRate" form:"successRate" gorm:"column:success_rate;comment:probe success rate"`
    CreatedBy      uint   `gorm:"column:created_by;comment:created by"`
    UpdatedBy      uint   `gorm:"column:updated_by;comment:updated by"`
    DeletedBy      uint   `gorm:"column:deleted_by;comment:deleted by"`
}
</code>

The struct captures fields such as the probe URL, expected response, status code, and cron expression, all of which are filled in through the frontend.
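
One detail worth a closer look is that Enabled is declared as *bool rather than bool. A likely reason, given the binding:"required" tag, is that a pointer lets the validator tell an explicit false apart from a field that was never sent. The sketch below shows the same distinction using only the standard encoding/json package; probeInput and enabledState are hypothetical names for illustration, not part of the project.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// probeInput is a hypothetical, trimmed-down stand-in for DialApi.
type probeInput struct {
	Url     string `json:"url"`
	Enabled *bool  `json:"enabled"`
}

// enabledState reports how the "enabled" field arrived in the payload.
func enabledState(payload string) (string, error) {
	var in probeInput
	if err := json.Unmarshal([]byte(payload), &in); err != nil {
		return "", err
	}
	switch {
	case in.Enabled == nil:
		return "missing", nil
	case *in.Enabled:
		return "true", nil
	default:
		return "false", nil
	}
}

func main() {
	for _, p := range []string{
		`{"url":"https://example.com","enabled":false}`,
		`{"url":"https://example.com"}`,
	} {
		state, err := enabledState(p)
		fmt.Println(state, err)
	}
}
```

With a plain bool, both payloads would decode to false and the two cases could not be told apart.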

A Run method loads the stored probe tasks and registers a scheduled job for each of them, using the timer utilities of gin-vue-admin:

<code>// Run loads every stored probe task and registers a timer job for each one.
func (j *StartDialApi) Run() {
    dialService := service.ServiceGroupApp.DialApiServiceGroup.DialApiService
    pageInfo := dialApiReq.DialApiSearch{}
    dialApiInfoList, _, err := dialService.GetDialApiInfoList(pageInfo)
    if err != nil {
        global.GVA_LOG.Error("failed to load probe task list", zap.Error(err))
        return
    }
    for _, dialApi := range dialApiInfoList {
        // Convert the stored value into a cron expression for the timer.
        dialApi.Cron = utils.ConvertToCronExpression(dialApi.Cron)
        dialService.AddSingleDialApiTimerTask(dialApi)
    }
}
</code>
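
The article does not show utils.ConvertToCronExpression. Because the timer jobs are registered with cron.WithSeconds(), specs need a leading seconds field, so one plausible job for such a helper is to accept standard five-field expressions and pad them. The sketch below illustrates that assumption only; toSecondsCron is a hypothetical name, and the real helper may behave differently.

```go
package main

import (
	"fmt"
	"strings"
)

// toSecondsCron is a hypothetical helper: it pads a five-field cron
// expression with a leading "0" seconds field and leaves six-field
// expressions untouched. Anything else is rejected.
func toSecondsCron(expr string) (string, error) {
	fields := strings.Fields(expr)
	switch len(fields) {
	case 5:
		return "0 " + strings.Join(fields, " "), nil
	case 6:
		return strings.Join(fields, " "), nil
	default:
		return "", fmt.Errorf("expected 5 or 6 cron fields, got %d", len(fields))
	}
}

func main() {
	for _, e := range []string{"*/5 * * * *", "*/30 * * * * *", "every minute"} {
		spec, err := toSecondsCron(e)
		fmt.Printf("%q -> %q (err: %v)\n", e, spec, err)
	}
}
```

Rejecting malformed input at this point keeps a bad expression from reaching the timer, where the failure would only show up as a registration error in the log.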

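For each task, Run delegates to AddSingleDialApiTimerTask, which holds the core timer logic. It first checks whether a scheduled job already exists for the task. If none exists and the task is enabled, it registers one. If a job exists, it is re-registered when the cron expression has changed, or removed when the task has been disabled or deleted: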

<code>func (dialService *DialApiService) AddSingleDialApiTimerTask(dialApiEntity dialApi.DialApi) {
    // Cron specs carry a leading seconds field.
    option := []cron.Option{cron.WithSeconds()}
    idStr := strconv.Itoa(int(dialApiEntity.ID))
    // The cron name and the task name are both derived from the task ID.
    cronName := global.DIAL_API + idStr
    taskName := global.DIAL_API + idStr
    task, found := global.GVA_Timer.FindTask(cronName, taskName)
    if found {
        if task.Spec != dialApiEntity.Cron {
            // Schedule changed: drop the old job and register it again.
            global.GVA_LOG.Info(fmt.Sprintf("rescheduling probe task: %s", dialApiEntity.Name))
            global.GVA_Timer.Clear(cronName)
            dialService.AddSingleDialApiTimerTask(dialApiEntity)
        } else if !*dialApiEntity.Enabled || dialApiEntity.DeletedAt.Valid {
            global.GVA_LOG.Info(fmt.Sprintf("stopping probe task: %s", dialApiEntity.Name))
            global.GVA_Timer.RemoveTaskByName(cronName, taskName)
        }
        return
    }
    if !*dialApiEntity.Enabled {
        return
    }
    _, err := global.GVA_Timer.AddTaskByFunc(cronName, dialApiEntity.Cron, func() {
        name, typ := dialApiEntity.Name, dialApiEntity.Type
        // Add(0) creates both series up front so they exist before the first result.
        global.HealthCheckResults.WithLabelValues(name, typ, "success").Add(0)
        global.HealthCheckResults.WithLabelValues(name, typ, "failed").Add(0)
        var ok bool
        var err error
        switch typ {
        case "HTTP":
            ok = checkHTTP(dialApiEntity)
        case "TCP":
            ok, err = checkTCP(dialApiEntity)
        case "DNS":
            ok, err = checkDNS(dialApiEntity)
        case "ICMP":
            ok, err = checkICMP(dialApiEntity)
        default:
            global.GVA_LOG.Error("unknown probe type", zap.String("DetectType", typ))
            return
        }
        status := "failed"
        if ok {
            status = "success"
        }
        global.HealthCheckResults.WithLabelValues(name, typ, status).Add(1)
        logHealthCheckResult(ok, err, dialApiEntity, typ)
        getSuccessRateFromPrometheus(dialApiEntity)
    }, taskName, option...)
    if err != nil {
        global.GVA_LOG.Error(fmt.Sprintf("failed to add probe timer task: %s : %s, reason: %s", idStr, dialApiEntity.Name, err.Error()))
    }
}
</code>

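Each run increments the success or failed counter, logs the result, and then refreshes the stored success rate. The rate is derived from two Prometheus queries over the last hour (successful checks divided by all checks) and is written back to the database only when the value has changed: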

<code>func getSuccessRateFromPrometheus(dialApiEntity dialApi.DialApi) {
    // Per-second rate of successful checks and of all checks over the last hour.
    // %q quotes and escapes the label values so a name containing quotes cannot break the query.
    successQuery := fmt.Sprintf(`sum(rate(health_check_results{name=%q, type=%q, status="success"}[1h]))`, dialApiEntity.Name, dialApiEntity.Type)
    totalQuery := fmt.Sprintf(`sum(rate(health_check_results{name=%q, type=%q}[1h]))`, dialApiEntity.Name, dialApiEntity.Type)
    successResponse, err := utils.QueryPrometheus(successQuery, global.GVA_CONFIG.Prometheus.Address)
    if err != nil {
        global.GVA_LOG.Error("Failed to query success rate from Prometheus", zap.Error(err))
        return
    }
    totalResponse, err := utils.QueryPrometheus(totalQuery, global.GVA_CONFIG.Prometheus.Address)
    if err != nil {
        global.GVA_LOG.Error("Failed to query total rate from Prometheus", zap.Error(err))
        return
    }
    // Each sample is [timestamp, "value"]; the value arrives as a string.
    var successValue, totalValue float64
    for _, result := range successResponse.Data.Result {
        if v, ok := result.Value[1].(string); ok {
            if f, err := strconv.ParseFloat(v, 64); err == nil {
                successValue = f
            }
        }
    }
    for _, result := range totalResponse.Data.Result {
        if v, ok := result.Value[1].(string); ok {
            if f, err := strconv.ParseFloat(v, 64); err == nil {
                totalValue = f
            }
        }
    }
    if totalValue <= 0 {
        // No checks recorded in the window yet; keep the stored rate.
        return
    }
    successRate := CalculateSuccessRate(successValue, totalValue)
    dialService := DialApiService{}
    dial, err := dialService.GetDialApi(strconv.Itoa(int(dialApiEntity.ID)))
    if err != nil {
        global.GVA_LOG.Error("failed to load probe task", zap.Error(err))
        return
    }
    // Write back only when the formatted value actually changed.
    successRateStr := fmt.Sprintf("%.2f", successRate)
    if dial.SuccessRate != successRateStr {
        dial.SuccessRate = successRateStr
        if err := dialService.UpdateDialApi(dial); err != nil {
            global.GVA_LOG.Error("failed to update probe success rate", zap.Error(err))
            return
        }
    }
}

// CalculateSuccessRate returns success/total as a percentage, or 0 when total is 0.
func CalculateSuccessRate(success, total float64) float64 {
    if total == 0 {
        return 0
    }
    return (success / total) * 100
}
</code>
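
The type assertion result.Value[1].(string) works because the Prometheus HTTP API returns each instant-vector sample as a two-element array: a numeric timestamp followed by the sample value encoded as a string. The sketch below decodes such a response with the standard library. promResponse and firstValue are hypothetical names; the actual type returned by utils.QueryPrometheus is not shown in the article.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// promResponse mirrors the parts of an instant-query response used here.
type promResponse struct {
	Status string `json:"status"`
	Data   struct {
		Result []struct {
			Value []interface{} `json:"value"` // [timestamp, "value"]
		} `json:"result"`
	} `json:"data"`
}

// firstValue returns the sample value of the first series, or false
// when the result is empty or malformed.
func firstValue(body []byte) (float64, bool) {
	var resp promResponse
	if err := json.Unmarshal(body, &resp); err != nil || len(resp.Data.Result) == 0 {
		return 0, false
	}
	v := resp.Data.Result[0].Value
	if len(v) != 2 {
		return 0, false
	}
	s, ok := v[1].(string)
	if !ok {
		return 0, false
	}
	f, err := strconv.ParseFloat(s, 64)
	return f, err == nil
}

func main() {
	success := []byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{},"value":[1700000000,"0.75"]}]}}`)
	total := []byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{},"value":[1700000000,"1"]}]}}`)
	s, okS := firstValue(success)
	t, okT := firstValue(total)
	if okS && okT && t > 0 {
		fmt.Printf("success rate: %.2f%%\n", s/t*100)
	}
}
```

An empty result list is a normal outcome for a task that has not produced samples yet, which is why the helper reports it with a boolean instead of an error.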

Protocol‑specific check functions are provided for HTTP, TCP, DNS, and ICMP:

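<code>// probeTimeout bounds every check so a hung target cannot block its timer job.
const probeTimeout = 5 * time.Second

// probeHTTPClient replaces the default client, which has no timeout.
var probeHTTPClient = &http.Client{Timeout: probeTimeout}

func checkHTTP(dialApiEntity dialApi.DialApi) bool {
    idStr := strconv.Itoa(int(dialApiEntity.ID))
    var response *http.Response
    var httpErr error
    switch dialApiEntity.HttpMethod {
    case "GET":
        response, httpErr = probeHTTPClient.Get(dialApiEntity.Url)
    case "POST":
        response, httpErr = probeHTTPClient.Post(dialApiEntity.Url, "application/json", strings.NewReader(dialApiEntity.RequestBody))
    default:
        global.GVA_LOG.Error("unsupported HTTP method: " + dialApiEntity.HttpMethod)
        return false
    }
    if httpErr != nil {
        global.GVA_LOG.Error("probe failed: "+dialApiEntity.Url, zap.Error(httpErr))
        return false
    }
    // Always release the connection, whatever the outcome.
    defer response.Body.Close()
    if response.StatusCode != dialApiEntity.HttpStatus {
        global.GVA_LOG.Info(idStr + ":" + dialApiEntity.Name + " probe result does not match the expectation")
        return false
    }
    if dialApiEntity.ExceptResponse == "" {
        return true
    }
    // An expected response is configured: the body must contain it.
    bodyBytes, err := io.ReadAll(response.Body)
    if err != nil {
        return false
    }
    return strings.Contains(string(bodyBytes), dialApiEntity.ExceptResponse)
}

// checkTCP succeeds when a TCP connection to host:port can be opened.
func checkTCP(dialApiEntity dialApi.DialApi) (bool, error) {
    conn, err := net.DialTimeout("tcp", dialApiEntity.Url, probeTimeout)
    if err != nil {
        return false, err
    }
    defer conn.Close()
    return true, nil
}

// checkDNS succeeds when the host name resolves.
func checkDNS(dialApiEntity dialApi.DialApi) (bool, error) {
    _, err := net.LookupHost(dialApiEntity.Url)
    if err != nil {
        return false, err
    }
    return true, nil
}

// checkICMP succeeds when at least one echo reply is received.
func checkICMP(dialApiEntity dialApi.DialApi) (bool, error) {
    pinger, err := ping.NewPinger(dialApiEntity.Url)
    if err != nil {
        return false, err
    }
    pinger.Count = 2
    // Without a timeout, Run keeps waiting for replies from an unreachable host.
    pinger.Timeout = probeTimeout
    if err = pinger.Run(); err != nil {
        return false, err
    }
    // Run can return nil without any reply (assuming the go-ping style API), so check the statistics.
    if pinger.Statistics().PacketsRecv == 0 {
        return false, fmt.Errorf("no ICMP reply from %s", dialApiEntity.Url)
    }
    return true, nil
}
</code>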

A background goroutine watches update and delete channels to modify or remove scheduled jobs accordingly:

<code>func startUpdateDialCron() {
    dialService := service.ServiceGroupApp.DialApiServiceGroup.DialApiService
    for {
        select {
        case updateId := <-global.UpdateDialAPIChannel:
            if updateId == "" {
                continue
            }
            dial, err := dialService.GetDialApi(updateId)
            if err != nil {
                global.GVA_LOG.Error("failed to load probe task", zap.Error(err))
                continue
            }
            global.GVA_LOG.Info("updating timer task", zap.String("updateId", updateId))
            cronName := global.DIAL_API + updateId
            taskName := global.DIAL_API + updateId
            if _, found := global.GVA_Timer.FindTask(cronName, taskName); found {
                // Drop the old job so the updated definition replaces it.
                global.GVA_Timer.Clear(cronName)
            }
            // Register even when no job existed, so that a task which was
            // disabled earlier starts running once it is enabled again.
            // AddSingleDialApiTimerTask skips tasks that are still disabled.
            dial.Cron = utils.ConvertToCronExpression(dial.Cron)
            dialService.AddSingleDialApiTimerTask(dial)
        case deleteId := <-global.DeleteDialAPIChannel:
            if deleteId == "" {
                continue
            }
            cronName := global.DIAL_API + deleteId
            taskName := global.DIAL_API + deleteId
            if _, found := global.GVA_Timer.FindTask(cronName, taskName); found {
                global.GVA_LOG.Info("deleting timer task", zap.String("deleteId", deleteId))
                global.GVA_Timer.RemoveTaskByName(cronName, taskName)
            }
        }
    }
}
</code>

Frontend Display

A simple Vue page is built to manage probe tasks, allowing creation, editing, enabling/disabling, and viewing of success rates.

Frontend task list

The interface also shows detailed task information and provides controls to toggle task status.

Task detail view

Monitoring and Alerting

Metrics are exposed to Prometheus under the name health_check_results. Adding a Prometheus scrape job for this exporter collects the success and failure counts.

Prometheus job configuration

An alert rule can be defined to trigger when the success rate falls below a threshold (e.g., 100%).

Alert rule example

With these components, the system provides a complete API monitoring solution that stores configuration, schedules checks, records results in Prometheus, and visualizes success rates on the frontend.

