
How Apache HertzBeat Enables Agent‑Free Real‑Time Monitoring and Alerting

This guide introduces Apache HertzBeat, an open‑source real‑time monitoring and alerting platform that requires no agents, supports high‑performance clusters, offers customizable protocols, integrates with Grafana, provides plugin hot‑updates, and details its time‑wheel scheduling, cloud‑edge collaboration, and alert configuration.

Zhuanzhuan Tech

System Introduction

1.1 Overview

Apache HertzBeat (incubating) is an easy‑to‑use open‑source real‑time monitoring and alerting system that requires no agents, supports high‑performance clusters, is compatible with Prometheus, and provides powerful custom monitoring and status‑page building capabilities.

1.2 Features

Monitoring + Alerting + Notification in one solution, supporting applications, services, databases, caches, OS, big data, middleware, web servers, cloud‑native, network, and custom metrics.

User-friendly web UI with a low learning curve.

Configurable protocols (HTTP, JMX, SSH, SNMP, JDBC, Prometheus) via YML templates.

Prometheus‑compatible monitoring.

High‑performance collector clusters with horizontal scaling, multi‑network isolation, and cloud‑edge collaboration.

Flexible alert thresholds with multiple notification channels (email, Discord, Slack, Telegram, DingTalk, WeChat, Feishu, SMS, Webhook, ServerChan).

Powerful status‑page construction.

1.3 System Architecture Diagram

System Architecture

Practical Guide

2.1 Quick Start (English demo, supports Chinese/English)

1. Enable Actuator

Add dependencies to pom.xml:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
Configure application.yml to expose endpoints:
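management:
  endpoints:
    # enabled-by-default belongs directly under endpoints, not under web
    enabled-by-default: true
    web:
      exposure:
        include: '*'
  metrics:
    export:
      prometheus:
        # Spring Boot 2.x key; Spring Boot 3.x uses management.prometheus.metrics.export.enabled
        enabled: true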

If Spring Security is present, permit actuator endpoints in the security configuration.

Example security config snippet:
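import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.WebSecurityConfigurerAdapter;

// Spring Security 5.x style; WebSecurityConfigurerAdapter and antMatchers were removed in Spring Security 6.
@Configuration
public class SecurityConfig extends WebSecurityConfigurerAdapter {
    @Override
    protected void configure(HttpSecurity http) throws Exception {
        // antMatchers is only available after authorizeRequests()
        http.authorizeRequests()
            // Allow HertzBeat to scrape the monitoring endpoints without authentication
            .antMatchers("/actuator/**").permitAll()
            .antMatchers("/metrics/**").permitAll()
            .antMatchers("/trace").permitAll()
            .antMatchers("/heapdump").permitAll();
    }
}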

2. Add Monitoring

Navigate: System → Monitoring Center → Add Monitoring → AUTO → Prometheus task
Add Monitoring

3. Fill Parameters

Target Host: Spring Boot application address (without the protocol prefix).

Port: application service port, e.g., 8080.

Endpoint Path: /actuator/prometheus.

Tags (optional): for example, env=test.
Parameters

4. View Monitoring Data

Metrics

5. Configure Alerts

System → Alerts → Threshold Rules → Add → New Threshold
Alert Configuration
There are two types of threshold rules: real-time and scheduled. A scheduled rule, for example, has these fields: name, PromQL expression, execution interval, alert level, trigger count, and alert content.

6. Set Threshold Rule

Monitor SpringBoot CPU usage with rule: system_cpu_usage{job="Jolly_Vulture_43vT"} > 0.01
Threshold Rule

Triggered alerts appear in the Alert Center.
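
To make the comparison in the rule above concrete, here is a minimal sketch of what checking system_cpu_usage against 0.01 amounts to. It is not HertzBeat's evaluator: the class name CpuThresholdSketch and the hand-rolled parser are assumptions for illustration, and the parser only handles plain numeric samples in the Prometheus text exposition format returned by /actuator/prometheus.

```java
import java.util.OptionalDouble;

// Hypothetical illustration class, not part of HertzBeat.
class CpuThresholdSketch {

    /** Returns the first sample value for the given metric name, if present. */
    static OptionalDouble sample(String exposition, String metric) {
        for (String raw : exposition.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and HELP/TYPE comments
            }
            boolean labelled = line.startsWith(metric + "{");
            if (labelled || line.startsWith(metric + " ")) {
                // A sample line looks like: name[{labels}] value [timestamp]
                int valueStart = labelled ? line.lastIndexOf('}') + 1 : metric.length();
                String[] parts = line.substring(valueStart).trim().split("\\s+");
                return OptionalDouble.of(Double.parseDouble(parts[0]));
            }
        }
        return OptionalDouble.empty();
    }

    /** True when the metric exists and its value is strictly above the threshold. */
    static boolean breaches(String exposition, String metric, double threshold) {
        OptionalDouble value = sample(exposition, metric);
        return value.isPresent() && value.getAsDouble() > threshold;
    }

    public static void main(String[] args) {
        // Made-up scrape output used only for this example
        String scraped = String.join("\n",
                "# TYPE system_cpu_usage gauge",
                "system_cpu_usage 0.0421");
        System.out.println(breaches(scraped, "system_cpu_usage", 0.01)); // true
    }
}
```

A threshold as low as 0.01 (1% CPU) is useful mainly for verifying that the alert pipeline works; a production rule would normally use a much higher value.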

Alert Center

7. Set Notification

System → Message Notification → Notification Media → Add Receiver
Notification Media
System → Message Notification → Notification Strategy → Add Strategy → Choose Receiver and enable.
Notification Strategy

2.2 Grafana Visualization Integration (optional)

1. Grafana Chart Configuration

Enable Grafana embedding and anonymous access.

2. Embed Grafana Dashboard in HertzBeat

After enabling Grafana, restart HertzBeat and then upload the Grafana template in the AUTO monitor.
Grafana Integration

3. View Grafana Chart

In the AUTO monitor page, click the Grafana icon to view the chart.
Grafana Chart

2.3 Plugin Management

2.3.1 Overview

Plugins hook custom logic into HertzBeat's lifecycle, for example executing SQL or shell scripts after an alert fires, or forwarding collected data to another system.

Upload a packaged plugin via “Plugin Management → Upload Plugin” for hot‑update without restarting.

2.3.2 Supported Plugin Types

Post‑Alert Plugin : Executes custom logic after an alert (interface org.apache.hertzbeat.plugin.PostAlertPlugin).

Post‑Collect Plugin : Executes custom logic after data collection (interface org.apache.hertzbeat.plugin.PostCollectPlugin).

2.3.3 Demo

Locate the Plugin interface in the plugin module.

Create a class implementing PostAlertPlugin, e.g., DemoPlugin, and implement execute.

Add the fully qualified class name to META-INF/services/org.apache.hertzbeat.plugin.PostAlertPlugin.

Package the hertzbeat-plugin module.

Upload the -jar-with-lib.jar via Plugin Management to enable hot‑update.

2.3.4 Custom Plugin Parameters

Define parameters in a define-demo.yml file:

params:
  - field: host
    name:
      zh-CN: 目标 Host
      en-US: Target Host
    type: text
    required: true
  - field: port
    name:
      zh-CN: 端口
      en-US: Port
    type: number
    range: '[0,65535]'

Access parameters in the plugin implementation:

@Override
public void execute(Alert alert, PluginContext pluginContext) {
    log.info("param host:{}", pluginContext.getString("host"));
    log.info("param port:{}", pluginContext.getInteger("port"));
}

Principles

3.1 Task Collection Scheduling

3.1.1 Time‑Wheel Algorithm

HertzBeat uses a time‑wheel algorithm for monitoring task scheduling.

The time wheel is a circular array storing timer tasks; each slot holds a list of tasks. The algorithm advances one tick per second, processes cancelled tasks, transfers timeouts to buckets, and expires due tasks.

// The wheel has 512 buckets; mask = 512 - 1 turns the modulo into a bitwise AND
int mask = 511;
startTime = System.nanoTime();
do {
    long deadline = waitForNextTick();
    if (deadline > 0) {
        int idx = (int) (tick & mask);
        processCancelledTasks();
        HashedWheelBucket bucket = wheel[idx];
        transferTimeoutsToBuckets();
        bucket.expireTimeouts(deadline);
        tick++;
    }
} while (isRunning());

waitForNextTick calculates the next tick deadline, sleeps the required time, and handles Windows timer granularity.

private long waitForNextTick() {
    long deadline = tickDuration * (tick + 1);
    for (;;) {
        long currentTime = System.nanoTime() - startTime;
        long sleepTimeMs = (deadline - currentTime + 999999) / 1000000;
        if (sleepTimeMs <= 0) {
            if (currentTime == Long.MIN_VALUE) {
                return -Long.MAX_VALUE;
            } else {
                return currentTime;
            }
        }
        if (NetworkUtil.isWindowsPlatform()) {
            sleepTimeMs = sleepTimeMs / 10 * 10;
        }
        try {
            Thread.sleep(sleepTimeMs);
        } catch (InterruptedException ignored) {
            if (WORKER_STATE_UPDATER.get(HashedWheelTimer.this) == WORKER_STATE_SHUTDOWN) {
                return Long.MIN_VALUE;
            }
        }
    }
}

processCancelledTasks removes cancelled timeouts from buckets.

private void processCancelledTasks() {
    for (;;) {
        HashedWheelTimeout timeout = cancelledTimeouts.poll();
        if (timeout == null) break;
        try {
            timeout.remove();
        } catch (Throwable t) {
            if (logger.isWarnEnabled()) {
                logger.warn("An exception was thrown while process a cancellation task", t);
            }
        }
    }
}

transferTimeoutsToBuckets moves pending timeouts into appropriate buckets based on calculated tick and remaining rounds.

private void transferTimeoutsToBuckets() {
    for (int i = 0; i < 100000; i++) {
        HashedWheelTimeout timeout = timeouts.poll();
        if (timeout == null) break;
        if (timeout.state() == HashedWheelTimeout.ST_CANCELLED) continue;
        long calculated = timeout.deadline / tickDuration;
        timeout.remainingRounds = (calculated - tick) / wheel.length;
        final long ticks = Math.max(calculated, tick);
        int stopIndex = (int) (ticks & mask);
        HashedWheelBucket bucket = wheel[stopIndex];
        bucket.addTimeout(timeout);
    }
}

expireTimeouts executes tasks whose remaining rounds are zero and deadline reached.

void expireTimeouts(long deadline) {
    HashedWheelTimeout timeout = head;
    while (timeout != null) {
        HashedWheelTimeout next = timeout.next;
        if (timeout.remainingRounds <= 0) {
            remove(timeout);
            if (timeout.deadline <= deadline) {
                timeout.expire();
            }
        } else if (timeout.isCancelled()) {
            remove(timeout);
        } else {
            timeout.remainingRounds--;
        }
        timeout = next;
    }
}
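
The slot and round arithmetic used by transferTimeoutsToBuckets can be checked in isolation. The sketch below is a simplified stand-in rather than HertzBeat's timer: the class name WheelSlotSketch and the sample tick values are assumptions, while the wheel size of 512 and the formulas mirror the code above.

```java
// Hypothetical illustration class, not part of HertzBeat.
class WheelSlotSketch {
    static final int WHEEL_SIZE = 512;       // power of two, as in the worker loop
    static final int MASK = WHEEL_SIZE - 1;  // 511

    /** Bucket index a timeout lands in, given its deadline tick and the current tick. */
    static int slot(long deadlineTick, long currentTick) {
        // Overdue timeouts go into the current bucket instead of one in the past
        long ticks = Math.max(deadlineTick, currentTick);
        return (int) (ticks & MASK);
    }

    /** Number of full wheel revolutions to wait before the timeout may fire. */
    static long remainingRounds(long deadlineTick, long currentTick) {
        return (deadlineTick - currentTick) / WHEEL_SIZE;
    }

    public static void main(String[] args) {
        long currentTick = 10;
        long deadlineTick = 1300; // with a 1-second tick, due 1290 seconds from now

        System.out.println(slot(deadlineTick, currentTick));            // 276
        System.out.println(remainingRounds(deadlineTick, currentTick)); // 2
    }
}
```

In this example the task sits in bucket 276 and is skipped on the first two passes, with remainingRounds decremented each time, before it expires on the third.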

3.2 High‑Performance Cluster & Cloud‑Edge Collaboration

3.2.1 Overview

HertzBeat supports collector clusters that scale horizontally, so monitoring capacity grows as collectors are added.

Tasks are auto‑scheduled across collectors; a failed collector’s tasks are migrated without service interruption.

Single‑node and cluster modes can be switched without extra components.

3.2.2 Cloud‑Edge Architecture

Edge collector clusters deploy in isolated networks, gather metrics locally, and report to the central HertzBeat service.

The central service orchestrates and displays data from all edges.

3.2.3 Automatic Scheduling

Tasks are assigned using consistent hashing. Each virtual node in the hash ring holds a set of task IDs. When a new collector joins, tasks are rebalanced automatically.
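
A minimal sketch of consistent hashing with virtual nodes may help here. It is an assumption-laden illustration, not HertzBeat's ConsistentHash class: the name HashRingSketch, the CRC32 hash, the 16 virtual nodes per collector, and the "task-" key prefix are all hypothetical, and the owner is looked up on demand rather than storing task IDs on each virtual node.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical illustration class, not part of HertzBeat.
class HashRingSketch {
    private static final int VIRTUAL_NODES = 16; // assumed replica count per collector
    private final TreeMap<Long, String> ring = new TreeMap<>();

    private static long hash(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Places the collector on the ring at VIRTUAL_NODES positions. */
    void addNode(String collector) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(collector + "#" + i), collector);
        }
    }

    /** Removes every virtual node that belongs to the collector. */
    void removeNode(String collector) {
        ring.values().removeIf(collector::equals);
    }

    /** Walks clockwise from the task's hash to the next virtual node. */
    String ownerOf(long taskId) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no collectors online");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash("task-" + taskId));
        return (entry != null ? entry : ring.firstEntry()).getValue(); // wrap around
    }

    public static void main(String[] args) {
        HashRingSketch ring = new HashRingSketch();
        ring.addNode("collector-a");
        ring.addNode("collector-b");

        String[] before = new String[1000];
        for (int id = 0; id < before.length; id++) {
            before[id] = ring.ownerOf(id);
        }

        ring.addNode("collector-c"); // a new collector joins
        int moved = 0;
        for (int id = 0; id < before.length; id++) {
            if (!before[id].equals(ring.ownerOf(id))) {
                moved++;
            }
        }
        System.out.println("tasks moved to the new collector: " + moved + " of " + before.length);
    }
}
```

The property that matters is that a membership change only moves the tasks adjacent to the affected virtual nodes; every other task keeps its collector.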

Hash Ring

3.2.4 Fault‑Tolerant Migration

Collector liveness is tracked through Netty heartbeats sent every 5 seconds; when a collector's heartbeat is lost, the collector is removed from the hash ring and its tasks are redistributed.

public void collectorGoOffline(String identity) {
    // Update DB status
    consistentHash.removeNode(identity);
    reBalanceCollectorAssignJobs();
}
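
The detection side can be sketched in plain Java. This is not HertzBeat's Netty handler: the class name HeartbeatSweepSketch and the rule of treating three missed heartbeats as a failure are assumptions, and only the 5-second interval comes from the text. Each ID returned by sweep would then be passed to something like collectorGoOffline.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration class, not part of HertzBeat.
class HeartbeatSweepSketch {
    static final long HEARTBEAT_INTERVAL_MS = 5_000;          // interval from the text
    static final long TIMEOUT_MS = 3 * HEARTBEAT_INTERVAL_MS; // assumed: 3 missed beats

    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    /** Records the time of the latest heartbeat from a collector. */
    void onHeartbeat(String collectorId, long nowMs) {
        lastSeen.put(collectorId, nowMs);
    }

    /** Removes and returns collectors whose last heartbeat is older than TIMEOUT_MS. */
    List<String> sweep(long nowMs) {
        List<String> offline = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastSeen.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMs - entry.getValue() > TIMEOUT_MS) {
                offline.add(entry.getKey());
                it.remove();
            }
        }
        return offline;
    }

    public static void main(String[] args) {
        HeartbeatSweepSketch monitor = new HeartbeatSweepSketch();
        monitor.onHeartbeat("collector-a", 0);
        monitor.onHeartbeat("collector-b", 0);
        monitor.onHeartbeat("collector-a", 10_000); // a keeps beating, b goes silent

        System.out.println(monitor.sweep(16_000)); // [collector-b]
    }
}
```

Timestamps are passed in explicitly so the logic can be exercised without waiting on a real clock.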

3.3 Plugin Management

3.3.1 Implementation Principle

Plugins are loaded via SPI with a custom class loader, enabling hot‑update without restarting.

3.3.2 Core Logic

public void savePlugin(PluginUpload pluginUpload) {
    // 1. Save JAR to plugin‑lib directory
    // 2. Validate JAR content
    // 3. Persist metadata
    // 4. Reload class loader
    // 5. Sync plugin status
}

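Reloading closes and clears the previous class loaders, requests a garbage collection (System.gc() is only a hint to the JVM), and then creates a new class loader for each enabled plugin.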

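@PostConstruct
private void loadJarToClassLoader() {
    try {
        // Close the previous generation of loaders so their JAR file handles are released
        for (URLClassLoader cl : pluginClassLoaders) {
            if (cl != null) cl.close();
        }
        pluginClassLoaders.clear();
        // Hint that classes from the closed loaders can now be collected
        System.gc();
        // Build one class loader per enabled plugin: its bundled libs plus the plugin JAR itself
        List<PluginMetadata> plugins = metadataDao.findPluginMetadataByEnableStatusTrue();
        for (PluginMetadata meta : plugins) {
            List<URL> urls = loadLibInPlugin(meta.getJarFilePath(), meta.getId());
            urls.add(new File(meta.getJarFilePath()).toURI().toURL());
            pluginClassLoaders.add(new URLClassLoader(urls.toArray(new URL[0]), Plugin.class.getClassLoader()));
        }
    } catch (IOException e) {
        // close() throws IOException and toURL() throws its subtype MalformedURLException;
        // the log field is assumed to be the class's logger
        log.error("Failed to reload plugin class loaders", e);
    }
}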

Plugins are executed via ServiceLoader:

@Override
public <T> void pluginExecute(Class<T> clazz, Consumer<T> execute) {
    for (URLClassLoader cl : pluginClassLoaders) {
        ServiceLoader<T> load = ServiceLoader.load(clazz, cl);
        for (T t : load) {
            if (pluginIsEnable(t.getClass())) {
                execute.accept(t);
            }
        }
    }
}

Conclusion

Out‑of‑the‑box Docker deployment with minimal configuration.

Agent-free, web-only operation with a low learning curve.

Intuitive UI, core functions on top‑level menus.

End‑to‑end encryption for data security.

High performance, protocol‑template configuration, YML custom metrics, horizontal scaling, flexible alerts, and multi‑channel notifications.
