Unified Multi‑Cluster Monitoring with KubeDoor 1.0: Alerts, Metrics & Best Practices

KubeDoor 1.0 introduces a new architecture for unified monitoring across multiple Kubernetes clusters. This article covers the master and agent components, the deployment options, Helm-based installation, storage and alerting configuration, integration with existing Prometheus/VictoriaMetrics setups, and automatic collection of peak-usage data.

Ops Development Stories

KubeDoor 1.0 Release Overview

KubeDoor 1.0 provides a brand‑new architecture that supports multiple Kubernetes clusters, unified monitoring, remote‑write aggregation, centralized alert rules, and best‑practice visualizations.

Component Overview

Master side (installed in the kubedoor namespace)

kubedoor-master: connects to agents and exposes the API.

kubedoor-web: front-end UI integrated with Nginx.

kubedoor-dash: Grafana dashboards.

kubedoor-alarm: receives alerts from Alertmanager, then handles notifications and storage.

kubedoor-collect: scheduled jobs that pull peak-period resource data via the API.

Infrastructure (deployed together with the master by default)

alertmanager: alert routing service.

vmalert: evaluates alert rules and forwards triggered alerts to Alertmanager.

VictoriaMetrics: time-series database.

ClickHouse: column-oriented database.

Agent side (installed in the kubedoor namespace)

kubedoor-agent: connects to the master and calls the Kubernetes API.

vmagent: replaces Prometheus for metric collection.

KubeStateMetrics: gathers Kubernetes metrics.

NodeExporter: collects host-level metrics.

Deployment Options

Fresh deployment (no existing monitoring system): ideal when the target cluster lacks any monitoring stack.

Independent ClickHouse & VictoriaMetrics deployment: use Docker Compose to run the databases on the host.

Integration with existing multi-K8s monitoring: configure Helm values to point to your existing Prometheus/vmagent and remote-write endpoints.

Master-only deployment: for when a complete monitoring system already exists and you only need KubeDoor's UI and alert aggregation.

Key Configuration Variables (master, values-master.yaml)

storageClass: specify the storage class for ClickHouse and VictoriaMetrics persistent volumes.

CK_PASSWORD: password for the ClickHouse default user (the default value can be kept).

external_labels_key: label key added to every metric for cluster identification; must match the agent side.

MSG_TYPE/MSG_TOKEN: IM notification type and bot token for alert notifications.

vm_single: credentials and retention settings for VictoriaMetrics single-node mode.

nginx_auth: basic-auth credentials for the web UI and agent-master communication.

Installation Commands

<code># Download Helm chart
wget https://StarsL.cn/kubedoor/kubedoor-1.1.0.tgz
tar -zxvf kubedoor-1.1.0.tgz
cd kubedoor</code>
<code># Master dry‑run
helm install kubedoor . --namespace kubedoor --create-namespace \
  --values values-master.yaml --dry-run --debug
# Master install
helm install kubedoor . --namespace kubedoor --create-namespace \
  --values values-master.yaml</code>
<code># Agent dry‑run
helm install kubedoor-agent . --namespace kubedoor --create-namespace \
  --values values-agent.yaml --dry-run --debug
# Agent install
helm install kubedoor-agent . --namespace kubedoor --create-namespace \
  --values values-agent.yaml</code>

Accessing the Web UI

Open http://<NodeIP>:<NodePort> in a browser. The default username and password are both kubedoor.

Alert Flow

Metrics are stored in VictoriaMetrics. vmalert reads the data, evaluates rules, and sends triggered alerts to alertmanager. Alertmanager routes them to kubedoor-alarm, which records the alerts and sends notifications. KubeDoor installs its own Alertmanager instance in the kubedoor namespace, avoiding conflicts with any existing Alertmanager.
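
To make the last hop of this flow concrete, here is a minimal sketch of what a webhook receiver might do with the JSON body that Alertmanager posts to its webhook receivers. It is not KubeDoor's actual kubedoor-alarm code. The function name `flatten_alerts`, the record fields, and the `cluster` label key (standing in for whatever `external_labels_key` is set to) are all assumptions made for illustration.

```python
def flatten_alerts(payload: dict, cluster_label: str = "cluster") -> list:
    """Turn one Alertmanager webhook payload into flat per-alert records.

    `cluster_label` is a hypothetical stand-in for the label key chosen
    via external_labels_key; adjust it to the key used in your setup.
    """
    records = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        records.append({
            "cluster": labels.get(cluster_label, "unknown"),
            "alertname": labels.get("alertname", ""),
            "severity": labels.get("severity", ""),
            "status": alert.get("status", ""),
            "starts_at": alert.get("startsAt", ""),
            "summary": annotations.get("summary", ""),
        })
    return records


# Hypothetical payload, trimmed to the fields used above.
sample = {
    "version": "4",
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {
                "alertname": "PodCpuHigh",
                "severity": "warning",
                "cluster": "prod-a",
            },
            "annotations": {"summary": "CPU usage above limit"},
            "startsAt": "2025-01-01T01:00:00Z",
        }
    ],
}

for record in flatten_alerts(sample):
    print(record["cluster"], record["alertname"], record["status"])
```

Flattening each alert into one record keeps storage and notification independent of each other: the same record can be written to a table and rendered into a message.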

Agent Configuration Details (agent, values-agent.yaml)

ws: WebSocket endpoint for master-agent communication, e.g., ws://kubedoor-master.kubedoor. Use a reachable external address for cross-cluster setups.

MSG_TYPE/MSG_TOKEN: bot type and token for sending operation notifications.

OSS_URL: OSS endpoint where Java services upload dump/JFR/JStack data.

external_labels_key & external_labels_value: must match the labels configured in your Prometheus/vmagent for multi-cluster identification.

remoteWriteUrl: full URL for vmagent remote-write to VictoriaMetrics, e.g., http://monit:[email protected]:8428/api/v1/write.

monit.enable and related enable flags: set to true only if you need vmagent; otherwise keep them false to avoid duplication with existing Prometheus instances.
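
Because remoteWriteUrl embeds basic-auth credentials, a password containing characters such as "@" or ":" can silently produce a malformed URL. The sketch below assembles such a URL with percent-encoded credentials. The helper name `build_remote_write_url`, the host `vm.example.internal`, and the credentials are hypothetical. Only the port 8428 and the `/api/v1/write` path come from the example above. Whether your remote-write client decodes percent-encoded credentials as expected is worth verifying before relying on this.

```python
from urllib.parse import quote


def build_remote_write_url(user: str, password: str, host: str, port: int = 8428) -> str:
    """Assemble a remote-write URL with basic-auth credentials.

    Credentials are percent-encoded so that reserved characters
    ("@", ":", "/") cannot break the userinfo part of the URL.
    """
    userinfo = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"http://{userinfo}@{host}:{port}/api/v1/write"


# Hypothetical values, for illustration only.
print(build_remote_write_url("monit", "s3cret@vm", "vm.example.internal"))
```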

Data Collection

After enabling automatic collection, KubeDoor gathers the previous day’s peak usage at 01:00 daily and writes the top‑consumption day of the last ten days into the control table. Manual collection can be triggered from the UI; repeated runs do not duplicate data.

If the system is newly installed and the current day’s peak period has already passed, data for that day cannot be collected until the next day’s peak window.
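
The selection logic described above can be sketched as follows. This is an illustration under stated assumptions, not KubeDoor's implementation: each day's peak is reduced to a single number, the ten-day window is taken to mean the ten days before today, and the names `record_daily_peak` and `pick_peak_day` are invented for the example. Keying the store by day is one simple way to get the "repeated runs do not duplicate data" behavior.

```python
from datetime import date, timedelta


def record_daily_peak(store: dict, day: date, value: float) -> None:
    """Save one day's peak; re-running for the same day overwrites it."""
    store[day] = value


def pick_peak_day(daily_peaks: dict, today: date, window_days: int = 10):
    """Return (day, value) for the highest peak in the window before today.

    Returns None when no data exists inside the window. On ties, the
    most recent day wins because the window is scanned newest first.
    """
    window = [today - timedelta(days=n) for n in range(1, window_days + 1)]
    candidates = [(d, daily_peaks[d]) for d in window if d in daily_peaks]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[1])


# Hypothetical usage: values are CPU cores at peak.
store = {}
record_daily_peak(store, date(2025, 3, 14), 2.5)
record_daily_peak(store, date(2025, 3, 14), 2.5)  # repeated run, no duplicate
record_daily_peak(store, date(2025, 3, 10), 4.0)
record_daily_peak(store, date(2025, 3, 1), 9.0)   # older than ten days
print(pick_peak_day(store, today=date(2025, 3, 15)))
```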

Prometheus/vmagent Job Requirements

Ensure the following metrics are exposed for proper analysis:

container_cpu_usage_seconds_total

container_memory_working_set_bytes

container_spec_cpu_quota

kube_pod_container_info

kube_pod_container_resource_limits

kube_pod_container_resource_requests
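
One way to check this list against a running setup is to compare it with the metric names the time-series database already knows. The sketch below assumes a Prometheus-compatible HTTP API that serves `/api/v1/label/__name__/values`. The function names and the base URL in the usage note are hypothetical, and any authentication is left out.

```python
import json
from urllib.request import urlopen

# Metric names listed in this section.
REQUIRED_METRICS = (
    "container_cpu_usage_seconds_total",
    "container_memory_working_set_bytes",
    "container_spec_cpu_quota",
    "kube_pod_container_info",
    "kube_pod_container_resource_limits",
    "kube_pod_container_resource_requests",
)


def missing_metrics(available) -> list:
    """Return the required metric names absent from `available`."""
    present = set(available)
    return [name for name in REQUIRED_METRICS if name not in present]


def fetch_metric_names(base_url: str, timeout: float = 10.0) -> list:
    """List all metric names via the Prometheus-compatible label API."""
    url = f"{base_url.rstrip('/')}/api/v1/label/__name__/values"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)["data"]
```

With a reachable endpoint, `missing_metrics(fetch_metric_names("http://vm.example.internal:8428"))` would return an empty list when every required metric is being collected.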

Repository

GitHub: https://github.com/CassInfra/KubeDoor/tree/main

Written by

Ops Development Stories

Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.