How to Build a High‑Availability Nightingale Monitoring System from Scratch

This guide walks through designing a high‑availability architecture for the open‑source Nightingale monitoring platform, covering principles such as stateless services and data redundancy, step‑by‑step cluster setup, database initialization, configuration files, systemd service creation, and HA load balancing with HAProxy to ensure resilient monitoring for modern IT operations.

Linux Ops Smart Journey

In modern IT operations, a monitoring system is the "eyes" that ensure service stability. As business scale grows, a single‑node deployment can no longer meet high‑availability and high‑concurrency requirements. This article explains how to build a highly available Nightingale monitoring system, covering architecture design principles, cluster configuration, and load‑balancing strategies.

High Availability Architecture Design Principles

Prefer stateless services: design Nightingale’s Web/API layer as stateless for easy horizontal scaling and failover.

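Separate data persistence: keep state out of the n9e processes and store it in independent, highly available back-ends, such as a relational database (MySQL or, as in this guide, PostgreSQL) for configuration and alert data, and VictoriaMetrics or another Prometheus remote-write-compatible store for metrics.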

Multi-replica redundancy: deploy at least two n9e instances so that no single node is a single point of failure.

Automatic failover: use load balancers like HAProxy or Nginx to route requests to healthy nodes.
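The failover principle can be illustrated with a small shell sketch: probe each node in turn and use the first one that answers. This is only an illustration of the idea, not how HAProxy is implemented; the helper names `port_open` and `first_healthy` are hypothetical, and the probe assumes `bash` and the coreutils `timeout` command are available.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port opens within 2 seconds.
# Relies on bash's /dev/tcp pseudo-device and coreutils `timeout`.
port_open() {
  local host="$1" port="$2"
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Print the first node (host:port) that passes the probe; return 1 if none do.
# The probe command is passed as the first argument so it can be swapped out.
first_healthy() {
  local probe="$1" node
  shift
  for node in "$@"; do
    if "$probe" "${node%:*}" "${node##*:}"; then
      printf '%s\n' "$node"
      return 0
    fi
  done
  return 1
}

# Example (node addresses are the ones used later in this article):
#   first_healthy port_open 172.139.20.181:17000 172.139.20.183:17000
```

A load balancer does the same thing continuously and transparently, which is why clients should talk to the balancer address rather than to an individual node.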

Hands-On: Configuring the Nightingale Cluster

Prerequisites:

PostgreSQL master‑slave already deployed.

Redis master‑slave/cluster/sentinel already deployed.

1. Create Nightingale project directory

sudo mkdir -p /app/n9e
sudo chown -R ops: /app/n9e

2. Download Nightingale program

cd /app/n9e
curl -L -O https://github.com/ccfos/nightingale/releases/download/v8.4.0/n9e-v8.4.0-linux-amd64.tar.gz
tar xf n9e-v8.4.0-linux-amd64.tar.gz
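To avoid retyping the version in several places, the download URL can be built from a single variable. The sketch below assumes that other releases follow the same naming pattern as the v8.4.0 URL above, which is not guaranteed; `n9e_release_url` is a hypothetical helper name.

```shell
# Build the release tarball URL from a version tag and an optional architecture.
# Assumption: other releases follow the same naming scheme as v8.4.0.
n9e_release_url() {
  local version="$1" arch="${2:-amd64}"
  printf 'https://github.com/ccfos/nightingale/releases/download/%s/n9e-%s-linux-%s.tar.gz\n' \
    "$version" "$version" "$arch"
}

# Example: -f makes curl fail on an HTTP error instead of saving an error page.
#   curl -fL -O "$(n9e_release_url v8.4.0)"
```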

3. Create Nightingale database

# Create the database user (the first two prompts set the n9e password)
$ createuser -h 172.139.20.188 -p 9999 -U postgres -W n9e -P
Enter password for new role:
Enter it again:
Password: # postgres password
# Create the database and make n9e its owner
$ createdb -h 172.139.20.188 -p 9999 -U postgres -W -O n9e n9e_prod
Password: # postgres password
# Verify the connection as n9e
$ psql -h 172.139.20.188 -p 9999 -U n9e -W n9e_prod
Password: # n9e password
psql (14.10)
Type "help" for help.
n9e_prod=> \q
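If these commands need to run without interactive prompts (for example from a provisioning script), PostgreSQL clients can read credentials from a `~/.pgpass` file in the format `hostname:port:database:username:password`. The sketch below only formats such a line; `pgpass_line` is a hypothetical helper, and it does not escape colons or backslashes inside the values.

```shell
# Format one ~/.pgpass entry: hostname:port:database:username:password
# Note: values containing ':' or '\' would need backslash escaping, which is not handled here.
pgpass_line() {
  printf '%s:%s:%s:%s:%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Example (the file must be mode 0600 or stricter, otherwise it is ignored):
#   pgpass_line 172.139.20.188 9999 n9e_prod n9e 'your-n9e-password' >> ~/.pgpass
#   chmod 600 ~/.pgpass
#   psql -h 172.139.20.188 -p 9999 -U n9e -w n9e_prod -c 'select 1'
```

The `-w` flag tells `psql` never to prompt, so a missing or wrong entry fails immediately instead of hanging.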

4. Import initial tables

$ wget -P /app/n9e https://raw.githubusercontent.com/ccfos/nightingale/refs/tags/v8.4.0/docker/compose-postgres/initsql_for_postgres/a-n9e-for-Postgres.sql
$ psql -h 172.139.20.188 -p 9999 -U n9e -W n9e_prod -f /app/n9e/a-n9e-for-Postgres.sql

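Tip: Nightingale's automatic table initialization may fail on PostgreSQL with “ERROR: type "longtext" does not exist”, so import the PostgreSQL schema manually as in this step. If raw.githubusercontent.com is unreachable from your network, download the SQL file through a proxy or VPN and copy it to the server.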
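When a download passes through a proxy or fails partway, the saved file can be empty or an HTML error page. A quick sanity check before running the import might look like this; `looks_like_schema` is a hypothetical helper, and the check is deliberately crude (it only looks for a `CREATE TABLE` statement).

```shell
# Succeed only if the file exists, is non-empty, and contains a CREATE TABLE statement.
looks_like_schema() {
  [ -s "$1" ] && grep -qi 'create table' "$1"
}

# Example:
#   looks_like_schema /app/n9e/a-n9e-for-Postgres.sql || echo "download looks wrong" >&2
```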

5. Configure Nightingale (config.toml)

[Log]
Level = "INFO"

[DB]
DBType = "postgres"
DSN = "host=172.139.20.188 port=9999 user=n9e dbname=n9e_prod password=123456 sslmode=disable"

[Redis]
Address = "172.139.20.199:6379"
Password = "123456"
DB = 9
# Match RedisType to your Redis deployment: "standalone", "cluster", or "sentinel"
RedisType = "standalone"
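Hard-coding passwords in a file that is copied between hosts is easy to get wrong. One option is to render the file from environment variables at deploy time. The sketch below reproduces only the settings shown above; `render_n9e_config` and the `N9E_DB_PASSWORD` / `N9E_REDIS_PASSWORD` variable names are assumptions made for illustration, and passwords containing quotes or spaces would need extra escaping.

```shell
# Print a config.toml with the settings from this article, taking passwords from the environment.
# Aborts with an error message if either password variable is unset or empty.
render_n9e_config() {
  : "${N9E_DB_PASSWORD:?set N9E_DB_PASSWORD}"
  : "${N9E_REDIS_PASSWORD:?set N9E_REDIS_PASSWORD}"
  cat <<EOF
[Log]
Level = "INFO"

[DB]
DBType = "postgres"
DSN = "host=172.139.20.188 port=9999 user=n9e dbname=n9e_prod password=${N9E_DB_PASSWORD} sslmode=disable"

[Redis]
Address = "172.139.20.199:6379"
Password = "${N9E_REDIS_PASSWORD}"
DB = 9
RedisType = "standalone"
EOF
}

# Example: redirect the output to wherever your config.toml lives.
#   N9E_DB_PASSWORD=... N9E_REDIS_PASSWORD=... render_n9e_config > config.toml
```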

6. Create systemd service

cat <<'EOF' | sudo tee /usr/lib/systemd/system/n9e.service > /dev/null
[Unit]
Description=Nightingale Monitoring Service
After=network.target

[Service]
Type=simple
User=ops
ExecStart=/app/n9e/n9e
WorkingDirectory=/app/n9e
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

# Start service
sudo systemctl daemon-reload
sudo systemctl enable n9e.service --now
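Because the unit uses `Restart=always`, a misconfigured service can sit in a restart loop while `systemctl enable --now` still returns successfully. A small polling helper makes it easier to confirm the service actually stays up. This is a generic sketch; `retry` is a hypothetical helper, not part of Nightingale or systemd.

```shell
# Run a command until it succeeds: at most $1 attempts, sleeping $2 seconds between tries.
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while ! "$@"; do
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example: wait up to about 10 seconds for the unit to report active.
#   retry 10 1 systemctl is-active --quiet n9e.service && echo "n9e is running"
```

If the check never succeeds, `journalctl -u n9e.service` is the place to look for the startup error.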

Load Balancing

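Use HAProxy as the load balancer. The check keyword on each server line enables health checks, so HAProxy stops routing traffic to a node that goes down. The core configuration is: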

listen n9e
    bind *:17000
    mode tcp
    balance roundrobin
    server n9e01 172.139.20.181:17000 maxconn 32 check
    server n9e02 172.139.20.183:17000 maxconn 32 check
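As the cluster grows, the server lines can be generated from a node list rather than edited by hand. The sketch below reproduces the listen section shown above; `render_n9e_listen` is a hypothetical helper, and the `n9eNN` naming simply continues the pattern used in this article.

```shell
# Print an HAProxy listen section for the given port and node IPs.
# Usage: render_n9e_listen PORT IP [IP...]
render_n9e_listen() {
  local port="$1" i=1 node
  shift
  printf 'listen n9e\n    bind *:%s\n    mode tcp\n    balance roundrobin\n' "$port"
  for node in "$@"; do
    printf '    server n9e%02d %s:%s maxconn 32 check\n' "$i" "$node" "$port"
    i=$((i + 1))
  done
}

# Example:
#   render_n9e_listen 17000 172.139.20.181 172.139.20.183
```

After adding the section to the HAProxy configuration file, `haproxy -c -f <file>` checks the syntax without starting the proxy.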

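Tip: The HAProxy configuration expects n9e to be running on both 172.139.20.181 and 172.139.20.183. Copy /app/n9e and the n9e.service unit file to each remaining host, then run the daemon-reload and enable commands from step 6 there.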
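The copy step can be scripted. The sketch below assumes SSH access to each host as the `ops` user with passwordless `sudo`, and that `rsync` is installed on both ends; none of this is stated in the setup above, and `run` and `push_n9e` are hypothetical helpers. By default it only prints the commands so they can be reviewed first.

```shell
# Print each command instead of running it, unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then echo "$@"; else "$@"; fi
}

# Copy the program directory and unit file to each host, then enable the service there.
push_n9e() {
  local host
  for host in "$@"; do
    run rsync -a /app/n9e/ "ops@${host}:/app/n9e/"
    run scp /usr/lib/systemd/system/n9e.service "ops@${host}:/tmp/n9e.service"
    run ssh "ops@${host}" "sudo install -m 644 /tmp/n9e.service /usr/lib/systemd/system/n9e.service && sudo systemctl daemon-reload && sudo systemctl enable n9e.service --now"
  done
}

# Example: preview first, then execute for real.
#   push_n9e 172.139.20.183
#   DRY_RUN=0 push_n9e 172.139.20.183
```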

Conclusion

Building a high-availability Nightingale monitoring system is as much about operational philosophy as technology selection: stability comes from redundancy, and reliability comes from design. With stateless n9e nodes, state held in PostgreSQL and Redis, and HAProxy routing traffic only to healthy instances, the monitoring system becomes a solid backbone for business continuity.
