How to Build a High‑Availability Nightingale Monitoring System from Scratch
This guide walks through designing a high‑availability architecture for the open‑source Nightingale monitoring platform, covering principles such as stateless services and data redundancy, step‑by‑step cluster setup, database initialization, configuration files, systemd service creation, and HA load balancing with HAProxy to ensure resilient monitoring for modern IT operations.
In modern IT operations, a monitoring system is the "eyes" that ensure service stability. As business scale grows, a single‑node deployment can no longer meet high‑availability and high‑concurrency requirements. This article explains how to build a highly available Nightingale monitoring system, covering architecture design principles, cluster configuration, and load‑balancing strategies.
High Availability Architecture Design Principles
Prefer stateless services: design Nightingale’s Web/API layer as stateless for easy horizontal scaling and failover.
Separate data persistence: store metrics and alerts in independent, highly available back‑ends such as MySQL, VictoriaMetrics, or Prometheus Remote Write‑compatible storage.
Multi‑replica redundancy: deploy at least two n9e instances to avoid single‑point downtime.
Automatic failover: use load balancers like HAProxy or Nginx to route requests to healthy nodes.
Practical: Configure Nightingale Cluster
Prerequisites:
PostgreSQL master‑slave already deployed.
Redis master‑slave/cluster/sentinel already deployed.
1. Create Nightingale project directory
sudo mkdir -p /app/n9e
sudo chown -R ops: /app/n9e
2. Download Nightingale program
cd /app/n9e
curl -L -O https://github.com/ccfos/nightingale/releases/download/v8.4.0/n9e-v8.4.0-linux-amd64.tar.gz
tar xf n9e-v8.4.0-linux-amd64.tar.gz
3. Create Nightingale database
# Create database user
$ createuser -h 172.139.20.188 -p 9999 -U postgres -W n9e -P
Enter password for new role:   # set the n9e password
Enter it again:
Password:                      # postgres password
# Create the database with n9e as its owner
$ createdb -h 172.139.20.188 -p 9999 -U postgres -W -O n9e n9e_prod
Password: # postgres password
# Verify connection
$ psql -h 172.139.20.188 -p 9999 -U n9e -W n9e_prod
Password:
psql (14.10)
Type "help" for help.
n9e_prod=> \q
4. Import initial tables
$ wget https://raw.githubusercontent.com/ccfos/nightingale/refs/tags/v8.4.0/docker/compose-postgres/initsql_for_postgres/a-n9e-for-Postgres.sql
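$ psql -h 172.139.20.188 -p 9999 -U n9e -W n9e_prod -f /app/n9e/a-n9e-for-Postgres.sql
Tip: Automatic initialization may fail on PostgreSQL with “ERROR: type "longtext" does not exist” (longtext is a MySQL type), so the tables are imported manually here. If raw.githubusercontent.com is unreachable from your network, download the SQL file through a proxy or VPN.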
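Because the error quoted in the tip comes from a MySQL-only column type, a quick scan of the downloaded file before importing can hint at whether it really is the PostgreSQL variant. This is a minimal sketch: `has_mysql_types` is a hypothetical helper name, it only looks for the `longtext` type named in the error message, and a match is a hint rather than proof.

```shell
# Succeeds (exit 0) if the given SQL file mentions the MySQL-only type "longtext".
# has_mysql_types is a hypothetical helper name, not part of Nightingale.
has_mysql_types() {
  grep -qi 'longtext' "$1"
}

# Example use (commented out so the snippet has no side effects):
# if has_mysql_types /app/n9e/a-n9e-for-Postgres.sql; then
#   echo "schema file mentions longtext; is this the PostgreSQL version?" >&2
# else
#   psql -h 172.139.20.188 -p 9999 -U n9e -W n9e_prod \
#        -v ON_ERROR_STOP=1 -f /app/n9e/a-n9e-for-Postgres.sql
# fi
```

Setting the psql variable `ON_ERROR_STOP=1` makes the import stop at the first failing statement instead of continuing past it, which keeps a half-imported schema easier to spot.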
5. Configure Nightingale (config.toml)
[Log]
Level = "INFO"
[DB]
DBType = "postgres"
DSN = "host=172.139.20.188 port=9999 user=n9e dbname=n9e_prod password=123456 sslmode=disable"
[Redis]
Address = "172.139.20.199:6379"
Password = "123456"
DB = 9
RedisType = "standalone"
6. Create systemd service
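Before wiring the binary into systemd, it can help to confirm that the database named in the `[DB]` DSN is reachable from this host. The sketch below assumes the DSN uses the space-separated `key=value` form shown in config.toml. `dsn_field` is a hypothetical helper, and `pg_isready` is the standard PostgreSQL client utility.

```shell
# Prints the value of one key from a space-separated key=value DSN string.
# $1 = DSN string, $2 = key to look up (for example: host, port).
# dsn_field is a hypothetical helper name, not part of Nightingale.
dsn_field() {
  printf '%s\n' "$1" | tr ' ' '\n' | sed -n "s/^$2=//p" | head -n 1
}

# Example use (commented out because it needs network access to the database):
# DSN='host=172.139.20.188 port=9999 user=n9e dbname=n9e_prod sslmode=disable'
# pg_isready -h "$(dsn_field "$DSN" host)" -p "$(dsn_field "$DSN" port)"
```

`pg_isready` exits with status 0 only when the server is accepting connections, so the check can gate the next step in a deployment script.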
cat <<'EOF' | sudo tee /usr/lib/systemd/system/n9e.service > /dev/null
[Unit]
Description=Nightingale Monitoring Service
After=network.target
[Service]
Type=simple
User=ops
ExecStart=/app/n9e/n9e
WorkingDirectory=/app/n9e
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
# Start service
sudo systemctl daemon-reload
sudo systemctl enable n9e.service --now
Load Balancing
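Before a node is placed behind the load balancer, it is worth confirming that the service actually reached the active state after `systemctl enable --now`. One way to do this is a small retry loop. The sketch below is an illustration only: `retry` is a hypothetical helper, and the attempt count and delay are arbitrary.

```shell
# Runs a command until it succeeds or the attempts are used up.
# $1 = maximum attempts, $2 = seconds to sleep between attempts, rest = command.
# retry is a hypothetical helper name.
retry() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Example use (commented out because it needs systemd and the n9e unit):
# retry 10 2 systemctl is-active --quiet n9e.service \
#   || journalctl -u n9e.service -n 50 --no-pager
```

If the unit never becomes active, the last lines of its journal are usually the quickest way to see why, for example a wrong DSN or Redis address in config.toml.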
Use HAProxy as the load balancer; core configuration:
listen n9e
    bind *:17000
    mode tcp
    balance roundrobin
    server n9e01 172.139.20.181:17000 maxconn 32 check
    server n9e02 172.139.20.183:17000 maxconn 32 check
Tip: Copy /app/n9e and the service file to other hosts and start the service.
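In `mode tcp`, a plain `check` on a server line is a TCP connection test, so a node counts as healthy when its port accepts connections. A rough manual equivalent can be sketched in shell to verify each backend once it has been started. The sketch assumes `bash` and the coreutils `timeout` command are installed. `tcp_up` and `check_nodes` are hypothetical helper names, and the 2-second timeout is arbitrary.

```shell
# Succeeds if a TCP connection to host $1, port $2 opens within 2 seconds.
# Uses bash's /dev/tcp redirection; tcp_up is a hypothetical helper name.
tcp_up() {
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Prints one status line per host:port argument.
# Returns non-zero if any node is unreachable.
check_nodes() {
  local rc=0 node
  for node in "$@"; do
    if tcp_up "${node%:*}" "${node##*:}"; then
      echo "$node up"
    else
      echo "$node DOWN"
      rc=1
    fi
  done
  return "$rc"
}

# Example use (commented out because it needs the real hosts):
# check_nodes 172.139.20.181:17000 172.139.20.183:17000
```

Stopping n9e on one node and running the same check against the HAProxy address is a simple way to confirm that traffic still reaches the remaining node.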
Conclusion
Building a high‑availability Nightingale monitoring system is not just about technology selection but also about operational philosophy: stability comes from redundancy, and reliability comes from design. By keeping the n9e service stateless and separate from its data stores, deploying redundant instances, and routing traffic only to healthy nodes, the monitoring system becomes a solid backbone for business continuity.
Linux Ops Smart Journey
The operations journey never stops—pursuing excellence endlessly.
