Master Zabbix: From Installation to Advanced Custom Monitoring and Alerts
This comprehensive guide explains why server monitoring is essential, details the concept of high‑availability "nines", walks through Zabbix installation, web UI setup, custom item creation, trigger configuration, alert integration, distributed monitoring, SNMP support, and practical scripts for managing large‑scale server farms.
1. Monitoring Overview
1.1 Why Monitor
Monitoring alerts you when a server encounters problems, helps you locate the root cause, and ensures website/server availability.
1.1.1 Website Availability
High availability (HA) is often expressed as a number of "nines" (typically three to five). The number of nines is the percentage of uptime over a year, from which the maximum allowed downtime can be calculated.
1 nine (90%): 36.5 days of downtime per year
2 nines (99%): 3.65 days per year
3 nines (99.9%): 8.76 hours per year
4 nines (99.99%): 52.6 minutes per year
5 nines (99.999%): 5.26 minutes per year
6 nines (99.9999%): 31 seconds per year
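The figures above follow from multiplying the unavailable fraction by the length of a year. A minimal sketch of that arithmetic, assuming a 365-day year; the function name downtime_minutes is only illustrative:

```shell
# Allowed downtime per year, in minutes, for an availability percentage.
# Assumes a 365-day year (525600 minutes).
downtime_minutes() {
    awk -v pct="$1" 'BEGIN { printf "%.2f\n", (100 - pct) / 100 * 365 * 24 * 60 }'
}

downtime_minutes 99.99    # 52.56
downtime_minutes 99.999   # 5.26
```

Dividing the result by 60 or by 1440 gives the hour and day figures in the list.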
1.2 What to Monitor
Monitor anything that can be queried with a command – essentially everything you can think of.
1.2.1 Monitoring Scope
1.3 Remote Management Cards
If you need remote management, use cards such as Dell iDRAC, HP iLO, IBM IMM.
2. Installing Zabbix
2.1 Environment Check
2.2 Installation Methods
Compile installation (many services, complex environment)
Yum installation (clean environment)
Yum requires a repository mirror, e.g., http://www.cnblogs.com/clsn/p/7866643.html
2.3 Server‑Side Quick Install Script
2.4 Client‑Side Quick Deploy Script
2.5 Connectivity Test
Install the Zabbix‑get tool on the server:
<code>yum install zabbix-get</code>
3. Web Interface Operations
3.1 Zabbix Web Installation
Access the setup page with a browser:
http://10.0.0.61/zabbix/setup.php
During detection, view specific error messages to troubleshoot.
Select MySQL database and enter the password.
Host and port usually do not need changes; give a custom name.
Confirm the information and click Next.
Installation finishes; click Finish.
Log in with username Admin and password zabbix (note the capital A).
3.2 Adding Monitoring Items
3.2.1 Modify Zabbix Server Host
Configuration → Hosts
Host name must match the actual hostname; visible name is for UI display.
Enable the host after editing.
Host now appears in the monitoring list.
3.2.2 Add New Host
Configuration → Hosts → Create Host
Check the box to enable the host.
Add a template (e.g., Linux OS) and select the appropriate items.
Host appears with two monitoring entries.
3.2.3 View Monitoring Data
Navigate to Monitoring → Latest Data and filter as needed.
Search by IP or name to locate the host.
All monitoring items are listed below.
3.2.4 View Graphs
Monitoring → Graphs → select the host and the desired graph.
4. Custom Monitoring and Alerts
4.1 Custom Monitoring
Zabbix provides the Template OS Linux (Template App Zabbix Agent) which includes CPU, memory, disk, and network monitoring. New items can be added, for example, to alert when the number of logged‑in users exceeds three.
4.2 Implementing Custom Monitoring
4.2.1 Custom Syntax
4.2.2 Agent Registration
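A custom key such as login-user is registered on the agent with a UserParameter line of the form UserParameter=key,command. The sketch below writes such a line into an include directory. The directory path, the file name, and the command who | wc -l for counting logged-in users are assumptions for illustration, not taken from the original setup.

```shell
# Sketch: write a UserParameter definition into an agent include directory.
# $1 = include directory (assumed: /etc/zabbix/zabbix_agentd.d)
# $2 = key name, $3 = shell command whose output becomes the item value
write_userparameter() {
    printf 'UserParameter=%s,%s\n' "$2" "$3" > "$1/userparameter_$2.conf"
}

# Usage (as root on the agent; the command is an assumed example):
#   write_userparameter /etc/zabbix/zabbix_agentd.d login-user 'who | wc -l'
# Restart the agent afterwards, then check from the server with:
#   zabbix_get -s <agent-ip> -p 10050 -k login-user
```

Keeping each key in its own file makes it easy to remove a custom item later without editing the main agent configuration.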
4.2.3 Server‑Side Registration (Web UI)
Create a template: Configuration → Templates → Create Template.
Add the template.
Create an application set (acts like a folder) to categorize items: Configuration → Applications → Create Application.
Create a monitoring item: Configuration → Templates → Items → Create Item.
The key for the item is the previously defined login-user.
When creating the item, select the appropriate application set.
4.2.4 Create Trigger
A trigger fires an alarm when the item value meets a condition.
Define the expression by selecting the previously created item; the latest value is represented by T.
After adding, the trigger appears in the trigger list.
4.2.5 Create Graph
Graph → Create Graph, give a name and associate the monitoring items.
4.2.6 Link Template to Host
Configuration → Hosts → select a host; a host can be linked to multiple templates.
4.2.7 View Monitoring Graphs
4.3 Monitoring Alerts
4.3.1 Third‑Party Alert Platform
OneAlert (http://www.onealert.com) provides SMS, WeChat, QQ, and phone notifications with scheduling and escalation.
4.3.2 OneAlert Configuration
Add an application named zabbix in OneAlert.
WeChat alerts require following the public account.
4.3.3 Install OneAlert Agent
4.3.4 Remove OneAlert Agent
Delete the script from media types, delete the created user, delete the user group, and delete the action.
4.3.5 Trigger Response – Send Alert
Alerts appear in both WeChat and email.
Note: Emails are sent only when the status changes (OK→PROBLEM or PROBLEM→OK).
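The state-change behaviour in the note can be illustrated with a small simulation. This is only a sketch of the idea, not how Zabbix implements it: the function name notify_on_change and the sample values are made up, and the threshold of 3 echoes the logged-in-users example from section 4.1.

```shell
# Sketch: emit a notification only when the trigger state flips.
# $1 = threshold; stdin = one item value per line, in polling order.
notify_on_change() {
    awk -v limit="$1" '
        BEGIN { state = "OK" }
        {
            cur = ($1 > limit) ? "PROBLEM" : "OK"
            if (cur != state) {
                print state "->" cur " (value=" $1 ")"
                state = cur
            }
        }'
}

# Five consecutive polls; only the two transitions produce output.
printf '%s\n' 2 4 5 3 1 | notify_on_change 3
```

The values 4 and 5 both exceed the threshold, but only the first produces a message; the recovery message appears once the value drops back to 3.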
4.4 Monitoring Visualization
4.4.1 Aggregated Graphs
Latest Data → Graphs, then customize the name and add graphs to display.
4.4.2 Slideshows
Add a slideshow under Monitoring → Composite Graph → Slideshow.
4.5 Template Sharing
4.5.1 Host Sharing
Select all hosts, click Export, then import on another Zabbix server.
4.5.2 Template Sharing
Community repository: https://github.com/zhangyao8/zabbix-community-repos
5. Monitoring the Entire Server Fleet
5.1 Requirement
The company has 100 servers that need to be fully monitored with Zabbix.
5.2 Planning
Standard monitoring includes CPU, memory, disk, and network. To add many hosts quickly, consider cloning, auto‑registration, auto‑discovery, or using the Zabbix API (curl, Python).
Method 1: Clone existing host configuration.
Method 2: Auto‑registration and auto‑discovery.
Method 3: Call Zabbix API with curl or Python.
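For Method 3, the API is a JSON-RPC endpoint that accepts POST requests. The sketch below only builds the request bodies for user.login and host.create; it does not contact a server. The function names, the group ID, and the template ID are placeholders, and the parameter names follow the older API style that matches the Admin/zabbix login used in this guide. Newer releases rename some of these parameters, so check the API reference for your version.

```shell
# Sketch: build JSON-RPC request bodies for the Zabbix API.
# $1 = user, $2 = password
api_login_payload() {
    printf '{"jsonrpc":"2.0","method":"user.login","params":{"user":"%s","password":"%s"},"id":1,"auth":null}\n' "$1" "$2"
}

# $1 = auth token, $2 = host name, $3 = agent IP,
# $4 = group ID (placeholder), $5 = template ID (placeholder)
api_host_create_payload() {
    printf '{"jsonrpc":"2.0","method":"host.create","params":{"host":"%s","interfaces":[{"type":1,"main":1,"useip":1,"ip":"%s","dns":"","port":"10050"}],"groups":[{"groupid":"%s"}],"templates":[{"templateid":"%s"}]},"auth":"%s","id":2}\n' "$2" "$3" "$4" "$5" "$1"
}

# Usage against the server from section 3.1 (URL path assumed):
#   curl -s -H 'Content-Type: application/json-rpc' \
#        -d "$(api_login_payload Admin zabbix)" \
#        http://10.0.0.61/zabbix/api_jsonrpc.php
```

Looping over a list of IP addresses and calling api_host_create_payload for each one is one way to register a large batch of hosts in a single pass.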
5.3 Specific Implementation
5.3.1 Hardware, System, Network Monitoring
Monitor all virtual machines, switches, routers (by monitoring port traffic or SNMP).
5.3.2 Application Service Monitoring
Backup server – monitor rsync port 873 or custom scripts.
NFS server – monitor RPC port 111 or use showmount -e.
MySQL – monitor port 3306 or use Zabbix MySQL template.
Web servers – monitor port 80 or use Zabbix web checks.
URL monitoring – use Zabbix web scenario.
Reverse proxy, PPTP, NTP (port 123) – monitor respective ports.
Nginx – monitor the seven connection states via custom keys.
5.3.3 Generic Monitoring Methods
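Monitor ports with netstat, ss, or lsof, piping the output to wc -l.
Monitor processes with ps -ef | grep process | grep -v grep | wc -l (the extra grep -v keeps the grep command itself out of the count).
Simulate client usage: curl for HTTP, select/insert for MySQL, set/get for Memcached.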
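The port check can be made stricter than a plain grep, which would also match longer port numbers that contain the one you want. A minimal sketch, assuming the column layout of ss -lnt (state in column 1, local address and port in column 4); count_listeners is an illustrative name:

```shell
# Sketch: count listening sockets on an exact port.
# $1 = port; stdin = output of `ss -lnt`
count_listeners() {
    awk -v port="$1" '
        $1 == "LISTEN" { n += ($4 ~ (":" port "$")) }
        END { print n + 0 }'
}

# Usage on a live host:
#   ss -lnt | count_listeners 873
```

A result of 0 means nothing is listening on that port, which maps directly onto a trigger condition.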
5.4 Full‑Network Monitoring Deployment
Deploy client scripts for CentOS 6.
5.4.1 Use Auto‑Discovery Rules
Add an auto‑discovery rule and create a discovery action.
5.4.2 Monitor Backup Server
Create a template using the built‑in key net.tcp.listen[port].
5.4.3 Monitor NFS Server
Create an NFS template using proc.num[,,,nfs] to count NFS processes.
5.4.4 Monitor MySQL Server
Add MySQL credentials to the built‑in MySQL key, then monitor the port.
Test with zabbix_get:
<code># zabbix_get -s 172.16.1.51 -p 10050 -k "net.tcp.port[,3306]"
# Returns 1 if the TCP connection can be established, 0 otherwise.</code>
5.4.5 Monitor Web Servers (nginx)
Create a template for nginx and port 80.
<code>proc.num[<name>,<user>,<state>,<cmdline>]
net.tcp.port[<ip>,port]
</code>
Test:
<code># zabbix_get -s 172.16.1.8 -p 10050 -k "proc.num[,,,nginx]"
# zabbix_get -s 172.16.1.8 -p 10050 -k "net.tcp.port[,80]"
</code>
5.4.6 Monitor URL
Create a simple HTML page that returns "ok" and add a web scenario.
<code>echo ok >> /application/nginx/html/www/check.html</code>
Test with:
<code>for ip in 7 8 9; do curl 10.0.0.$ip/check.html; done</code>
5.4.7 Monitor Reverse Proxy (PPTP, etc.)
Create a custom user‑parameter:
<code>UserParameter=keep-ip,ip a | grep 10.0.0.3 | wc -l</code>
Test:
<code># zabbix_get -s 172.16.1.5 -p 10050 -k "keep-ip"
# zabbix_get -s 172.16.1.6 -p 10050 -k "keep-ip"
</code>
5.4.8 Monitor Nginx Connection States
Add stub_status to the nginx config:
<code>location /status {
stub_status on;
access_log off;
}
</code>
Define custom keys:
<code>UserParameter=nginx_active,curl -s 127.0.0.1/status | awk '/Active/ {print $NF}'
UserParameter=nginx_accepts,curl -s 127.0.0.1/status | awk 'NR==3 {print $1}'
UserParameter=nginx_handled,curl -s 127.0.0.1/status | awk 'NR==3 {print $2}'
UserParameter=nginx_requests,curl -s 127.0.0.1/status | awk 'NR==3 {print $3}'
UserParameter=nginx_reading,curl -s 127.0.0.1/status | awk 'NR==4 {print $2}'
UserParameter=nginx_writing,curl -s 127.0.0.1/status | awk 'NR==4 {print $4}'
UserParameter=nginx_waiting,curl -s 127.0.0.1/status | awk 'NR==4 {print $6}'
</code>
Test on servers:
<code># zabbix_get -s 172.16.1.7 -p 10050 -k "nginx_waiting"
# zabbix_get -s 172.16.1.8 -p 10050 -k "nginx_waiting"
# zabbix_get -s 172.16.1.9 -p 10050 -k "nginx_waiting"
</code>
Add the keys to a template, create graphs, and link the template to hosts.
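The awk field positions used by the seven keys can be checked offline against a saved copy of the status page. The sketch below parses the whole page in one pass; the function name parse_stub_status is illustrative, and it assumes the usual four-line stub_status layout.

```shell
# Sketch: print all seven stub_status values as "name value" pairs.
# stdin = body of the /status page
parse_stub_status() {
    awk '
        /^Active connections/ { print "active", $NF }
        NR == 3 { print "accepts", $1; print "handled", $2; print "requests", $3 }
        NR == 4 { print "reading", $2; print "writing", $4; print "waiting", $6 }'
}

# Usage on a web server:
#   curl -s 127.0.0.1/status | parse_stub_status
```

Parsing once also shows a possible refinement: a single request to /status yields all seven values, whereas seven separate keys fetch the page seven times per polling cycle.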
6. Automatic Discovery and Registration
6.1 Overview
Automatic discovery: Zabbix Server actively scans the network and registers clients (high load on the server).
Automatic registration: Zabbix agents push their information to the server (each agent must know the server address).
6.2 Passive Auto‑Discovery
Steps: install Zabbix Server, configure agents (Server=172.16.1.61), then set up discovery rules in the web UI (Configuration → Discovery → Local network).
Adjust IP range and delay as needed, then create a discovery action to add and enable discovered hosts.
7. Distributed Monitoring and SNMP
7.1 Distributed Monitoring
Use Zabbix proxies to offload load and monitor multiple data‑centers.
Architecture: Zabbix Server → Zabbix Proxy → Agents (per subnet).
Install Zabbix proxy on a separate host, set up a MySQL database for the proxy, configure Server=172.16.1.61 and Hostname=cache01 in /etc/zabbix/zabbix_proxy.conf, then start the service.
Update agents to point to the proxy (Server=172.16.1.21) and restart them.
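Both the proxy and the agents are configured by editing key=value lines, which is easy to script when many machines are involved. A minimal sketch, with set_conf as a hypothetical helper name; it assumes values contain no | or & characters, since those are special in the sed expression.

```shell
# Sketch: set key=value in a Zabbix-style config file.
# $1 = file, $2 = key, $3 = value
# Replaces an existing uncommented "key=" line, otherwise appends one.
set_conf() {
    if grep -q "^$2=" "$1"; then
        tmp=$(mktemp)
        sed "s|^$2=.*|$2=$3|" "$1" > "$tmp" && cat "$tmp" > "$1"
        rm -f "$tmp"
    else
        printf '%s=%s\n' "$2" "$3" >> "$1"
    fi
}

# Usage on the proxy host:
#   set_conf /etc/zabbix/zabbix_proxy.conf Server 172.16.1.61
#   set_conf /etc/zabbix/zabbix_proxy.conf Hostname cache01
# Usage on each agent (agent config path assumed):
#   set_conf /etc/zabbix/zabbix_agentd.conf Server 172.16.1.21
```

Writing through a temporary file and copying the contents back keeps the original file's ownership and permissions intact.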
7.2 SNMP Monitoring
SNMP is used for devices that cannot run a Zabbix agent.
Install SNMP utilities:
<code>yum -y install net-snmp net-snmp-utils</code>
Enable the system view in /etc/snmp/snmpd.conf and start the service.
<code>sed -i '57a view systemview included .1' /etc/snmp/snmpd.conf
systemctl start snmpd.service
</code>
Test with:
<code>snmpwalk -v 2c -c public 127.0.0.1 sysName
</code>
In the Zabbix UI, add a new host, select the SNMP interface, and apply an SNMP template.
8. Additional Notices
The tutorial was originally authored by "惨绿少年" and includes promotional material for upcoming events and submission invitations, which have been omitted from the technical content.