How to Extend Zabbix Monitoring Data Retention in Elasticsearch for a Year
Zabbix historical data stored in Elasticsearch was limited by disk space to one month of retention. This article outlines a strategy for efficiently keeping up to a year of monitoring data: expanding the cluster, adding SSDs, redesigning index mappings, splitting nodes into hot and cold tiers, and using Curator to automate shard allocation, shrinking, segment merging, and index lifecycle management.
Scenario Analysis
The company stores Zabbix historical data in Elasticsearch. Three ES nodes currently ingest about 5 GB per day and retain only one month of data; anything older than 30 days is deleted. Each node has 8 GB of RAM and mechanical disks, and each index uses 5 primary shards and 1 replica. Queries usually cover one week of data, occasionally up to two months.
Node Planning
To extend retention and accommodate growth, the cluster was expanded to four nodes, with some nodes upgraded to SSDs and given more memory.
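Before sizing hardware, a back-of-the-envelope capacity check is worth doing. A minimal sketch, using the article's figures (5 GB/day, 1 replica kept for roughly the first 30 days and dropped afterwards, one-year retention) and ignoring compression and force-merge savings:

```python
def required_storage_gb(daily_gb=5, replicated_days=30, total_days=365):
    """Cluster-wide storage needed to keep `total_days` of history.

    Indices younger than `replicated_days` carry 1 replica (2 copies of
    the data); older indices have replicas dropped, so they count once.
    """
    replicated = daily_gb * replicated_days * 2
    single_copy = daily_gb * (total_days - replicated_days)
    return replicated + single_copy

total = required_storage_gb()
print(total)       # 1975 GB across the cluster
print(total / 4)   # per-node share with 4 data nodes
```

At roughly 2 TB cluster-wide, each of the four nodes needs about 500 GB of usable disk, which informs the SSD sizing.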
Optimization Approach
Re-model the data mappings: string fields should not be tokenized (map them as keyword rather than analyzed text). Use a hot-cold node architecture: hot nodes hold the most recent data (the first seven days) in indices with 2 primary shards and 1 replica; older data is relocated to cold nodes, and after 30 days indices are set to 2 primary shards and 0 replicas, then shrunk via the _shrink API. Force-merge yesterday's index down to a single segment, set the refresh interval to 60 s, and close indices older than three months. All of these operations are scheduled with Elasticsearch Curator.
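These steps map onto standard Elasticsearch APIs. A sketch of the manual equivalents for a hypothetical daily index named uint-2020-01-01 (Curator automates the same calls on a schedule):

```
# Drop replicas and relax the refresh interval
PUT uint-2020-01-01/_settings
{ "index.number_of_replicas": 0, "index.refresh_interval": "60s" }

# Merge the index down to a single segment
POST uint-2020-01-01/_forcemerge?max_num_segments=1

# Shrink needs a write block (and a copy of every shard on one node),
# then produces a new index with fewer primary shards
PUT uint-2020-01-01/_settings
{ "index.blocks.write": true }
POST uint-2020-01-01/_shrink/uint-2020-01-01-shrunk

# Close old indices to free heap while keeping the data on disk
POST uint-2020-01-01/_close
```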
Zabbix and Elasticsearch Integration
Modify the Zabbix server configuration (/etc/zabbix/zabbix_server.conf) to point history storage at Elasticsearch (any cluster node will do), and update the Zabbix web configuration (/etc/zabbix/web/zabbix.conf.php) to match.
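A minimal sketch of the relevant settings, assuming an Elasticsearch node reachable at es-node1:9200 (the hostname is illustrative; the parameter names are Zabbix's documented Elasticsearch settings):

```
# /etc/zabbix/zabbix_server.conf
HistoryStorageURL=http://es-node1:9200
HistoryStorageTypes=uint,dbl,str,log,text
```

```php
// /etc/zabbix/web/zabbix.conf.php
global $HISTORY;
$HISTORY['url']   = 'http://es-node1:9200';
$HISTORY['types'] = ['uint', 'dbl', 'str', 'log', 'text'];
```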
Configure ES Nodes
Add tags for hot and cold nodes in elasticsearch.yml. The attribute name (box_type here) is only a convention; what matters is that the index templates and Curator use the same key. Hot node configuration:
<code>node.attr.box_type: hot
</code>Cold node configuration:
<code>node.attr.box_type: cold
</code>
Create Templates and Pipelines
Define index templates for each data type based on the mapping file, specifying index patterns, shard counts, refresh interval, node allocation, and mapping. Example shown for uint and str types.
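As a sketch, a template for the uint type might look like the following (ES 7 syntax; the template name is illustrative, and the field list is abbreviated and should be taken from the Zabbix mapping file):

```
PUT _template/uint_template
{
  "index_patterns": ["uint*"],
  "settings": {
    "index.number_of_shards": 2,
    "index.number_of_replicas": 1,
    "index.refresh_interval": "60s",
    "index.routing.allocation.require.box_type": "hot"
  },
  "mappings": {
    "properties": {
      "itemid": { "type": "long" },
      "clock":  { "type": "date", "format": "epoch_second" },
      "value":  { "type": "long" }
    }
  }
}
```

New indices matching the pattern start out pinned to hot nodes and are later re-routed to cold nodes by Curator.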
Ingest pipelines preprocess documents before indexing, rewriting the target index name so that data lands in daily indices.
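A sketch for the uint type, using Elasticsearch's date_index_name processor to derive a daily index name from the item's clock timestamp (the pipeline name and index prefix are illustrative):

```
PUT _ingest/pipeline/uint-pipeline
{
  "description": "route uint documents to a daily index",
  "processors": [
    {
      "date_index_name": {
        "field": "clock",
        "date_formats": ["UNIX"],
        "index_name_prefix": "uint-",
        "date_rounding": "d"
      }
    }
  ]
}
```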
Curator Operations
Install Curator and configure actions to:
Assign indices older than 7 days to cold nodes.
Force‑merge daily indices to a single segment.
Shrink indices older than 30 days (2 primary, 0 replica).
Close indices older than three months.
Delete indices older than one year.
Curator configuration files (action.yml) define these actions.
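As a sketch, the first and last of these steps might look like this in an action file (the index prefix and date pattern assume daily index names such as uint-2020-01-01; adjust to your naming):

```yaml
actions:
  1:
    action: allocation
    description: "Route indices older than 7 days to cold nodes"
    options:
      key: box_type
      value: cold
      allocation_type: require
      wait_for_completion: false
    filters:
      - filtertype: pattern
        kind: prefix
        value: uint-
      - filtertype: age
        source: name
        direction: older
        timestring: '%Y-%m-%d'
        unit: days
        unit_count: 7
  2:
    action: delete_indices
    description: "Drop indices older than one year"
    options:
      ignore_empty_list: true
    filters:
      - filtertype: pattern
        kind: prefix
        value: uint-
      - filtertype: age
        source: name
        direction: older
        timestring: '%Y-%m-%d'
        unit: days
        unit_count: 365
```

The forcemerge, shrink, and close actions follow the same shape with their own options blocks; the whole file is run with curator --config config.yml action.yml, typically from cron.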
Optimized Results
Testing shows that after applying Curator, a day‑old index is moved to a cold node with only one primary shard, and each shard contains a single segment, confirming successful shrink and segment reduction.
Feedback from experts is welcomed.
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.