Boost Elasticsearch Performance with Hot‑Cold Data Node Separation
This article explains how to configure Elasticsearch nodes for hot and cold data, assign special node attributes, adjust index templates, and use API calls to migrate data, demonstrating significant query speed improvements through real‑world performance tests.
ES Node Types and Their Roles
Elasticsearch clusters consist of three primary node roles:
Client node: Routes requests and handles search and index operations without storing data. It acts as a smart load balancer. Configuration example:
node.master: false
node.data: false
Data node: Stores index data and performs CRUD and aggregation operations. It requires more CPU, memory, and I/O resources. Configuration example:
node.master: false
node.data: true
Master node: Manages cluster-wide operations such as creating or deleting indices and allocating shards. Keeping master nodes separate from data nodes improves stability. Configuration example:
node.master: true
node.data: false
When deploying, assign servers based on these roles: ordinary servers for master nodes, optionally more memory for client nodes if they handle heavy aggregations, and high‑performance servers for data nodes.
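The three role configurations above differ only in two boolean flags, so they can be kept in one lookup table. The following is a minimal Python sketch of that idea; `ROLE_SETTINGS` and `render_role_config` are hypothetical names used only for illustration. Recent Elasticsearch releases express roles with a `node.roles` list instead of these booleans, so check the documentation for the version you run.

```python
# Legacy boolean role settings, copied from the three examples above.
ROLE_SETTINGS = {
    "client": {"node.master": False, "node.data": False},
    "data": {"node.master": False, "node.data": True},
    "master": {"node.master": True, "node.data": False},
}


def render_role_config(role):
    """Return the elasticsearch.yml lines for one of the three roles."""
    settings = ROLE_SETTINGS[role]  # raises KeyError for an unknown role
    return "\n".join(
        f"{key}: {str(value).lower()}" for key, value in settings.items()
    )


if __name__ == "__main__":
    for role in ROLE_SETTINGS:
        print(f"# {role} node")
        print(render_role_config(role))
```

Generating the fragment from a single table makes it harder for a node to end up with an unintended combination of flags when many servers are provisioned.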
Special Node Identities: Hot and Cold Data
Beyond basic roles, Elasticsearch allows assigning custom attributes to nodes to separate hot (frequently accessed) and cold (rarely accessed) data. This enables allocating resources where they matter most.
Configure node attributes in elasticsearch.yml:
# Hot data node
node.attr.box_type: hot
# Cold data node
node.attr.box_type: cold
Define an index template that forces hot‑data indices to allocate shards to hot nodes:
{
  "template": "hodo-stats-*",
  "settings": {
    "index": {
      "refresh_interval": "5s",
      "number_of_shards": "2",
      "routing.allocation.require.box_type": "hot"
    }
  }
}
Newly created indices matching the template will store documents on hot nodes. To move older data to cold nodes, update the index settings via the REST API:
PUT http://10.10.43.13:9200/hodo-history-2021-01/_settings
{
  "index.routing.allocation.require.box_type": "cold"
}
After the setting change, Elasticsearch automatically migrates the shards to the designated cold nodes without manual intervention.
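The settings update above can also be scripted. The following is a minimal Python sketch using only the standard library: it builds the same PUT request and checks, from the parsed output of `GET _cat/shards/<index>?format=json`, whether every shard has settled on a cold node. The helper names and the `target_nodes` set are hypothetical, the host is the one from the example above, and the sketch only builds the request rather than sending it.

```python
import json
import urllib.request

ES_URL = "http://10.10.43.13:9200"  # host taken from the example above


def build_move_request(index, box_type, base_url=ES_URL):
    """Build (but do not send) the PUT <index>/_settings request."""
    body = {"index.routing.allocation.require.box_type": box_type}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def migration_finished(shard_rows, target_nodes):
    """Return True when every shard is STARTED on one of target_nodes.

    shard_rows is assumed to be the parsed JSON list returned by
    GET _cat/shards/<index>?format=json (keys "state" and "node").
    target_nodes is a set of node names you know to be cold nodes.
    """
    return bool(shard_rows) and all(
        row.get("state") == "STARTED" and row.get("node") in target_nodes
        for row in shard_rows
    )


if __name__ == "__main__":
    req = build_move_request("hodo-history-2021-01", "cold")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To apply the change for real: urllib.request.urlopen(req)
```

Separating request construction from sending keeps the sketch safe to run against nothing at all, and the completion check can be polled until it returns true before the hot nodes are considered free of the old index.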
Performance Test Results
The following table compares query times for different environments and data volumes.
Environment           | Index Name           | Data Size             | Query Time
Cloud Environment     | fire-history-2019-09 | 540,666,628 (33.0 GB) | 6.139 seconds
Cloud Environment     | fire-history-2019-11 | 682,814,475 (42.0 GB) | 8.41 seconds
Local Hot Node (SSD)  | hodo-history-2021-01 | 324,884,953 (31.2 GB) | 0.143 seconds
Local Hot Node (SSD)  | hodo-history-2021-01 | 612,670,000 (58.2 GB) | 0.326 seconds
Local Cold Node (HDD) | hodo-history-2021-01 | 612,670,000 (58.2 GB) | 3.885 seconds
The results show that, for the same 58.2 GB of data, the hot SSD node answered in 0.326 seconds versus 3.885 seconds on the cold HDD node, roughly 12 times faster.
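The ratio behind that comparison can be reproduced from the table. The following is a small Python sketch of the arithmetic; the `speedup` helper is a hypothetical name, and the timings are copied from the table above. Only the hot versus cold pair uses identical data, so the cloud comparison is indicative at best because both the environment and the data size differ.

```python
def speedup(slow_seconds, fast_seconds):
    """How many times faster the second timing is than the first."""
    return slow_seconds / fast_seconds


# Same index and data volume (58.2 GB): cold HDD node vs hot SSD node.
cold_vs_hot = speedup(3.885, 0.326)

# Similar data volume, different environment: cloud (33.0 GB) vs hot SSD (31.2 GB).
cloud_vs_hot = speedup(6.139, 0.143)

if __name__ == "__main__":
    print(f"cold HDD vs hot SSD: {cold_vs_hot:.1f}x")
    print(f"cloud vs hot SSD:    {cloud_vs_hot:.1f}x")
```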
By strategically assigning hot and cold data to appropriate nodes, organizations can achieve substantial performance gains while controlling infrastructure costs.