How to Build a High‑Availability RabbitMQ Cluster with Load Balancing
This guide explains the principles behind RabbitMQ clustering, shows how metadata synchronization works, compares design choices, and provides step‑by‑step instructions—including component installation, node configuration, HAProxy load‑balancing setup, and a sample architecture diagram—to create a reliable, scalable RabbitMQ cluster for production use.
RabbitMQ clustering principles
RabbitMQ is built on Erlang, whose runtime provides native distribution; cluster nodes authenticate to one another with a shared secret, the Erlang cookie. A RabbitMQ cluster synchronises four kinds of metadata (queues, exchanges, bindings and vhosts) across all nodes. Unless queue mirroring is configured, the actual message payload of a queue resides only on the node that owns the queue.
RabbitMQ deliberately does not replicate queue contents by default: copying every message to every node would multiply disk usage by the number of nodes and add network I/O to each publish, especially for persistent messages.
Message flow
When a client connects to the node that owns a queue (Scenario 1), publish/consume operations are performed locally. If the client connects to a different node (Scenario 2), the node uses the synchronised metadata to forward the request to the owner node, which stores the message.
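The two scenarios can be illustrated with a toy in-memory model. This is only a sketch of the idea described above, not RabbitMQ's implementation; the `Cluster` class and its method names are hypothetical.

```python
from collections import defaultdict


class Cluster:
    """Toy model: queue metadata lives on every node, message bodies only on the owner."""

    def __init__(self, node_names):
        # Each node keeps its own copy of the metadata: queue name -> owner node.
        self.metadata = {name: {} for name in node_names}
        # Message bodies, stored per node and per queue.
        self.messages = {name: defaultdict(list) for name in node_names}

    def declare_queue(self, owner, queue):
        # Declaring a queue replicates its metadata to every node.
        for node_meta in self.metadata.values():
            node_meta[queue] = owner

    def publish(self, connected_node, queue, body):
        # The connected node looks up the owner in its local metadata copy
        # and hands the message to that node (itself in Scenario 1).
        owner = self.metadata[connected_node][queue]
        self.messages[owner][queue].append(body)
        return owner  # the node that actually stored the message

    def stored_on(self, node, queue):
        # Message bodies physically held by `node` for `queue`.
        return list(self.messages[node][queue])


if __name__ == "__main__":
    cluster = Cluster(["node1", "node2", "node3"])
    cluster.declare_queue("node1", "orders")
    cluster.publish("node1", "orders", "a")  # Scenario 1: handled locally
    cluster.publish("node2", "orders", "b")  # Scenario 2: forwarded to node1
    print(cluster.stored_on("node1", "orders"), cluster.stored_on("node2", "orders"))
```

In this model every node can answer "who owns this queue?", but only the owner ever holds message bodies, which is the trade-off the section describes.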
Required software per VM
JDK 1.8 (needed only for Java client applications; the RabbitMQ broker itself runs on Erlang)
Erlang runtime (e.g., otp_src_19.3.tar.gz)
RabbitMQ server (e.g., rabbitmq-server-generic-unix-3.6.10.tar.gz)
Step‑by‑step 10‑node cluster setup
Edit the Erlang cookie ( /var/lib/rabbitmq/.erlang.cookie or $HOME/.erlang.cookie ) on every host so that all nodes share the same value.
Update /etc/hosts with the hostnames of all nodes, for example:
10.0.0.1 rmq-broker-test-1
10.0.0.2 rmq-broker-test-2
...
10.0.0.10 rmq-broker-test-10
Start the RabbitMQ service on each machine:
rabbitmq-server -detached
Verify the node status:
rabbitmqctl status
rabbitmqctl cluster_status
Join the remaining nodes to the cluster (using node 1 as the reference node). On each node, run:
rabbitmqctl stop_app
rabbitmqctl reset
rabbitmqctl join_cluster rabbit@rmq-broker-test-1 --ram # omit --ram to join as a disk node
rabbitmqctl start_app
Repeat on every remaining node. Any running cluster member can be used as the join target if node 1 is unavailable.
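When the same four commands have to be run on many hosts, it can help to generate them rather than type them. The following is a minimal sketch that only builds the command strings from this guide; the function name `join_commands` is hypothetical, and how the commands are executed (SSH, configuration management, by hand) is left open.

```python
def join_commands(seed_node, ram=False):
    """Return the rabbitmqctl commands, in order, that join the local node to seed_node."""
    join = f"rabbitmqctl join_cluster {seed_node}"
    if ram:
        join += " --ram"  # join as a RAM node; without the flag the node joins as a disk node
    return [
        "rabbitmqctl stop_app",   # stop the RabbitMQ application, keep the Erlang VM running
        "rabbitmqctl reset",      # wipe local state so the node can join a cluster
        join,
        "rabbitmqctl start_app",  # start the application again as a cluster member
    ]


if __name__ == "__main__":
    # Hostnames taken from the /etc/hosts example above.
    for host, ram in [("rmq-broker-test-2", True), ("rmq-broker-test-10", False)]:
        print(f"# on {host}")
        for command in join_commands("rabbit@rmq-broker-test-1", ram=ram):
            print(command)
```

Note that `rabbitmqctl reset` discards the node's existing data, so the generated sequence is meant for fresh nodes only.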
Configure node types. Disk nodes persist cluster metadata to disk; RAM nodes keep it in memory only, which speeds up metadata changes such as declaring queues, exchanges and bindings. A cluster needs at least one disk node, and at least two should be present for HA so that losing one does not leave the metadata without a durable copy.
Confirm the final composition:
rabbitmqctl cluster_status
The output should list three disk nodes and seven RAM nodes.
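Before touching any host, the intended split between disk and RAM nodes can be written down and sanity-checked. The sketch below assumes, as in the architecture described later in this guide, that the last hosts in the list become disk nodes; the function names and the `node1`…`node10` labels are hypothetical.

```python
from collections import Counter


def plan_node_types(hosts, disc_count):
    """Assign 'disc' to the last disc_count hosts and 'ram' to the rest."""
    if not 0 < disc_count <= len(hosts):
        raise ValueError("disc_count must be between 1 and the number of hosts")
    first_disc = len(hosts) - disc_count
    return {
        host: ("ram" if index < first_disc else "disc")
        for index, host in enumerate(hosts)
    }


def check_composition(node_types, min_disc=2):
    """Return (disc, ram) counts, raising if there are fewer than min_disc disk nodes."""
    counts = Counter(node_types.values())
    if counts["disc"] < min_disc:
        raise ValueError(f"need at least {min_disc} disk nodes, found {counts['disc']}")
    return counts["disc"], counts["ram"]


if __name__ == "__main__":
    hosts = [f"node{i}" for i in range(1, 11)]
    plan = plan_node_types(hosts, disc_count=3)
    print(check_composition(plan))
```

The counts returned by `check_composition` are what the `rabbitmqctl cluster_status` output should match once all nodes have joined.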
Changing node type
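When adding a node, pass the --ram flag to join_cluster to make it a RAM node. An existing node can be switched as well, but it must be stopped first:
rabbitmqctl stop_app
rabbitmqctl change_cluster_node_type disc # or ram
rabbitmqctl start_app
HAProxy TCP load‑balancing for RabbitMQ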
HAProxy can distribute client connections across the RAM nodes while performing health checks. A minimal configuration is shown below (replace ip1…ip7 with the actual node IPs):
global
    log 127.0.0.1 local0 info
    maxconn 4096
    daemon
defaults
    log global
    mode tcp
    option tcplog
    option dontlognull
    retries 3
    maxconn 2000
    timeout connect 5s
    timeout client 120s
    timeout server 120s
listen rabbitmq_cluster
    bind 0.0.0.0:5672
    balance roundrobin
    server rmq_node1 ip1:5672 check inter 5000 rise 2 fall 3 weight 1
    server rmq_node2 ip2:5672 check inter 5000 rise 2 fall 3 weight 1
    server rmq_node3 ip3:5672 check inter 5000 rise 2 fall 3 weight 1
    server rmq_node4 ip4:5672 check inter 5000 rise 2 fall 3 weight 1
    server rmq_node5 ip5:5672 check inter 5000 rise 2 fall 3 weight 1
    server rmq_node6 ip6:5672 check inter 5000 rise 2 fall 3 weight 1
    server rmq_node7 ip7:5672 check inter 5000 rise 2 fall 3 weight 1
listen monitor
    bind 0.0.0.0:8100
    mode http
    stats enable
    stats uri /stats
    stats refresh 5s
The server lines define each RabbitMQ backend, its address, health‑check interval, rise/fall thresholds and weight for round‑robin distribution.
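To make the `rise 2 fall 3` settings concrete, here is a simplified model of how consecutive check results flip a backend between up and down. It is an illustration of the thresholds only, not HAProxy's actual code; `HealthState` and `tcp_check` are hypothetical names, and the probe is assumed to be a plain TCP connect to port 5672.

```python
import socket


def tcp_check(host, port=5672, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class HealthState:
    """Tracks one backend: down after `fall` straight failures, up after `rise` straight successes."""

    def __init__(self, rise=2, fall=3, up=True):
        self.rise = rise
        self.fall = fall
        self.up = up
        self.streak = 0  # consecutive results that contradict the current state

    def record(self, check_ok):
        if check_ok == self.up:
            # The result agrees with the current state, so any streak is broken.
            self.streak = 0
        else:
            self.streak += 1
            threshold = self.fall if self.up else self.rise
            if self.streak >= threshold:
                self.up = not self.up
                self.streak = 0
        return self.up


if __name__ == "__main__":
    state = HealthState(rise=2, fall=3)
    for result in [False, False, False, True, True]:
        print(result, "->", "up" if state.record(result) else "down")
```

With `inter 5000`, each call to `record` corresponds to one probe every five seconds, so under this model a dead node is taken out of rotation after roughly three failed probes and returns after two successful ones.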
Resulting architecture
Seven RAM nodes (nodes 1‑7) serve client traffic via HAProxy.
Three disk nodes (nodes 8‑10) persist the cluster metadata.
HAProxy runs on a separate host and performs TCP load balancing and health monitoring.
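Because the seven `server` lines differ only in name and address, the `listen rabbitmq_cluster` block can be rendered from a list of RAM node addresses. This is a small sketch that reproduces the settings used in this guide; the function name `render_rabbitmq_listen` is hypothetical, and the addresses are the same `ip1`…`ip7` placeholders as in the configuration above.

```python
def render_rabbitmq_listen(addresses, port=5672, inter_ms=5000, rise=2, fall=3, weight=1):
    """Render the HAProxy 'listen rabbitmq_cluster' section for the given backend addresses."""
    lines = [
        "listen rabbitmq_cluster",
        f"    bind 0.0.0.0:{port}",
        "    balance roundrobin",
    ]
    for index, address in enumerate(addresses, start=1):
        lines.append(
            f"    server rmq_node{index} {address}:{port} "
            f"check inter {inter_ms} rise {rise} fall {fall} weight {weight}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholders, to be replaced with the real RAM node IPs.
    print(render_rabbitmq_listen([f"ip{i}" for i in range(1, 8)]))
```

Generating the block keeps the health-check parameters identical across backends when nodes are added or removed.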
Key take‑aways
RabbitMQ clusters synchronise only metadata to minimise storage overhead and network traffic.
Message payload stays on the owning node; other nodes act as routers using the metadata pointer.
At least two disk nodes are recommended for high availability; RAM nodes speed up metadata operations such as declaring queues, exchanges and bindings, not message storage.
HAProxy provides a simple, production‑ready TCP load‑balancer that can be configured with a few lines of text.
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"