Master PostgreSQL High Availability with Pacemaker & Corosync: A Step‑by‑Step Guide

This tutorial walks through building a PostgreSQL high‑availability cluster using Pacemaker and Corosync, covering architecture overview, component installation, cluster status checks, data synchronization verification, failover handling, and common maintenance commands with concrete commands and screenshots.

ITPUB

May 24, 2024

Introduction

This article explains how to achieve high availability for PostgreSQL by combining Pacemaker and Corosync.

1. High‑Availability Architecture

Corosync provides cluster membership and heartbeat detection, while Pacemaker manages the resources and moves them between nodes. Together they automate the management of a PostgreSQL HA cluster: when a node fails, Corosync reports the loss and Pacemaker transfers the affected resources to a surviving node. pcs serves as the configuration tool for both components.

2. Architecture Details

2.1 Pacemaker

Pacemaker is one of the most widely used open-source Cluster Resource Managers on Linux. It relies on the underlying cluster infrastructure (Corosync; older releases also supported Heartbeat) to detect node and resource failures and to recover resources, keeping services as available as possible.

Official site: https://clusterlabs.org/pacemaker/
GitHub: https://github.com/ClusterLabs/pacemaker

2.2 Corosync

Corosync provides group communication, cluster membership, and heartbeat services. It detects when a node stops responding and reports the membership change to Pacemaker, which then moves the primary service to a standby node. Paired with Pacemaker, it forms the foundation of the HA cluster.

2.3 pcs

pcs is the configuration tool for Corosync and Pacemaker. It allows users to view, modify, and create Pacemaker‑based clusters, managing resources through a remote daemon (pcsd) and a Web UI.
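
All three components run as system services (corosync, pacemaker, and pcsd are the service names restarted in the recovery step of this guide). As a minimal sketch, assuming a systemd-based host, a helper such as the hypothetical `services_active` below can confirm that they are running before any pcs command is issued. The function name is an illustration and not part of pcs.

```shell
# Hypothetical helper (not part of pcs): succeed only if every named
# systemd unit is currently active.
services_active() {
    for svc in "$@"; do
        # "systemctl is-active --quiet" exits 0 only for an active unit
        if ! systemctl is-active --quiet "$svc"; then
            echo "inactive: $svc" >&2
            return 1
        fi
    done
    return 0
}
```

Example call on a cluster node: `services_active corosync pacemaker pcsd && echo "cluster stack is up"`.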

3. Cluster Status Checks

3.1 Start the Cluster

# pcs cluster start --all
# pcs cluster status
# pcs status corosync

3.2 Verify Cluster Status

# crm_mon -Afr -1
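
When the status check has to be scripted, the list of online nodes can be pulled out of the crm_mon text. The sketch below assumes the status output contains a line of the form `Online: [ node1 node2 ]` (newer Pacemaker releases prefix it with `  * `); the helper names and the node names in the usage line are hypothetical.

```shell
# Hypothetical helper: read crm_mon (or pcs status) text on stdin and
# print one online node name per line.
# Assumes a line shaped like "Online: [ node1 node2 ]".
online_nodes() {
    sed -n 's/^[ *]*Online: \[ \(.*\) \].*$/\1/p' | tr ' ' '\n'
}

# Succeed if the node named in $1 is in the online list read from stdin.
node_is_online() {
    online_nodes | grep -qxF "$1"
}
```

Example call, with `pgsql1` standing in for a real node name: `crm_mon -Afr -1 | node_is_online pgsql1 && echo "pgsql1 is online"`.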

-- Check that the VIP is bound on node 1

-- Check that the VIP is bound on node 2
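
One way to perform the VIP check from the command line is to look for the address in the output of `ip -o -4 addr show`. The following is a sketch under that assumption; the helper name `has_vip` is hypothetical, and the address passed to it must be replaced with the VIP configured in your cluster.

```shell
# Hypothetical helper: read "ip -o -4 addr show" output on stdin and
# succeed if the IPv4 address in $1 is assigned to some interface.
has_vip() {
    # Fixed-string match; the leading "inet " and trailing "/" keep
    # 192.168.3.1 from matching 192.168.3.13, for example.
    grep -qF "inet $1/"
}
```

Example call on each node, with a placeholder address: `ip -o -4 addr show | has_vip 192.168.3.100 && echo "VIP is on this node"`. Only the node currently holding the VIP resource should report a match.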

3.3 Primary‑Standby Replication Status

# su - postgres
psql
SELECT * FROM pg_stat_replication;
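
The full `pg_stat_replication` row is wide, so for a scripted check it can help to select only a few columns and test them. The sketch below assumes psql is run with `-At` (unaligned, tuples only), which produces pipe-separated rows; the helper name `all_streaming` is hypothetical.

```shell
# Hypothetical helper: read "client_addr|state|sync_state" rows on stdin
# and succeed only if there is at least one standby and every standby
# is in the "streaming" state.
all_streaming() {
    awk -F'|' '
        NF >= 2 { rows++; if ($2 != "streaming") bad++ }
        END     { if (rows > 0 && bad == 0) exit 0; exit 1 }
    '
}
```

Example call as the postgres user on the primary: `psql -Atc "SELECT client_addr, state, sync_state FROM pg_stat_replication" | all_streaming && echo "replication is streaming"`.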

4. Data Synchronization Verification

# On primary node
psql -h 192.168.3.13 -p 5432 -U postgres
CREATE TABLE test(id int);
INSERT INTO test VALUES (1);
# Verify on standby node (connect to the standby with psql in the same way)
SELECT * FROM test;
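
The comparison can also be scripted by counting the rows of the `test` table on both nodes. This is a minimal sketch: the function names are hypothetical, the host arguments are whatever addresses your primary and standby use, and passwordless access for the `postgres` user (for example through `~/.pgpass`) is assumed.

```shell
# Hypothetical helper: print the number of rows in the test table on host $1.
row_count() {
    psql -h "$1" -p 5432 -U postgres -Atc 'SELECT count(*) FROM test;'
}

# Succeed only when both hosts answer and report the same row count.
# Arguments: <primary host> <standby host>
same_row_count() {
    primary_rows=$(row_count "$1") || return 1
    standby_rows=$(row_count "$2") || return 1
    [ -n "$primary_rows" ] && [ "$primary_rows" = "$standby_rows" ]
}
```

With asynchronous replication the standby can lag briefly behind the primary, so a mismatch immediately after the insert does not necessarily indicate a fault; repeating the check after a moment is reasonable.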

5. Failover Procedure

5.1 Simulating a Primary Failure

# pg_ctl stop -D /pgdata
# pcs status
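
After the primary is stopped, the standby should be promoted by Pacemaker. One way to confirm this from a script is to poll `pg_is_in_recovery()` on the former standby until it returns `f`. The sketch below assumes the same connection settings as the psql command used earlier; the function name and the retry count are hypothetical choices.

```shell
# Hypothetical helper: poll host $1 once per second, up to $2 times
# (default 30), until PostgreSQL there reports it has left recovery.
wait_for_promotion() {
    wfp_host=$1
    wfp_tries=${2:-30}
    while [ "$wfp_tries" -gt 0 ]; do
        # pg_is_in_recovery() prints "f" on a primary and "t" on a standby
        wfp_state=$(psql -h "$wfp_host" -p 5432 -U postgres -Atc \
            'SELECT pg_is_in_recovery();' 2>/dev/null) || wfp_state=""
        if [ "$wfp_state" = "f" ]; then
            return 0
        fi
        wfp_tries=$((wfp_tries - 1))
        sleep 1
    done
    return 1
}
```

Example call, with the standby address as a placeholder: `wait_for_promotion <standby-address> 60 && echo "standby promoted"`.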

5.2 Recovering the Failed Node

# Remove the stale lock file left behind on the failed node
rm -f /var/lib/pgsql/tmp/PGSQL.lock
# Restart the cluster services (no need for pg_ctl start; Pacemaker starts PostgreSQL itself)
systemctl restart corosync pacemaker pcsd
# After node 1 restarts, it becomes a standby of node 3
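
Deleting the lock file while PostgreSQL is still running on the node would defeat its purpose, so a cautious script can check first. The following is a sketch only: the helper name is hypothetical, it relies on `pg_ctl status` exiting with 0 when a server is running in the data directory, and it should be run as the postgres user because pg_ctl refuses to run as root.

```shell
# Hypothetical helper: delete the lock file only when no PostgreSQL
# server is running in the data directory.
# Arguments: <data directory> <lock file path>
remove_stale_lock() {
    rsl_datadir=$1
    rsl_lockfile=$2
    # pg_ctl status exits 0 when a server is running in the data directory
    if pg_ctl status -D "$rsl_datadir" >/dev/null 2>&1; then
        echo "PostgreSQL is running in $rsl_datadir; leaving $rsl_lockfile in place" >&2
        return 1
    fi
    rm -f -- "$rsl_lockfile"
}
```

Example call with the paths used in this guide: `remove_stale_lock /pgdata /var/lib/pgsql/tmp/PGSQL.lock`.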

6. Common Cluster Operations

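pcs status                            # view cluster status
pcs resource show                     # view configured resources (newer pcs releases use pcs resource status)
pcs resource cleanup [resource_id]    # clear failed operation history and re-detect resource state
pcs resource list                     # list available resource agents
pcs resource restart <resource_id>    # restart a resource
pcs resource enable <resource_id>     # enable a resource
pcs resource disable <resource_id>    # disable a resource
pcs resource delete <resource_id>     # delete a resource
crm_mon -Afr -1                       # one-shot status with node attributes (where the synchronization status appears), fail counts, and inactive resources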

Conclusion

With the Pacemaker + Corosync high-availability architecture, PostgreSQL can fail over automatically when a node suffers a hardware or software failure, which keeps the database service stable and reliable.
