Mastering PostgreSQL Backup & Replication: A Complete Enterprise Guide

This in-depth enterprise guide explains why backup and replication are critical for PostgreSQL, compares physical backup, logical backup, and logical replication, and provides step-by-step command examples. It also covers high-availability architectures, automation scripts, disaster-recovery procedures, monitoring queries, and common pitfalls.

Ray's Galactic Tech

Why Backup and Replication Are Essential for PostgreSQL

Database failures such as accidental table deletions, disk crashes, network partitions, or failed upgrades can cause data loss or prolonged outages. Regular backups, recovery drills, and replication are required to achieve reliability and high availability.

Three Core PostgreSQL Data‑Protection Technologies

Physical Backup

Physical backup copies the entire PGDATA directory and relies on continuous WAL archiving for point‑in‑time recovery (PITR).

Copy the binary files of PGDATA directly.

Enable continuous WAL archiving.

Typical command: pg_basebackup with streaming WAL.

# Full physical backup compressed as tar.gz
pg_basebackup \
  -h 10.10.10.1 -p 5432 \
  -U backup_user \
  -D /data/backup/full_$(date +%F) \
  -Ft -z -P --wal-method=stream

Key postgresql.conf settings for WAL archiving:

archive_mode = on
archive_command = 'test ! -f /data/wal_archive/%f && cp %p /data/wal_archive/%f'
wal_level = replica
max_wal_senders = 10
wal_keep_size = 4GB
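
An archive_command should never overwrite a segment that is already archived, and it should return a non-zero status on any failure so that PostgreSQL retries. The following is a minimal sketch of a wrapper that does both. The function name archive_wal and the idea of saving it as a standalone script are assumptions for illustration, not part of the original setup.

```shell
#!/usr/bin/env bash
# archive_wal SRC DEST_DIR
# Hypothetical helper for archive_command: copies one WAL segment into
# DEST_DIR and refuses to overwrite a segment that is already archived.
archive_wal() {
  local src="$1" dest_dir="$2"
  local name
  name="$(basename "$src")"

  # A non-zero exit tells PostgreSQL the attempt failed; it will retry later.
  if [ -e "${dest_dir}/${name}" ]; then
    echo "archive_wal: ${name} already archived, refusing to overwrite" >&2
    return 1
  fi

  # Copy under a temporary name, then rename, so a partial copy is never
  # visible under the final segment name.
  cp "$src" "${dest_dir}/${name}.tmp" &&
    mv "${dest_dir}/${name}.tmp" "${dest_dir}/${name}"
}

# When run as a script with two arguments, archive that one segment.
if [ "$#" -eq 2 ]; then
  archive_wal "$1" "$2"
fi
```

If this were saved as an executable script, archive_command could call it with %p and the archive directory as its two arguments.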

Logical Backup

Logical backup extracts data at the database, schema, table, or DDL level, allowing fine‑grained restores.

Export a single database:

pg_dump -d mydb -Fc -f mydb_$(date +%F).dump

Parallel dump for large databases:

pg_dump -d mydb -Fd -j 8 -f backup_dir_$(date +%F)

Export all databases and roles:

pg_dumpall -U postgres > all_db_$(date +%F).sql
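
To illustrate the fine-grained restore that a custom-format or directory-format dump allows, here is a minimal sketch that restores one table from a dump. The function name restore_table is a hypothetical helper; the pg_restore options it uses (--list, -d, -t, --exit-on-error) are standard.

```shell
# restore_table DUMP DATABASE TABLE
# Hypothetical helper: restores a single table from a custom-format (-Fc)
# or directory-format (-Fd) dump into an existing database.
restore_table() {
  local dump="$1" db="$2" table="$3"

  # Listing the archive's table of contents fails fast if the dump is unreadable.
  pg_restore --list "$dump" > /dev/null || {
    echo "restore_table: cannot read ${dump}" >&2
    return 1
  }

  # -t restores only the named table; objects it depends on are not restored.
  # --exit-on-error stops at the first failure instead of continuing.
  pg_restore -d "$db" -t "$table" --exit-on-error "$dump"
}
```

A call such as restore_table mydb_2024-12-01.dump mydb orders would bring back only the orders table, assuming it no longer exists in the target database.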

Logical Replication

Logical replication streams changes from a publisher to one or more subscribers.

Publisher → WAL logical decoding → Logical change → Subscriber

Typical setup:

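-- Prerequisite: the publisher must run with wal_level = logical.

-- 1. On the publisher: create a replication role that can read the tables
CREATE ROLE repl_user WITH REPLICATION LOGIN PASSWORD 'repl123';
GRANT SELECT ON sales, orders TO repl_user;

-- 2. On the publisher: create a publication for the selected tables
CREATE PUBLICATION pub_sales FOR TABLE sales, orders;

-- 3. On the subscriber: create matching empty tables first, then subscribe
CREATE SUBSCRIPTION sub_sales
  CONNECTION 'host=10.10.10.1 port=5432 dbname=mydb user=repl_user password=repl123'
  PUBLICATION pub_sales;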
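
Logical decoding only works when the publisher runs with wal_level = logical, which is a stricter setting than replica. A minimal pre-flight check might look like the sketch below. The function name publisher_ready is hypothetical, and the connection is assumed to come from the usual PGHOST, PGUSER, and PGDATABASE environment variables.

```shell
# publisher_ready
# Hypothetical pre-flight check: succeeds only if the server that psql
# connects to reports wal_level = logical.
publisher_ready() {
  local level
  # -A: unaligned output, -t: tuples only, so the result is just the value.
  level="$(psql -At -c 'SHOW wal_level')" || return 2
  if [ "$level" = "logical" ]; then
    echo "ok: wal_level is logical"
  else
    echo "wal_level is '${level}'; set wal_level = logical and restart" >&2
    return 1
  fi
}
```

Running this before CREATE PUBLICATION avoids discovering the problem only when the subscriber tries to create its replication slot.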

Choosing the Appropriate Technique

Full‑cluster disaster recovery → Physical backup + WAL archiving.

Single‑table accidental deletion → Logical backup.

Multi‑region traffic distribution → Logical replication.

Minimize data loss (RPO≈0) → Synchronous streaming replication, backed by physical backup with continuous WAL archiving.

Prevent accidental changes from propagating to all replicas → Delayed physical replication (recovery_min_apply_delay).
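
A delayed physical replica is configured on the standby with the recovery_min_apply_delay parameter. As a minimal sketch, the helper below appends that setting to the standby's postgresql.conf. The function name set_apply_delay is hypothetical, and the sketch assumes the setting is not overridden elsewhere, for example in postgresql.auto.conf.

```shell
# set_apply_delay DATA_DIR DELAY
# Hypothetical helper: appends recovery_min_apply_delay to the standby's
# postgresql.conf so WAL is replayed only after DELAY has elapsed.
set_apply_delay() {
  local data_dir="$1" delay="$2"
  local conf="${data_dir}/postgresql.conf"

  [ -f "$conf" ] || {
    echo "set_apply_delay: ${conf} not found" >&2
    return 1
  }

  # When a parameter appears more than once in the file, the last entry
  # wins, so appending overrides any earlier value.
  printf "recovery_min_apply_delay = '%s'\n" "$delay" >> "$conf"
}
```

For a one-hour delay on the standby, one would call set_apply_delay /var/lib/pgsql/data 1h and then reload the configuration with pg_ctl reload.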

Typical Enterprise High‑Availability Architecture

[Figure: Enterprise HA diagram]

Practical Enterprise Implementation

Hybrid Backup Strategy (YAML)

backup_strategy:
  physical:
    full_backup: "daily 02:00"
    retention: "30 days"
  wal:
    enabled: true
    retention: "90 days"
  logical:
    full_backup: "Sundays 01:00"
    retention: "12 months"
  replication:
    sync_node: 1
    async_read_only: 1
    delayed_node:
      delay: 1h

Automated Backup Script

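#!/bin/bash
# Stop on the first failed command, unset variable, or failed pipeline stage
set -euo pipefail

BACKUP_DIR="/backup"
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p "${BACKUP_DIR}/physical" "${BACKUP_DIR}/logical"

# Physical backup in plain format (older pg_verifybackup releases only verify plain-format backups)
pg_basebackup -D "${BACKUP_DIR}/physical/${DATE}" \
  -Fp -P --wal-method=fetch

# Verify the physical backup against its backup_manifest
pg_verifybackup "${BACKUP_DIR}/physical/${DATE}"

# Logical backup
pg_dump -d mydb -Fd -j 4 -f "${BACKUP_DIR}/logical/${DATE}"

# Clean up old physical backups (keep 30 days); match only the top-level backup directories
find "${BACKUP_DIR}/physical" -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} +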

Disaster Recovery & Point‑In‑Time Recovery (PITR)

To recover from an accidental DROP TABLE orders; statement:

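# 1. Stop the database
pg_ctl stop -D /var/lib/pgsql/data

# 2. Restore the physical backup (pg_basebackup -Ft -z writes base.tar.gz inside the backup directory)
rm -rf /var/lib/pgsql/data/*
tar -xzf /data/backup/full_2024-12-01/base.tar.gz -C /var/lib/pgsql/data

# 3. Configure recovery for the target time (PostgreSQL 12 and later no longer read recovery.conf)
echo "restore_command = 'cp /data/wal_archive/%f %p'" >> /var/lib/pgsql/data/postgresql.conf
echo "recovery_target_time = '2024-12-20 14:02:00'" >> /var/lib/pgsql/data/postgresql.conf
touch /var/lib/pgsql/data/recovery.signal

# 4. Start the database; it replays WAL up to the target time, just before the DROP, then pauses
pg_ctl start -D /var/lib/pgsql/data

# 5. After checking the data, end recovery with: SELECT pg_wal_replay_resume();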
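
After the restart it helps to confirm two things: whether the server is still in recovery, and whether the dropped table is back. The sketch below shows one way to check both. The function names recovery_state and table_exists are hypothetical, the connection is assumed to come from the PG* environment variables, and the table name is interpolated into SQL, so it must be trusted input.

```shell
# recovery_state
# Prints "recovering" while WAL replay is active or paused, and "primary"
# once recovery has ended and the server accepts writes.
recovery_state() {
  local in_recovery
  in_recovery="$(psql -At -c 'SELECT pg_is_in_recovery()')" || return 2
  if [ "$in_recovery" = "t" ]; then
    echo "recovering"
  else
    echo "primary"
  fi
}

# table_exists NAME
# Succeeds if the named table is present in the connected database.
table_exists() {
  local found
  found="$(psql -At -c "SELECT to_regclass('$1') IS NOT NULL")" || return 2
  [ "$found" = "t" ]
}
```

For the scenario above, table_exists orders returning success while recovery_state still prints recovering indicates that the target time was chosen early enough and recovery can be ended.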

Replication Monitoring & Lag Diagnosis

SELECT
  client_addr,
  state,
  write_lag,
  flush_lag,
  replay_lag,
  pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS bytes_delay
FROM pg_stat_replication;

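If replication lag exceeds 1 GB, check the network bandwidth between primary and standby, disk I/O on the standby, and long-running queries on the standby that block WAL replay. Also make sure that wal_keep_size or a replication slot retains enough WAL for the standby to catch up.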
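
The lag query can be turned into a simple threshold check for a cron job or monitoring agent. The following is a minimal sketch. The function names max_replay_lag_bytes and check_replica_lag are hypothetical, and the connection is assumed to come from the PG* environment variables on the primary.

```shell
# max_replay_lag_bytes
# Prints the largest replay lag in bytes across all standbys (0 if none).
max_replay_lag_bytes() {
  psql -At -c "SELECT COALESCE(MAX(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)), 0)::bigint FROM pg_stat_replication"
}

# check_replica_lag THRESHOLD_BYTES
# Returns 1 and prints an alert when the largest lag exceeds the threshold.
check_replica_lag() {
  local threshold="$1" lag
  lag="$(max_replay_lag_bytes)" || return 2
  if [ "$lag" -gt "$threshold" ]; then
    echo "ALERT: replay lag ${lag} bytes exceeds ${threshold}"
    return 1
  fi
  echo "OK: replay lag ${lag} bytes"
}
```

Calling check_replica_lag 1073741824 applies the 1 GB threshold; the non-zero exit status can drive whatever alerting is already in place.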

Common Pitfalls and Fixes

Relying only on logical backups – a dump is a snapshot of a single moment and cannot be rolled forward with WAL, so point‑in‑time recovery of the cluster is not possible. Fix: Combine physical and logical backups.

Not validating backups – restores can fail. Fix: Perform quarterly restore drills.

WAL archive disk full – archive_command fails, unarchived WAL accumulates in pg_wal, and the primary shuts down once that volume fills up. Fix: Implement automatic cleanup and off‑site cold storage (e.g., S3).
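
A full archive volume is easier to prevent than to recover from, so a small space check that runs alongside the backups can help. The sketch below is one possible approach. The function names archive_usage_pct and check_archive_space, and the 80 percent limit in the usage note, are assumptions for illustration.

```shell
# archive_usage_pct DIR
# Prints the percentage in use on the filesystem that holds DIR.
archive_usage_pct() {
  # -P forces the POSIX one-line-per-filesystem format; column 5 is the
  # capacity, for example "42%".
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_archive_space DIR MAX_PCT
# Fails when usage is above MAX_PCT or cannot be determined.
check_archive_space() {
  local dir="$1" max_pct="$2" used
  used="$(archive_usage_pct "$dir")"
  if [ -z "$used" ] || [ "$used" -gt "$max_pct" ]; then
    echo "ALERT: ${dir} is ${used:-unknown}% full (limit ${max_pct}%)" >&2
    return 1
  fi
  echo "OK: ${dir} is ${used}% full"
}
```

For example, check_archive_space /data/wal_archive 80 would fail once the archive volume passes 80 percent, leaving time to prune or move old segments before archiving stops.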

Key Takeaway Formula

Enterprise‑grade data safety =
  Physical backup +
  WAL archiving +
  Logical backup +
  Real‑time replication +
  Delayed replica +
  Regular recovery drills