
Managing Excessive WAL Growth in PostgreSQL 11: Diagnosis and Resolution

This article explains why a low‑traffic PostgreSQL 11 database can accumulate massive WAL files, analyzes relevant configuration parameters and replication slots, demonstrates diagnostic SQL commands, and provides step‑by‑step actions—including adjusting archive_timeout, handling replication slots, and forcing a checkpoint—to reclaim disk space.

Qunar Tech Salon

The problem presented is a PostgreSQL 11 instance whose pg_wal directory and archive directory have grown beyond 80 GB despite minimal business activity, threatening to exhaust disk space.

Key PostgreSQL parameters affecting WAL size are examined: wal_segment_size (16 MB by default; in PostgreSQL 11 it can only be set at initdb time), wal_keep_segments, min_wal_size, max_wal_size, and archive_timeout. Their meanings and default behaviors are described.
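How these parameters interact can be sanity-checked with simple arithmetic. The sketch below plugs in this instance's settings; it is illustrative only, since the server's actual recycling decisions also depend on checkpoint timing and archiver progress.

```python
# Back-of-the-envelope WAL retention math for this instance's settings.
# Illustrative only: the checkpointer's real recycling logic also
# considers checkpoint timing and archiver progress.

WAL_SEGMENT_SIZE_MB = 16      # PostgreSQL 11 default, fixed at initdb

wal_keep_segments = 128       # from `show wal_keep_segments`
min_wal_size_mb = 80          # from `show min_wal_size`
max_wal_size_mb = 5 * 1024    # from `show max_wal_size` (5GB)

# wal_keep_segments puts a hard floor under pg_wal: these segments are
# never recycled, regardless of min_wal_size.
keep_floor_mb = wal_keep_segments * WAL_SEGMENT_SIZE_MB
print(f"floor from wal_keep_segments: {keep_floor_mb} MB")

# max_wal_size is only a soft target between checkpoints; archiving
# failures or replication slots can push pg_wal far beyond it.
print(f"soft ceiling from max_wal_size: {max_wal_size_mb} MB")
```

With these settings, pg_wal can never shrink below 2 GB (128 × 16 MB), while the 5 GB "maximum" is only a soft target.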

Additional causes of WAL accumulation are identified, such as failed archiving and physical replication slots (even inactive ones) that pin old WAL files.

Diagnostic queries are run to inspect the current settings and replication slot status:

postgres=# select version();
 version
-----------------------------------------
 PostgreSQL 11.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit

postgres=# show wal_keep_segments ;
 wal_keep_segments 
-------------------
 128

postgres=# show min_wal_size ;
 min_wal_size 
--------------
 80MB

postgres=# show max_wal_size ;
 max_wal_size 
--------------
 5GB

postgres=# show archive_timeout;
 archive_timeout 
-----------------
 1min

postgres=# select * from pg_stat_get_archiver();
-[ RECORD 1 ]------+------------------------------
archived_count     | 6053
last_archived_wal  | 0000000100000017000000A5
last_archived_time | 2020-09-30 11:51:04.366339+08
failed_count       | 0
...

The output shows wal_keep_segments set to 128 and a very short archive_timeout of one minute, which explains rapid WAL creation even with little activity: every minute in which any WAL is written forces a switch to a new 16 MB segment. The recommendation is to increase archive_timeout (e.g., to 10 minutes) to reduce these forced switches.
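The cost of a short archive_timeout is easy to quantify: each forced switch gives up the remainder of a full 16 MB segment, even if only a few bytes were written. A rough upper bound, assuming at least some write activity in every interval:

```python
# Worst-case archive volume caused by archive_timeout-forced segment
# switches. A switch only happens if some WAL was written since the
# last one, so this is an upper bound, not a guarantee.
SEGMENT_MB = 16

def daily_archive_mb(archive_timeout_minutes: int) -> float:
    switches_per_day = 24 * 60 / archive_timeout_minutes
    return switches_per_day * SEGMENT_MB

print(daily_archive_mb(1))    # 1min timeout: up to ~22.5 GB/day
print(daily_archive_mb(10))   # 10min timeout: up to ~2.3 GB/day
```

Raising archive_timeout from 1 to 10 minutes cuts the worst-case archive growth by a factor of ten, which matches the article's recommendation.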

Further investigation reveals a physical replication slot pgstandby1 that is inactive (active = f) but still retaining old WAL files: the oldest file in pg_wal corresponds to the slot's restart_lsn, confirming the slot as the primary cause of the accumulation.

postgres=# select * from pg_replication_slots;
-[ RECORD 1 ]-------+-----------
slot_name           | pgstandby1
plugin              | 
slot_type           | physical
active              | f
restart_lsn         | 0/B000000
...

postgres=# select pg_walfile_name('0/B000000');
     pg_walfile_name      
--------------------------
 00000001000000000000000A
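The mapping from restart_lsn to a file name can be reproduced by hand. This sketch follows the WAL naming scheme (timeline, then the high 32 bits of the position, then the segment within each 4 GB span), including pg_walfile_name's convention that an LSN sitting exactly on a segment boundary names the preceding segment; it assumes the default 16 MB segment size.

```python
SEGMENT_SIZE = 16 * 1024 * 1024  # default 16 MB WAL segments

def wal_file_name(timeline: int, lsn: int) -> str:
    # An LSN exactly on a segment boundary belongs to the segment that
    # ends there, hence the -1 (matching pg_walfile_name's behaviour).
    seg = (lsn - 1) // SEGMENT_SIZE
    segs_per_xlog_id = 0x100000000 // SEGMENT_SIZE  # 256 segments per 4 GB
    return f"{timeline:08X}{seg // segs_per_xlog_id:08X}{seg % segs_per_xlog_id:08X}"

# restart_lsn 0/B000000 parses as hex 0x0B000000
print(wal_file_name(1, 0x0B000000))   # 00000001000000000000000A
```

The result matches the server's answer, confirming that the oldest file on disk is exactly the one the slot is pinning.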

Since the slot is unused, it can be dropped:

postgres=# select pg_drop_replication_slot('pgstandby1');
 pg_drop_replication_slot 
--------------------------
 
(1 row)

After adjusting archive_timeout and removing the unnecessary replication slot, a manual checkpoint is issued to force WAL recycling:

postgres=# checkpoint;
CHECKPOINT

$ du -sh .
2G    .
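The roughly 2 GB that remains after the checkpoint is no accident: with the slot gone, the retention floor is set by wal_keep_segments. A quick check, assuming 16 MB segments:

```python
# With the slot dropped, pg_wal shrinks to roughly the
# wal_keep_segments floor (plus a few working segments).
retained_gb = 128 * 16 / 1024   # wal_keep_segments * 16 MB segments
print(retained_gb)               # 2.0, matching the `du -sh` output
```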

The conclusion makes three points: (1) even an idle database generates WAL because archive_timeout forces periodic segment switches; increasing archive_timeout slows this down. (2) Parameters such as max_wal_size are soft limits, not hard caps; unchecked growth can fill the disk and requires manual intervention. (3) Forgotten replication slots are a common hidden cause of WAL bloat, so monitoring both WAL size and slot status is essential for database health.

Tags: performance tuning, PostgreSQL, WAL, Database Administration, Replication Slots
Written by

Qunar Tech Salon

Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.
