How to Prevent PostgreSQL XID Wraparound and Fix Freeze Errors
This article explains the cause of PostgreSQL "database is not accepting commands to avoid wraparound data loss" errors, details the XID allocation and wraparound mechanisms, and provides manual and automated vacuum‑freeze solutions with concrete SQL commands and scripts.
Background: A customer reported that an RDS PostgreSQL instance could not write data, receiving the error "ERROR: database is not accepting commands to avoid wraparound data loss" with a hint to stop the postmaster and run vacuum in single‑user mode. The issue was resolved by RDS engineers, prompting a deeper analysis.
XID Fundamentals
An XID (Transaction ID) is a 32‑bit unsigned integer assigned to every transaction. Each tuple stores the XID of the inserting or deleting transaction, enabling PostgreSQL to provide consistent reads. In repeatable‑read isolation, a transaction sees only tuples with t_xmin <= XID.
typedef uint32 TransactionId; /* 事务号定义,32位无符号整数 */
typedef struct HeapTupleFields {
TransactionId t_xmin; /* 插入该元组的事务号 */
TransactionId t_xmax; /* 删除或锁定该元组的事务号 */
} HeapTupleFields;XIDs range from 0 to 2^32‑1. When the maximum is reached, the counter wraps around to the beginning, but three special values (0, 1, 2) are reserved for Invalid, Bootstrap, and Frozen transaction IDs.
#define InvalidTransactionId ((TransactionId) 0)
#define BootstrapTransactionId ((TransactionId) 1)
#define FrozenTransactionId ((TransactionId) 2)
#define FirstNormalTransactionId ((TransactionId) 3)XID Wraparound Mechanism
PostgreSQL compares two XIDs using a signed 32‑bit subtraction:
bool TransactionIdPrecedes(TransactionId id1, TransactionId id2) {
int32 diff;
if (!TransactionIdIsNormal(id1) || !TransactionIdIsNormal(id2))
return (id1 < id2);
diff = (int32) (id1 - id2);
return (diff < 0);
}If the difference between the newest and oldest XID exceeds 2^31‑1, the comparison overflows, causing wraparound and making older tuples invisible to newer transactions.
Preventing Wraparound (XID Freeze)
The kernel periodically freezes old tuples by setting their XID to FrozenTransactionId (2), making them visible to all transactions and reclaiming XIDs. This can be triggered automatically by autovacuum or manually via SQL.
Check database or table age (difference between newest and oldest XID):
SELECT datname, age(datfrozenxid) FROM pg_database;
SELECT c.oid::regclass AS table_name,
greatest(age(c.relfrozenxid), age(t.relfrozenxid)) AS age
FROM pg_class c LEFT JOIN pg_class t ON c.reltoastrelid = t.oid
WHERE c.relkind IN ('r','m')
ORDER BY age DESC;
SELECT oid::regclass, age(relfrozenxid) FROM pg_class
WHERE oid='schema.table'::regclass::oid;Manually freeze a table: VACUUM FREEZE table_name; Manually freeze an entire database:
vacuumdb -d db_name --freeze --jobs=30 -h host -p port -U ownerThe freeze operation is I/O‑intensive; it should run during low‑traffic periods.
Solution Workflow
When the remaining XIDs drop to about one million ( 2^31‑1‑1,000,000), PostgreSQL stops accepting writes and raises the wraparound error. The remediation steps are:
Identify the database with the greatest age using the first SQL query.
Identify the tables with the greatest age using the second query and run VACUUM FREEZE on each.
Optionally automate the process with scripts.
Automation Scripts
Single‑process shell script to vacuum the top 50 oldest tables in a database:
for cmd in `psql -U user -p port -h host -d db \
-c "SELECT 'vacuum freeze '||c.oid::regclass||';' AS vacuum_cmd \
FROM pg_class c LEFT JOIN pg_class t ON c.reltoastrelid = t.oid \
WHERE c.relkind IN ('r','m') \
ORDER BY greatest(age(c.relfrozenxid),age(t.relfrozenxid)) DESC \
OFFSET 0 LIMIT 50;" | grep -v vacuum_cmd | grep -v row | grep vacuum`; do
psql -U user -p port -h host -d db -c "$cmd"
doneMulti‑process Python script using multiprocessing.Pool to vacuum the top 1000 tables with 32 concurrent workers:
from multiprocessing import Pool
import psycopg2
args = dict(host='pgm-bp10xxxx.pg.rds.aliyuncs.com', port=5432,
dbname='db_name', user='user', password='pwd')
def vacuum_handler(sql):
try:
conn = psycopg2.connect(**args)
cur = conn.cursor()
cur.execute(sql)
conn.commit()
conn.close()
except Exception as e:
print(e)
def multi_vacuum():
pool = Pool(processes=32)
sql_str = "SELECT 'vacuum freeze '||c.oid::regclass||';' AS vacuum_cmd "
sql_str += "FROM pg_class c LEFT JOIN pg_class t ON c.reltoastrelid = t.oid "
sql_str += "WHERE c.relkind IN ('r','m') "
sql_str += "ORDER BY greatest(age(c.relfrozenxid),age(t.relfrozenxid)) DESC LIMIT 1000;"
conn = psycopg2.connect(**args)
cur = conn.cursor()
cur.execute(sql_str)
rows = cur.fetchall()
for row in rows:
pool.apply_async(vacuum_handler, (row['vacuum_cmd'],))
pool.close()
pool.join()
conn.close()
multi_vacuum()Important Reminder
Vacuum‑freeze scans every page of a table and updates tuple headers, consuming significant I/O; control the concurrency level to avoid overwhelming the instance.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
