Rebuilding an OceanBase Node Using the server_permanent_offline_time Parameter

This guide explains how to use the OceanBase server_permanent_offline_time parameter to permanently offline a faulty node, rebuild its data, and restore normal operation, including preparation, command steps, verification, and recommended settings for production.

Aikesheng Open Source Community

This article describes a practical way to recover an OceanBase node whose data files are damaged or missing. The method relies on the server_permanent_offline_time parameter, which controls how long a node must stay offline before the cluster marks it permanently offline.

Principle

server_permanent_offline_time sets how long a crashed node may stay down before the cluster treats it as permanently offline. Once the downtime exceeds the configured value, the node is removed from the Paxos replica group and its data is rebuilt on other nodes in the same zone, where such nodes are available. The default is 3600 seconds; lowering the value makes the cluster declare the node permanently offline sooner, so data reconstruction starts earlier.
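To make the rule concrete, here is a minimal sketch of the comparison described above. It is an illustration only, not OceanBase's internal logic; the function name and state labels are hypothetical, and both arguments are plain seconds.

```shell
# Hypothetical helper: mirrors the rule in the text, not OceanBase internals.
# $1: seconds the node has been down
# $2: server_permanent_offline_time in seconds (defaults to 3600)
offline_state() {
  downtime=$1
  threshold=${2:-3600}
  if [ "$downtime" -gt "$threshold" ]; then
    echo "permanent_offline"
  else
    echo "temporary_offline"
  fi
}

offline_state 90 60   # prints permanent_offline
offline_state 90      # prints temporary_offline (default threshold of 3600 s)
```

With the threshold lowered to 60 seconds, a node that has been down for 90 seconds already qualifies, which is why the experiment below shortens the parameter first.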

Official Recommendations

Database version upgrade: set to 72 hours.

OBServer hardware replacement: set to 4 hours.

OBServer clean‑up scenario: set to 10 minutes.
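The recommendations above can be kept in a small lookup so that runbooks use consistent values. This is only a sketch: the scenario labels are hypothetical names chosen for illustration, and the durations are written with the unit suffixes (h, m) that OceanBase time parameters accept.

```shell
# Scenario labels are hypothetical; the durations come from the list above.
recommended_offline_time() {
  case "$1" in
    upgrade)          echo "72h" ;;   # database version upgrade
    hardware_replace) echo "4h"  ;;   # OBServer hardware replacement
    cleanup)          echo "10m" ;;   # OBServer clean-up
    *) echo "unknown scenario: $1" >&2; return 1 ;;
  esac
}

recommended_offline_time upgrade   # prints 72h
```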

Preparation

Deploy a three‑node OceanBase cluster with an OBProxy, create a tenant sysbench_tenant (primary_zone=RANDOM), and note the IPs:

Component   Version   IP
OceanBase   3.1.2     10.186.64.74, 10.186.64.75, 10.186.64.79
OBProxy     3.2.3     10.186.60.3

Generate test data using sysbench:

sysbench ./oltp_insert.lua --mysql-host=10.186.60.3 --mysql-port=2883 --mysql-db=sysbenchdb --mysql-user="sysbench@sysbench_tenant" --mysql-password=sysbench --tables=1 --table_size=10000 --threads=1 --time=600 --report-interval=10 --db-driver=mysql --db-ps-mode=disable --skip-trx=on --mysql-ignore-errors=6002,6004,4012,2013,4016,1062,5157,4038 prepare

Experiment Steps

Keep sysbench writing continuously so that the cluster carries live traffic throughout the experiment.
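One way to drive that traffic is to reuse the options from the prepare command with the run action. The wrapper below is a sketch under that assumption; the function name and the SYSBENCH override variable are hypothetical conveniences, and --time=600 limits each run to 600 seconds, so the run may need repeating for longer experiments.

```shell
# Hypothetical wrapper around the sysbench options shown earlier.
# Set SYSBENCH=echo to preview the command line instead of executing it.
sysbench_insert() {
  "${SYSBENCH:-sysbench}" ./oltp_insert.lua \
    --mysql-host=10.186.60.3 --mysql-port=2883 \
    --mysql-db=sysbenchdb \
    --mysql-user="sysbench@sysbench_tenant" --mysql-password=sysbench \
    --tables=1 --table_size=10000 --threads=1 \
    --time=600 --report-interval=10 \
    --db-driver=mysql --db-ps-mode=disable --skip-trx=on \
    --mysql-ignore-errors=6002,6004,4012,2013,4016,1062,5157,4038 \
    "$1"
}

# Usage: sysbench_insert run
```

The ignored error codes let sysbench keep running while the faulty node drops out, so the effect of the failure shows up in the interval reports rather than as an aborted run.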

Delete the data files on node 10.186.64.79 (zone3).

Reduce server_permanent_offline_time to 60 seconds.
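A sketch of the parameter change, assuming it is issued from a sys tenant session. The helper name is hypothetical; it only prints the statements so they can be reviewed before being piped to a client.

```shell
# Hypothetical helper: prints the statements instead of running them.
# $1: duration with a unit suffix, e.g. 60s or 3600s
set_offline_time_sql() {
  printf "ALTER SYSTEM SET server_permanent_offline_time = '%s';\n" "$1"
  printf "SHOW PARAMETERS LIKE 'server_permanent_offline_time';\n"
}

# Usage (connection details are assumptions; <cluster_name> is a placeholder):
#   set_offline_time_sql 60s | mysql -h10.186.60.3 -P2883 -u'root@sys#<cluster_name>' -p oceanbase
```

The same helper with 3600s covers the final step of restoring the original value.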

Stop the node from serving external requests with ISOLATE SERVER or STOP SERVER.
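The statement for this step might look like the sketch below. The helper name is hypothetical, and the RPC port 2882 is an assumption (it is the usual default; the article does not state the port used).

```shell
# Hypothetical helper: prints the ALTER SYSTEM statement for review.
# $1: STOP or ISOLATE   $2: server address as ip:rpc_port
stop_server_sql() {
  case "$1" in
    STOP|ISOLATE) printf "ALTER SYSTEM %s SERVER '%s';\n" "$1" "$2" ;;
    *) echo "unsupported action: $1" >&2; return 1 ;;
  esac
}

stop_server_sql STOP 10.186.64.79:2882   # port 2882 is assumed
```

Run the printed statement in a sys tenant session before killing the process, so that leaders move off the node first.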

Kill the observer process and wait for the node to enter the permanent-offline state; verify this in __all_rootservice_event_history.
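A possible verification query is sketched below. The helper name is hypothetical, and the filter on the event column is an assumption: it matches any event whose name contains "offline" rather than relying on an exact event name.

```shell
# Hypothetical helper: prints a query for recent offline-related Root Service events.
offline_events_sql() {
  printf "SELECT gmt_create, module, event, name1, value1 "
  printf "FROM oceanbase.__all_rootservice_event_history "
  printf "WHERE event LIKE '%%offline%%' "
  printf "ORDER BY gmt_create DESC LIMIT 10;\n"
}

# On the faulty node, the process can be killed with (signal choice is an assumption):
#   kill -9 "$(pidof observer)"
```

After roughly 60 seconds of downtime, a new row for 10.186.64.79 should appear near the top of the result.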

Clear the node's log and sstable files, then restart the observer process; the restart triggers automatic data reconstruction.

Once the partition counts are equal across zones, reconstruction is complete; bring the node back into service with START SERVER.
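The equality check can be scripted. The sketch below assumes per-zone replica counts are fed in as "zone count" pairs, for example from a query against __all_virtual_meta_table; both the helper name and that query are assumptions, not taken from the article.

```shell
# Hypothetical helper: reads "zone count" pairs on stdin and
# succeeds only if at least one row exists and every count is identical.
counts_equal() {
  awk 'NR == 1 { first = $2 }
       $2 != first { differ = 1 }
       END { if (NR == 0 || differ) exit 1 }'
}

# Example with made-up counts:
printf 'zone1 120\nzone2 120\nzone3 120\n' | counts_equal && echo "balanced"

# Possible source of real counts (assumed query, run in the sys tenant):
#   mysql ... -N -B -e "SELECT zone, COUNT(*) FROM oceanbase.__all_virtual_meta_table GROUP BY zone" | counts_equal
# Then re-enable the node (port 2882 is assumed):
#   ALTER SYSTEM START SERVER '10.186.64.79:2882';
```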

Restore the original server_permanent_offline_time value (3600 seconds).
