
Evaluation of OceanBase Arbitration Service in a 2F1A Deployment: Fault Injection Experiments and Recovery Procedures

This article presents a detailed experimental study of OceanBase's Arbitration Service in a 2F1A (two full‑function replicas plus one arbitration node) configuration, examining how the system behaves when one or both full‑function replicas fail, how log‑stream degradation and permanent offline mechanisms work, and how normal service is restored after node recovery.

Aikesheng Open Source Community

Background

In distributed databases, when more than half of the data replicas become unavailable, an external arbitration service can participate in leader election and member changes to restore service. OceanBase version 4.1.0 introduced such an Arbitration Service.

A customer wanted to evaluate the 2F1A architecture (two full‑function replicas and one arbitration node) for cost savings.

Cluster layout: 1‑1‑1

Replica type: 2F1A (2 full‑function replicas + 1 arbitration node)

Key questions:

Can the tenant read/write normally after a single full‑function replica (leader) fails?

Can the cluster recover after both full‑function replicas experience recoverable failures and are permanently taken offline?

Key Terminology

OceanBase Arbitration Service

The arbitration service runs as a lightweight observer process independent of the main cluster, participating only in election and membership voting without storing logs or data.

It stores no logs and holds no MemTable or SSTable.

It can never be elected leader.

When half of the full‑function replicas fail, the service degrades the affected log stream, removing the faulty replica from the member list and achieving RPO = 0. When the replica recovers, the service upgrades the log stream back.

Log Stream

A log stream is an entity created by OceanBase that groups tablets and ordered redo logs, providing multi‑replica synchronization via Paxos.

Log‑Stream Degradation

If a replica does not acknowledge log confirmations within arbitration_timeout (default 5 s), the arbitration service checks the replica and may trigger log‑stream degradation.

Degradation occurs only when the number of faulty replicas equals half of the total full‑function replicas. For example, in a 4F1A setup: 1 faulty F replica → no degradation. 2 faulty F replicas → degradation. 3 or more faulty F replicas → no quorum, degradation cannot run.
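The rule above can be sketched as a small predicate. This is an illustrative model of the behavior described in the text, not OceanBase's actual implementation:

```python
def can_degrade(total_f: int, faulty_f: int) -> bool:
    """Illustrative model of the log-stream degradation rule.

    The arbitration service degrades a log stream only when the faulty
    full-function (F) replicas are exactly half of the total: with fewer,
    the remaining majority keeps Paxos running without help; with more,
    there is no quorum left to vote the member change through.
    """
    return faulty_f * 2 == total_f

# 4F1A examples from the text
print(can_degrade(4, 1))  # False: majority still healthy
print(can_degrade(4, 2))  # True: degradation removes the faulty replicas
print(can_degrade(4, 3))  # False: no quorum, degradation cannot run

# 2F1A, as tested in this article: one faulty F replica triggers degradation
print(can_degrade(2, 1))  # True
```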

Environment Information

CentOS 7.5.1804

OCP cloud platform 4.2.0

OceanBase 4.2.1.4

Tenant: mysql_ob (MySQL mode, 1.5 CPU cores / 6 GB memory)

Experiment Procedure

Pre‑failure Checks

Query the tenant's log streams:
select tenant_id,ls_id from oceanbase.CDB_OB_TABLET_TO_LS where tenant_id = 1002 group by LS_ID;

Check the status of the ordinary replica nodes:
SELECT * FROM oceanbase.DBA_OB_SERVERS;

Identify leader/follower roles for tenant mysql_ob:
select b.tenant_name,a.tenant_id,a.ls_id,a.zone,a.svr_ip,a.role from cdb_ob_table_locations a join __all_tenant b on a.tenant_id = b.tenant_id where a.tenant_id = 1002 group by role;

Verify that the arbitration node (node 164) is ACTIVE:
SELECT * FROM DBA_OB_ARBITRATION_SERVICE;

Check the arbitration service status for the tenant:
SELECT TENANT_ID,TENANT_NAME,PRIMARY_ZONE,STATUS,TENANT_ROLE,SWITCHOVER_STATUS,ARBITRATION_SERVICE_STATUS FROM DBA_OB_TENANTS;

Show the current server_permanent_offline_time and set it to 60 s for the test:
show parameters like 'server_permanent_offline_time';
ALTER SYSTEM SET server_permanent_offline_time='60s';
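The effect of server_permanent_offline_time can be modeled as a simple heartbeat-age check. This is a deliberately simplified, hypothetical sketch of the behavior observed in the experiment, not the cluster's actual decision logic:

```python
from datetime import datetime, timedelta

def offline_status(last_heartbeat: datetime, now: datetime,
                   permanent_offline_time: timedelta) -> str:
    """Hypothetical model: classify a non-responding server by heartbeat age.

    A server whose heartbeat has been missing for longer than
    server_permanent_offline_time is marked PERMANENT_OFFLINE;
    before that threshold it is merely INACTIVE.
    """
    if now - last_heartbeat > permanent_offline_time:
        return "PERMANENT_OFFLINE"
    return "INACTIVE"

# With the 60 s test value used in this experiment:
kill_time = datetime(2024, 1, 1, 15, 53, 39)
print(offline_status(kill_time, kill_time + timedelta(seconds=30),
                     timedelta(seconds=60)))  # INACTIVE
print(offline_status(kill_time, kill_time + timedelta(seconds=61),
                     timedelta(seconds=60)))  # PERMANENT_OFFLINE
```

Lowering the parameter only shortens the wait before the offline transition; it does not change how failover itself behaves.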

Fault Injection – First Failure

Kill the leader full‑function replica (node 161) using:

ps -ef | grep observer | grep -v "grep"
date && kill -9 $(ps aux | grep "observer" | grep -v "grep" | awk '{print $2}') && ps -ef | grep observer | grep -v grep && date

Observe that insert and select scripts stop writing/reading for about 5 seconds, the leader switches to node 163, and the arbitration service degrades the log stream of node 161.

Fault Injection – Second Failure

Kill the second full‑function replica (node 163) with the same command. Both full‑function replicas become INACTIVE, the arbitration node loses connectivity to them, and the tenant enters a no‑master state.

Post‑Failure Observation

Insert and select scripts report errors.

Leader/follower role query shows no active leader.

Log‑stream degradation entries appear in DBA_OB_SERVER_EVENT_HISTORY.

Both nodes are marked PERMANENT_OFFLINE after the configured 60 s.

Recovery Procedure

Determine which OBServer node failed later by checking the last line of observer.log and start that node first (node 163).

Start node 163:
su - admin
cd /home/admin/oceanbase
date && ./bin/observer
ps -ef | grep observer | grep -v "grep"

Verify insert and select scripts resume normal operation.

Confirm that node 163's status is ACTIVE:
SELECT * FROM oceanbase.DBA_OB_SERVERS;

Start node 161 using the same steps; confirm it becomes ACTIVE.

Check data completeness with oceanbase.DBA_OB_UNIT_JOBS and verify the test table time_table contains the expected rows.

Restore server_permanent_offline_time to its original value (3600 s).

Timeline Summary

Set server_permanent_offline_time to 60 s.

15:53:39 – Kill node 161 (leader).

15:53:39‑15:53:43 – Brief read/write outage.

15:53:43 – Leader switches to node 163.

15:53:43 – Log‑stream degradation for node 161.

15:54:39 – Node 161 marked permanent offline.

16:04:28 – Kill node 163.

16:04:28 – Tenant experiences read/write errors; both replicas INACTIVE.

16:05:35 – Node 163 marked permanent offline; tenant has no master.

16:22:53 – Start node 163; service recovers.

16:24:55 – Tenant read/write normal.

16:36:29 – Start node 161; both replicas ACTIVE.

Restore server_permanent_offline_time to 3600 s.
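The intervals in the timeline above can be checked with a few lines of Python (timestamps copied from the experiment log):

```python
from datetime import datetime

def t(s: str) -> datetime:
    """Parse an HH:MM:SS timestamp from the experiment timeline."""
    return datetime.strptime(s, "%H:%M:%S")

kill_161      = t("15:53:39")  # leader killed
leader_switch = t("15:53:43")  # leader moves to node 163
offline_161   = t("15:54:39")  # node 161 marked PERMANENT_OFFLINE

# Failover gap after killing the leader: 4 seconds of read/write outage
print((leader_switch - kill_161).seconds)  # 4

# Node 161 goes PERMANENT_OFFLINE exactly server_permanent_offline_time
# (the 60 s test value) after it stopped responding
print((offline_161 - kill_161).seconds)    # 60
```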

Conclusion

When a single full‑function replica (leader) fails, the tenant experiences a short read/write interruption but recovers automatically.

When both full‑function replicas fail, the tenant becomes unavailable but can be fully restored after restarting the OBServer nodes.

Because the arbitration node stores no redo logs by design, data loss is possible in the extreme case where the surviving replica fails before the first failed replica has recovered and caught up.

For cost‑sensitive scenarios tolerating possible data loss, the 2F1A arbitration HA scheme is viable; otherwise, a full‑function replica HA scheme is recommended.

Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
