Diagnosing Freeze and Dump Issues in OceanBase Memstore
This article explains how to verify whether OceanBase's freeze mechanism and memstore dump processes are functioning correctly by querying internal tables, inspecting logs, and checking related metrics such as active_memstore_used, major_freeze_trigger, and MemTable reference counts.
This guide, authored by OceanBase expert Cao Shengchun, explains step‑by‑step how to diagnose freeze and dump problems in OceanBase's memstore.
1. Check Freeze Status
1.1 Is the freeze function normal?
Query __all_virtual_tenant_memstore_info for a tenant (e.g., tenant_id = 1001 ) and compare active_memstore_used with major_freeze_trigger . If active_memstore_used <= major_freeze_trigger , the freeze function works; otherwise, examine freeze threads.
1.2 Are freeze threads running correctly?
Search observer.log for the string "tenant manager timer task". Normal output every 2 seconds indicates the freeze thread is alive. If the thread is healthy but active_memstore_used exceeds the trigger, identify MemTables that have not been frozen:
select a.table_name, b.table_id, b.partition_id, b.mt_base_version, b.mt_is_frozen, b.mt_protection_clock, b.mt_snapshot_version
from gv$table as a inner join __all_virtual_tenant_memstore_allocator_info as b
on a.table_id = b.table_id
where b.mt_is_frozen = 0 and b.svr_ip='10.166.117.126' and b.tenant_id = 1001
order by mt_protection_clock;Rows with mt_is_frozen = 0 and mt_protection_clock != 9223372036854775807 are potential freeze‑abnormal tables.
2. Check Dump Status
2.1 Is the dump function normal?
Again query __all_virtual_tenant_memstore_info and compare total_memstore_used with major_freeze_trigger . If total_memstore_used > major_freeze_trigger , the dump function is abnormal.
2.2 Are there active transactions?
For OB 2.2.x, run:
select * from __all_virtual_table_mgr as a
where a.table_type = 0 and a.is_active = 0 and a.trx_count > 0
and (a.table_id, a.partition_id) in (
select table_id, partition_id from __all_virtual_tenant_memstore_allocator_info
where svr_ip='10.166.117.126' and tenant_id=1001 and mt_is_frozen=1
);Any result indicates active transactions preventing dump.
2.3 Is weak‑consistency read timestamp behind snapshot version?
Run the following to find MemTables whose snapshot_version is greater than the minimum of transaction, replay, and log timestamps:
select a.svr_ip, a.table_id, a.partition_id, a.is_active, a.table_type, a.snapshot_version,
b.min_trans_service_ts, b.min_replay_engine_ts, b.min_log_service_ts
from __all_virtual_table_mgr as a inner join __all_virtual_partition_info as b
on a.table_id = b.table_id and a.partition_id = b.partition_idx and a.svr_ip = b.svr_ip
where a.table_type = 0 and a.is_active = 0
and a.snapshot_version > least(least(b.min_trans_service_ts, b.min_replay_engine_ts), b.min_log_service_ts)
and (a.table_id, a.partition_id) in (
select table_id, partition_id from __all_virtual_tenant_memstore_allocator_info
where svr_ip='10.166.117.126' and tenant_id=1001 and mt_is_frozen=1
);If rows are returned, the weak‑consistency timestamp lag may cause dump scheduling issues.
2.4 Confirm dump success
Locate the pkey ( table_id, partition_id ) of frozen MemTables and search observer.log for dump‑related messages (e.g., "add dag success", "task finish process"). In practice, dump logs are recorded per tenant, not per table.
2.5 Are MemTable reference counts normal?
Check write reference counts after dump:
select * from __all_virtual_table_mgr as a
where a.table_type = 0 and a.is_active = 0 and a.write_ref > 0
and (a.table_id, a.partition_id) in (
select table_id, partition_id from __all_virtual_tenant_memstore_allocator_info
where svr_ip='10.166.117.126' and tenant_id=1001 and mt_is_frozen=1
);Non‑zero write_ref indicates reference‑count anomalies that keep memory allocated.
3. Open Issues
Keywords for locating specific table transaction logs in observer.log .
Why a weak‑consistency timestamp smaller than snapshot_version leads to dump scheduling failure.
How to query dump logs for a single table when dumps are triggered per tenant.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.