
Performance Degradation After Data Updates in OceanBase and Its Optimization Techniques

The article investigates why pure‑read QPS drops significantly after bulk updates in OceanBase, reproduces the issue with a sysbench workload, analyses flame‑graph and SQL audit data, explains the LSM‑Tree read‑amplification mechanism, and proposes practical mitigation steps such as major freeze, plan binding, index creation, and the queuing‑table feature.

Aikesheng Open Source Community

Background – During a pure‑read performance test on OceanBase, the author observed a noticeable QPS decline after performing update operations. The article details the reproduction steps and the environment used for the experiment.

Reproduction Steps

Environment Preparation

Deploy OB

Deploy a single‑node OceanBase using OBD.

| Version | IP |
| --- | --- |
| OceanBase 4.0.0.0 CE | 10.186.16.122 |
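The deployment can be scripted with OBD; a minimal sketch, assuming a hypothetical cluster name `obtest` and config file `mini-local-example.yaml`:

```
# Deploy and start a single-node cluster from a local config file
obd cluster deploy obtest -c mini-local-example.yaml
obd cluster start obtest
# Check the cluster's status
obd cluster display obtest
```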

Key parameters (default values) relevant to the experiment:

| Parameter Name | Meaning | Default |
| --- | --- | --- |
| memstore_limit_percentage | Percentage of tenant memory used by the MemStore | 50 |
| freeze_trigger_percentage | Memory threshold that triggers a global freeze | 20 |
| major_compact_trigger | Number of minor merges before a major merge | 0 |
| minor_compact_trigger | Number of Mini SSTables that triggers a minor merge | 2 |
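These values can be inspected (and adjusted) from the sys tenant; a sketch using the parameter names above:

```sql
SHOW PARAMETERS LIKE 'freeze_trigger_percentage';
-- Adjust if needed; this parameter takes effect without a restart
ALTER SYSTEM SET freeze_trigger_percentage = 20;
```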

Create sysbench tenant

create resource unit sysbench_unit max_cpu 26, memory_size '21g';
create resource pool sysbench_pool unit = 'sysbench_unit', unit_num = 1, zone_list=('zone1');
create tenant sysbench_tenant resource_pool_list=('sysbench_pool'), charset=utf8mb4, zone_list=('zone1'), primary_zone=RANDOM set variables ob_compatibility_mode='mysql', ob_tcp_invited_nodes='%';

Data preparation

Create 30 tables, each with 1,000,000 rows:

sysbench ./oltp_read_only.lua --mysql-host=10.186.16.122 --mysql-port=12881 --mysql-db=sysbenchdb --mysql-user="sysbench@sysbench_tenant" --mysql-password=sysbench --tables=30 --table_size=1000000 --threads=256 --time=60 --report-interval=10 --db-driver=mysql --db-ps-mode=disable --skip-trx=on --mysql-ignore-errors=6002,6004,4012,2013,4016,1062 prepare

Environment tuning – manually trigger a major merge

ALTER SYSTEM MAJOR FREEZE TENANT=ALL;
-- View merge progress
SELECT * FROM oceanbase.CDB_OB_ZONE_MAJOR_COMPACTION\G

Pure‑read QPS before data updates

sysbench ./oltp_read_only.lua ... run

Five runs produced QPS values around 340k.

Pure‑read QPS after write‑only workload (updates/deletes/inserts)

sysbench ./oltp_write_only.lua ... run

Five runs dropped to roughly 170k, a ~50% reduction.

Pure‑read QPS after a manual major merge

ALTER SYSTEM MAJOR FREEZE TENANT=ALL;

QPS recovered to the original level (≈330k‑350k).

Phenomenon Summary

Bulk updates roughly halve pure‑read QPS (from ~340k to ~170k); a manual major merge restores it.

Investigation Process

Method 1 – Flame Graph

Flame‑graph comparison before and after the updates shows a new hot spot in oceanbase::blocksstable::ObMultiVersionMicroBlockRowScanner::inner_get_next_row, indicating heavy multi‑version row processing.
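Such flame graphs can be produced with perf plus Brendan Gregg's FlameGraph scripts; a sketch, assuming the observer process is running locally and a FlameGraph checkout in the current directory (paths hypothetical):

```
# Sample on-CPU stacks of the observer process for 30 seconds
perf record -F 99 -p $(pidof observer) -g -- sleep 30
# Fold the stacks and render an interactive SVG
perf script | ./FlameGraph/stackcollapse-perf.pl > out.folded
./FlameGraph/flamegraph.pl out.folded > observer.svg
```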

Further source inspection reveals that ObMultiVersionMicroBlockRowScanner::inner_get_next_row_impl runs a while loop that reads and merges multiple versions of a row, invoking do_compact and fuse_row. The code snippet is reproduced below:

int ObMultiVersionMicroBlockRowScanner::inner_get_next_row_impl(const ObDatumRow *&ret_row)
{
  int ret = OB_SUCCESS;
  bool final_result = false;
  bool found_first_row = false;
  bool have_uncommited_row = false;
  const ObDatumRow *multi_version_row = NULL;
  ret_row = NULL;
  while (OB_SUCC(ret)) {
    // locate cursor
    if (OB_FAIL(locate_cursor_to_read(found_first_row))) {
      if (OB_UNLIKELY(OB_ITER_END != ret)) {
        LOG_WARN("failed to locate cursor to read", K(ret));
      }
    }
    while (OB_SUCC(ret)) {
      multi_version_row = NULL;
      bool version_fit = false;
      if (read_row_direct_flag_) {
        if (OB_FAIL(inner_get_next_row_directly(multi_version_row, version_fit, final_result))) {
          if (OB_UNLIKELY(OB_ITER_END != ret)) {
            LOG_WARN("failed to inner get next row directly", K(ret));
          }
        }
      } else if (OB_FAIL(inner_inner_get_next_row(multi_version_row, version_fit, final_result, have_uncommited_row))) {
        if (OB_UNLIKELY(OB_ITER_END != ret)) {
          LOG_WARN("failed to inner get next row", K(ret));
        }
      }
      if (OB_SUCC(ret)) {
        if (!version_fit) {
          // do nothing
        } else if (OB_FAIL(do_compact(multi_version_row, row_, final_result))) {
          LOG_WARN("failed to do compact", K(ret));
        } else {
          if (OB_NOT_NULL(context_)) {
            ++context_->table_store_stat_.physical_read_cnt_;
          }
          if (have_uncommited_row) {
            row_.set_have_uncommited_row();
          }
        }
        if ((OB_SUCC(ret) && final_result) || OB_ITER_END == ret) {
          ret = OB_SUCCESS;
          if (OB_FAIL(cache_cur_micro_row(found_first_row, final_result))) {
            LOG_WARN("failed to cache cur micro row", K(ret));
          }
          break;
        }
      }
    }
    if (OB_SUCC(ret) && finish_scanning_cur_rowkey_) {
      if (!is_row_empty(prev_micro_row_)) {
        ret_row = &prev_micro_row_;
      } else if (!is_row_empty(row_)) {
        ret_row = &row_;
      }
      if (NULL != ret_row) {
        (const_cast<ObDatumRow *>(ret_row))->mvcc_row_flag_.set_uncommitted_row(false);
        const_cast<ObDatumRow *>(ret_row)->trans_id_.reset();
        break;
      }
    }
  }
  if (OB_NOT_NULL(ret_row)) {
    if (!ret_row->is_valid()) {
      LOG_ERROR("row is invalid", KPC(ret_row));
    } else {
      LOG_DEBUG("row is valid", KPC(ret_row));
      if (OB_NOT_NULL(context_)) {
        ++context_->table_store_stat_.logical_read_cnt_;
      }
    }
  }
  return ret;
}

Method 2 – SQL Execution Audit

Using GV$OB_SQL_AUDIT, the author compared execution details of the same SELECT before and after the update workload. The audit shows markedly higher MEMSTORE_READ_ROW_COUNT and SSSTORE_READ_ROW_COUNT (about 233 physical rows read for a query that returns only 100 rows), confirming read amplification.
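A sketch of the kind of audit query used for this comparison (the QUERY_SQL filter is hypothetical):

```sql
SELECT SQL_ID, ELAPSED_TIME, EXECUTE_TIME,
       MEMSTORE_READ_ROW_COUNT, SSSTORE_READ_ROW_COUNT, RETURN_ROWS
FROM oceanbase.GV$OB_SQL_AUDIT
WHERE QUERY_SQL LIKE 'SELECT c FROM sbtest%'
ORDER BY REQUEST_TIME DESC
LIMIT 10;
```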

Plan analysis via V$OB_PLAN_CACHE_PLAN_EXPLAIN reveals the query uses the primary‑key index, with a PHY_TABLE_SCAN followed by a PHY_SORT. The plan is already optimal, so the slowdown originates in the storage engine rather than the optimizer.
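A sketch of the plan lookup (the tenant id, IP, port, and plan id values are hypothetical placeholders; V$OB_PLAN_CACHE_PLAN_EXPLAIN requires them as filters):

```sql
-- Locate the cached plan for the statement
SELECT TENANT_ID, PLAN_ID, QUERY_SQL
FROM oceanbase.GV$OB_PLAN_CACHE_PLAN_STAT
WHERE QUERY_SQL LIKE 'SELECT c FROM sbtest%';

-- Inspect its operator tree
SELECT OPERATOR, NAME, ROWS, COST
FROM oceanbase.V$OB_PLAN_CACHE_PLAN_EXPLAIN
WHERE TENANT_ID = 1004 AND IP = '10.186.16.122' AND PORT = 2882 AND PLAN_ID = 123;
```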

Conclusion

OceanBase’s LSM‑Tree storage stores data in a baseline plus incremental fashion. After massive insert/delete/update operations, a single logical row may have many physical versions spread across MemTables and SSTables. Reading such a row requires scanning and merging all versions, leading to significant read amplification and QPS degradation.
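The read‑amplification effect can be illustrated with a toy model (not OceanBase code): each logical row carries a chain of physical versions, a read must scan the chain until it can fuse a final result, and a major merge collapses every chain back to a single baseline version.

```python
# Toy model of LSM-tree multi-version reads (illustrative only).

def read_row(versions):
    """Scan physical versions newest-first, counting rows touched
    until a complete ('final') row image lets us stop,
    analogous to do_compact/fuse_row finishing a merge."""
    scanned = 0
    for v in versions:
        scanned += 1
        if v["final"]:  # full row image: the merge is complete
            break
    return scanned

def major_merge(table):
    """Collapse each version chain to a single baseline version."""
    return {key: [{"final": True}] for key in table}

# One logical row that accumulated 9 incremental updates plus its baseline.
table = {"row1": [{"final": False}] * 9 + [{"final": True}]}

before = read_row(table["row1"])   # touches all 10 physical versions
table = major_merge(table)
after = read_row(table["row1"])    # touches just 1 after the merge
print(before, after)               # -> 10 1
```

Ten physical reads for one logical row before the merge, one after: the same 10x amplification pattern the SQL audit surfaced as inflated MEMSTORE/SSSTORE read counts.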

Performance Improvement Methods

For cases where a usable index exists but the optimizer still chooses a full‑table scan, use plan binding to force the optimal plan.
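In MySQL mode, plan binding is done with an outline; a sketch, assuming a hypothetical table sbtest1 with an index k_1 on column k:

```sql
CREATE OUTLINE otl_sbtest1 ON
  SELECT /*+ INDEX(sbtest1 k_1) */ c FROM sbtest1 WHERE k = 100;
```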

If the main filter column lacks an index, create an appropriate index online and bind the plan.
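A sketch of this case, with hypothetical table and column names (DDL in OceanBase 4.x is online by default):

```sql
CREATE INDEX k_1 ON sbtest1(k);
-- Verify that the new plan picks the index
EXPLAIN SELECT c FROM sbtest1 WHERE k = 100;
```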

When indexes cannot be added or the workload is range‑scan heavy, manually trigger a major merge to clean up obsolete versions, reducing the amount of data scanned.

From OceanBase 2.2.7 onward, the “buffer minor merge” feature (setting table_mode to 'queuing') handles queuing‑style tables automatically, eliminating ineffective scans of stale versions. Note that Community Edition 4.0 no longer supports explicit queuing tables; the functionality is integrated adaptively in later 4.x releases.
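On versions that support explicit queuing tables, the mode is set per table (table name hypothetical):

```sql
ALTER TABLE sbtest1 SET TABLE_MODE = 'queuing';
-- Revert to the default behavior
ALTER TABLE sbtest1 SET TABLE_MODE = 'normal';
```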

References

"Queuing Table Slow Query Issue" – https://www.oceanbase.com/docs/enterprise-oceanbase-database-cn-10000000000945692

"Large‑Scale Data Processing Slow Access Problem Handling" – https://ask.oceanbase.com/t/topic/35602375

"OceanBase Queuing Table (buffer table) Best Practice" – https://open.oceanbase.com/blog/2239494400

"Is Queuing Table Unsupported in OB 4.0?" – https://ask.oceanbase.com/t/topic/35601606

Written by Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year on 1024, and continuously operates and maintains them.