Databases 11 min read

Techniques for Measuring Data Size in OceanBase at Server, Tenant, and Table Levels

This article explains how to accurately calculate the real data size in OceanBase by using OCP cluster overview, system views, and SQL queries for server‑level, tenant‑level, and table‑level statistics, covering differences between V3 and V4 versions and the use of obdiag.

Aikesheng Open Source Community
Aikesheng Open Source Community
Aikesheng Open Source Community
Techniques for Measuring Data Size in OceanBase at Server, Tenant, and Table Levels

OceanBase is a multi‑tenant distributed database, and when measuring data size one must consider multiple tenants and replicas on a single server, as well as the fact that data files on disk are pre‑allocated.

1 Server‑level Statistics

The OCP "Cluster Overview" shows each OBServer's total and used disk capacity, allowing a quick view of overall cluster usage. For example, a server displaying 2.5TB/16.8TB means the pre‑allocated datafile_size (or datafile_disk_percentage) is 16.8 TB, while the actual data stored in block_file is 2.5 TB.

Disk usage can also be obtained directly from the OS:

df -h |grep /data/1
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/observervg-obdata   24T   17T  7.1T  71% /data/1

ll /data/1/group_x86_pt01/sstable/
total 17T
-rw-r----- 1 admin admin 17T Jan 18 01:31 block_file

The view __all_virtual_disk_stat provides the same information without OCP. The following query works on both V3 and V4 (run in the sys tenant):

SELECT
  svr_ip,
  svr_port,
  total_size / 1024 / 1024 / 1024 AS total_size_GB,
  free_size  / 1024 / 1024 / 1024 AS free_size_GB,
  (total_size - free_size) / 1024 / 1024 / 1024 AS used_size_GB
FROM __all_virtual_disk_stat;

2 Tenant‑level Statistics

Tenant‑level statistics are more meaningful for data‑size analysis because a tenant corresponds to a MySQL/Oracle instance and may have multiple replicas across zones.

For V3 (run in sys tenant):

select
  zone,
  svr_ip,
  svr_port,
  sum(data_size) / 1024 / 1024 / 1024 as data_size_gb,
  sum(required_size) / 1024 / 1024 / 1024 as required_size_gb
from __all_virtual_meta_table
where tenant_id = 1001
group by zone, svr_ip, svr_port;

For V4 the query uses CDB_OB_TABLET_REPLICAS:

--
select
  svr_ip,
  svr_port,
  sum(data_size) / 1024 / 1024 / 1024 as data_size_gb,
  sum(required_size) / 1024 / 1024 / 1024 as required_size_gb
from CDB_OB_TABLET_REPLICAS
where TENANT_ID = 1004
group by svr_ip, svr_port;

Key observations from the example:

Tenant 1001 has three zones (three replicas).

Each replica stores about 2.3 TB of real data ( data_size_gb) and requires about 2.6 TB of disk space ( required_size_gb).

The Oracle‑mode statistics can also be obtained via DBA_SEGMENTS in the business tenant.

3 Table‑level Statistics

To get the size of a specific table, use the DBA_SEGMENTS view in Oracle mode:

select
  owner,
  segment_name,
  round(bytes / 1024 / 1024, 2) as "SIZE(M)"
from DBA_SEGMENTS
where segment_name = 'ETL_P10IDS_RISKCON' and owner = 'LIFE';

In the sys tenant the equivalent query is:

select
  meta.zone,
  meta.svr_ip,
  t.database_name,
  t.table_name,
  round(meta.data_size / 1024 / 1024, 2) as "DATA_SIZE(M)",
  round(meta.required_size / 1024 / 1024, 2) as "REQUIRED_SIZE(M)"
from __all_virtual_meta_table meta
join gv$table t on meta.table_id = t.table_id
where t.table_name = 'ETL_P10IDS_RISKCON' and t.database_name = 'LIFE';

For V4, table and index sizes can be aggregated separately:

select y.SVR_IP, y.DATABASE_NAME,
       case when y.TABLE_TYPE = 'INDEX' then '' else y.TABLE_NAME end as TABLE_NAME,
       y.TABLE_TYPE,
       sum(y.DATA_SIZE) as "DATA_SIZE(MB)",
       sum(y.REQUIRED_SIZE) as "REQUIRED_SIZE(MB)"
from (
  select a.TENANT_ID, a.SVR_IP, a.TABLET_ID, b.table_id, b.DATABASE_NAME, b.TABLE_NAME, b.TABLE_TYPE,
         round(a.data_size/1024/1024,2) as DATA_SIZE,
         round(a.required_size/1024/1024,2) as REQUIRED_SIZE
  from CDB_OB_TABLET_REPLICAS a
  join cdb_ob_table_locations b on a.TABLET_ID=b.TABLET_ID and a.svr_ip=b.svr_ip and a.tenant_id=b.tenant_id
  where a.TENANT_ID=1004 and b.DATABASE_NAME='test' and (
        b.TABLE_NAME='sbtest3' or
        b.DATA_TABLE_ID in (select table_id from cdb_ob_table_locations where TENANT_ID=1004 and TABLE_NAME='sbtest3')
      )
) y
group by y.SVR_IP, y.DATABASE_NAME, y.TABLE_TYPE
order by y.SVR_IP, y.DATABASE_NAME asc, TABLE_NAME desc;

The agile diagnostic tool obdiag can also report table size, but it only counts one follower replica and does not include the require_data_size on disk. A known bug (now fixed) caused partitioned tables to report only a single partition's size.

Example command:

obdiag gather tabledump --user=root@my#hucq_421 --password='aaBB11@@' --database=test --table=sbtest3

Overall, the article provides practical SQL snippets and tooling tips to accurately measure OceanBase data size at different granularities, helping administrators and developers understand storage consumption and plan capacity.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

monitoringSQLdatabaseOceanBaseDataSizeobdiag
Aikesheng Open Source Community
Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.