ActionOMS: Intelligent Performance Diagnosis for OceanBase Data Migration
The article introduces ActionOMS, a customized version of OceanBase Migration Service that adds automated performance and fault diagnosis, explains its architecture, showcases a real‑world case of Oracle‑to‑OceanBase synchronization, and demonstrates how the tool improves migration throughput and reduces latency.
1 Case Background
A customer needed to synchronize Oracle data to OceanBase (MySQL mode) and required OceanBase Migration Service (OMS) for data migration.
OceanBase Migration Service (OMS) provides one‑stop data transfer and synchronization between various relational databases, message queues, and OceanBase, supporting real‑time sync and incremental subscription.
Data Synchronization Considerations
Ensure no data loss and consistency between source and target.
Improve synchronization efficiency: lower latency and higher RPS.
2 ActionOMS
ActionOMS is a customized version of OMS developed by ActionDB, fully authorized by OceanBase, allowing source‑level debugging and custom development.
Version Introduction
Released in July 2024 (v4.24.07.0), it adds intelligent performance and fault diagnosis using a top‑down quantitative approach to pinpoint issues from processes down to threads and queues.
Intelligent diagnosis automatically collects performance metrics, analyzes anomalies, and presents precise latency fault points with adjustment suggestions.
3 Practical Case
The customer required detailed analysis of performance‑impacting factors during synchronization.
ActionOMS rebuilt the latency diagnosis logic, automatically collecting component metrics and applying SRE‑style diagnostics to provide systematic, accurate results and optimization suggestions.
Through the new incremental latency diagnosis, the system identified a 12‑minute delay caused by a blockage in the log replay stage.
Structure Explanation
First Layer: Key Sync Nodes
Each node shows its delay, computed as current time minus the timestamp of the latest incremental record, together with the time the metric was generated.
Normal nodes (log collection, format conversion, cache) are blue with low latency; the log replay node is red with ~12‑minute delay.
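The first-layer calculation can be sketched in a few lines of Python. This is only an illustration of the formula above: the node timestamps are made-up sample values, and the one-minute threshold separating blue from red is an assumption, since the article does not state the value ActionOMS uses.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: the article does not say where blue turns red.
DELAY_THRESHOLD = timedelta(minutes=1)


def node_delay(now: datetime, latest_record_time: datetime) -> timedelta:
    """Delay = current time - timestamp of the latest incremental record."""
    # Clamp at zero so small clock skew never yields a negative delay.
    return max(now - latest_record_time, timedelta(0))


def classify(delay: timedelta, threshold: timedelta = DELAY_THRESHOLD) -> str:
    """Return the display color for a node given its delay."""
    return "red" if delay > threshold else "blue"


# Sample data (invented for illustration), keyed by the node names in the article.
now = datetime(2024, 7, 1, 12, 0, 0)
latest_record = {
    "log collection": datetime(2024, 7, 1, 11, 59, 58),
    "format conversion": datetime(2024, 7, 1, 11, 59, 57),
    "cache": datetime(2024, 7, 1, 11, 59, 55),
    "log replay": datetime(2024, 7, 1, 11, 48, 0),
}

report = {}
for name, ts in latest_record.items():
    delay = node_delay(now, ts)
    report[name] = (delay, classify(delay))

for name, (delay, color) in report.items():
    print(f"{name}: delay={delay}, color={color}")
```

With these sample timestamps, only the log replay node exceeds the threshold, mirroring the situation described in the case.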
Second Layer: Red Node Analysis
Breaks the red node down into its sub‑processes (store receive & merge, ETL conversion, transaction ordering, target replay) to show which one is blocked.
The transaction ordering node is blocked, with cache usage >50%, leading to the diagnosis conclusion and specific adjustment recommendations.
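A minimal sketch of this second-layer check follows. The 50% figure comes from the case above, but treating it as a fixed cutoff is an assumption, as are the `SubStage` type and the sample usage values; the actual ActionOMS rule may combine more signals.

```python
from dataclasses import dataclass


@dataclass
class SubStage:
    name: str
    cache_usage: float  # fraction of the stage's cache in use, 0.0 to 1.0


# Assumed cutoff, taken from the ">50%" observation in the case.
CACHE_USAGE_LIMIT = 0.5


def blocked_stages(stages, limit=CACHE_USAGE_LIMIT):
    """Return the names of sub-stages whose cache usage exceeds the limit."""
    return [s.name for s in stages if s.cache_usage > limit]


# Sample values invented for illustration.
stages = [
    SubStage("store receive & merge", 0.10),
    SubStage("ETL conversion", 0.15),
    SubStage("transaction ordering", 0.72),
    SubStage("target replay", 0.05),
]

flagged = blocked_stages(stages)
print(flagged)
```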
4 Effect of Adjustments
After the suggested parameter changes were applied, migration traffic and RPS increased by 2‑3×; the backlog of incremental data finished synchronizing after about 12 minutes, and latency dropped accordingly.
5 Feature Details
Automated Collection of Performance Metrics
The end‑to‑end sync involves extraction, parsing, caching, and replay, each with multiple processes, threads, and queues; collecting all related metrics is essential for diagnosis.
System‑Level Analysis of Anomalous Metrics
ActionOMS uses a top‑down method to evaluate metrics such as _thread_used_1m, _thread_rps_1min, _queue_depth, and _thread_idling_1min to detect performance issues, including uneven thread scheduling, thread stalls, and RAC multi‑instance traffic imbalance.
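As a rough illustration of how such metrics might be turned into findings, the sketch below checks for two of the issues named above. The sample layout (`thread`, `rps`, `queue_depth` keys), the 2× imbalance ratio, and both rules are assumptions made for the example, not the ActionOMS implementation.

```python
from statistics import mean


def uneven_scheduling(rps_per_thread, ratio=2.0):
    """True when the busiest thread handles more than `ratio` times the average RPS."""
    avg = mean(rps_per_thread)
    return avg > 0 and max(rps_per_thread) > ratio * avg


def stalled_threads(samples):
    """Threads that have queued work but processed nothing in the sample window."""
    return [s["thread"] for s in samples if s["queue_depth"] > 0 and s["rps"] == 0]


# Invented one-minute samples for four worker threads.
samples = [
    {"thread": "t0", "rps": 900, "queue_depth": 10},
    {"thread": "t1", "rps": 50, "queue_depth": 0},
    {"thread": "t2", "rps": 50, "queue_depth": 0},
    {"thread": "t3", "rps": 0, "queue_depth": 128},
]

uneven = uneven_scheduling([s["rps"] for s in samples])
stalled = stalled_threads(samples)
print(f"uneven scheduling: {uneven}, stalled threads: {stalled}")
```

The same comparison could be applied per RAC instance instead of per thread to look for traffic imbalance between instances.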
Precise Presentation of Latency Fault Points
The tool follows a right‑to‑left approach, exposing downstream issues first and then iteratively diagnosing upstream components.
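The right-to-left order can be sketched as a walk over the pipeline from the most downstream stage back toward the source. The stage list reuses the first-layer node names from the case; the `is_healthy` callback and the sample set of unhealthy stages are hypothetical stand-ins for the real per-stage checks.

```python
# Stages listed upstream to downstream (left to right).
PIPELINE = ["log collection", "format conversion", "cache", "log replay"]


def diagnose_right_to_left(pipeline, is_healthy):
    """Check stages from downstream to upstream and return the unhealthy ones in that order."""
    findings = []
    for stage in reversed(pipeline):
        if not is_healthy(stage):
            findings.append(stage)
    return findings


# Invented example: two stages are unhealthy at the same time.
unhealthy = {"log replay", "format conversion"}
findings = diagnose_right_to_left(PIPELINE, lambda stage: stage not in unhealthy)
print(findings)
```

Reporting the downstream stage first matters because a blocked downstream stage applies back-pressure, so upstream symptoms may disappear once it is fixed.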
Supplementary Notes
A one‑click stack snapshot button collects full process stacks and flame graphs for manual analysis when automated methods are insufficient.
6 Summary
Automatic diagnostics enable real‑time monitoring of sync systems, quickly locating root causes and providing adjustment suggestions, thereby enhancing robustness, reducing operational workload, and ensuring stable data synchronization under varying business loads.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year on October 24 (1024), and continuously operates and maintains them.