How Ctrip Migrated MySQL to OceanBase: Tools, Process, and Lessons Learned
Ctrip evaluated and extended OceanBase Migration Assessment tools, built a one‑click migration workflow, implemented comprehensive monitoring and automatic fault‑diagnosis pipelines, and addressed compatibility challenges such as .NET charset issues and Druid parser errors, ultimately achieving a smooth MySQL‑to‑OceanBase transition.
1. Introduction
MySQL has supported Ctrip's business for years, but growing data‑model diversity, OLTP/OLAP convergence, high slow‑query remediation cost, and unfriendly sharding solutions exposed limitations. OceanBase, an open‑source distributed database compatible with most MySQL syntax, offers horizontal scalability, strong consistency and high availability, prompting a migration project.
2. Evaluation Tool
To ensure a smooth heterogeneous‑database migration we must check compatibility, performance and partition suitability. OceanBase Migration Assessment (OMA) provides syntax compatibility and performance checks, but Ctrip needed additional checks:
Middleware version verification for all applications accessing a DB.
Safe MySQL General Log collection and replay, especially for high‑traffic instances.
Compatibility checks for non‑middleware accounts (ETL, query tools, direct connections) that require tenant‑aware login after migration.
Partition‑key recommendation to avoid hotspot data.
The OMA tool was extended to perform one‑click assessment, collect MySQL data safely, import it into OceanBase, and use the open‑source Locust tool for SQL replay and load testing, producing a detailed assessment report.
3. Migration Process
Pre‑migration configuration validation to eliminate risky settings.
Create tenant‑aware OceanBase accounts because OB uses a multi‑tenant model; MySQL accounts must be recreated with tenant information.
Data consistency verification after CDC sync, using primary‑key chunking to compare result sets, with retries for hotspot tables.
Pause DDL on the source MySQL during migration; emergency releases can temporarily abort the process.
Build a reverse CDC sync link (OceanBase CDC service) so that any failure can be rolled back quickly, keeping both sides consistent.
When data sync finishes with no lag, a switch task is generated. The switch can be triggered within a minute, requires no application changes, and the reverse link is retained for weeks as a safety net.
4. OceanBase Monitoring
Distributed databases need a global view and alarm aggregation. Each server runs an Agent that collects metrics locally and pushes them to Hickwall, forming a unified monitoring dashboard. Alerts are defined on threshold breaches, and the system also performs periodic configuration checks.
OceanBase also integrates Ctrip's SQL‑audit pipeline, capturing full MySQL‑protocol packets to record query text, client ID, IP, parameters, and error status, enabling rapid pinpointing of problematic SQL.
5. Automatic Fault Diagnosis
A real‑time performance data warehouse collects metrics every 10 seconds, stores them in ClickHouse, and classifies them into three categories:
SQL: execution count, total time, CPU time, logical/physical reads.
Table: DML row count, related SQL count, transaction count.
Perf: CPU, I/O, RPC latency, index cache size, hit rate.
Automated analysis matches abnormal metrics with historical data, identifies the offending SQL, table, or internal OceanBase component, and generates a fault‑analysis report that is emailed to DBAs and developers.
Concrete cases show the workflow: a CPU spike at 4:30 am is linked to increased accesses on a specific table, which in turn is caused by a single slow SQL statement; the root cause is then addressed by adding server capacity.
6. Migration Issues and Practices
6.1 .NET client charset mismatch
The official MySQL connector for .NET expects ConnectionCharSetIndex=83 (utf8_bin), which OceanBase lacks (only utf8mb4_bin). The OceanBase source was patched to add the missing charset, enabling .NET applications to connect.
6.2 Druid SQL parser incompatibility
OceanBase DDL caused parsing errors in Druid. Investigation revealed missing implementation in Druid's SQLIndexOptions. A pull request was submitted and merged, fixing the compatibility issue.
6.3 Read‑write splitting
OBProxy was extended with enable_weak_read and weak_read_user_list parameters, allowing transparent read‑write splitting without code changes in applications.
6.4 Large IN‑list memory overflow
Queries with >10 000 elements in an IN clause triggered stack overflow and exhausted the SQL arena memory. The optimizer was fixed in a later OceanBase version, and users were advised to limit IN list size.
6.5 Execution‑plan correction
OceanBase’s optimizer may choose sub‑optimal plans due to low‑frequency histogram updates. Users can force a plan by adding an outline comment to the SQL, overriding the optimizer’s choice.
7. Future Outlook
OceanBase 4.0 introduces a unified single‑node‑and‑distributed architecture, allowing lightweight single‑node deployment that can be scaled to a full distributed cluster. Compatibility with MySQL has been further improved (character sets, constraints, functions, stored procedures, binlog, Canal, flashback tools). Continued work on monitoring dashboards and automated diagnostics aims to lower operational overhead and increase system stability.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
