How to Migrate from SQL Server to MySQL: Strategies, Tools, and CDC Implementation
This article details the challenges of moving from SQL Server to MySQL, compares offline and online migration approaches, presents a complete toolbox—including ETL utilities, consistency checkers, CDC configuration, and rollback mechanisms—and shares practical lessons from a large‑scale production migration.
Founded in 2001, Hujiang originally built its platform on early Microsoft technologies (ASP, .NET, SQL Server). After years of growth the monolithic .NET stack limited hiring, ecosystem integration, and architectural optimization, prompting a large‑scale "de‑Windows" effort that migrates the codebase from C# to Java and the database from SQL Server to MySQL.
Design Goals for the Migration
Data consistency before and after migration
Minimize business downtime
Avoid invasive changes to application code
Support schema refactoring and field adjustments
Two migration patterns were evaluated: a short‑downtime "stop‑the‑world" approach for low‑volume services and an online, near‑zero‑downtime approach for high‑traffic systems.
Offline (Stop‑the‑World) Migration Flow
The process uses an ETL tool to extract data from SQL Server, transform it, load it into MySQL, then run consistency checks before switching the application to the new database.
Online Migration Flow
Full data is first copied, then incremental changes are captured in near‑real‑time. After the lag drops to seconds, a brief pause (seconds) is taken to flip the application to the new MySQL instance.
Key Problems to Solve
One‑to‑one mapping of schema objects between SQL Server and MySQL
Differences in data‑type usage and SQL dialect
Ensuring data consistency across the cut‑over
Supporting schema changes during migration
Achieving online migration without long downtime
Rollback strategy for post‑migration failures
Documentation & Reference Materials
SQL Server → MySQL datatype mapping table
MySQL usage guide and pitfalls
ETL tools that support schema changes (SQL Server → MySQL)
Online ETL tool (yugong) with custom extensions
Consistency‑check utility
Rollback utility
ETL Tool Landscape
Common utilities for homogeneous migrations include mysqldump, mysqlimport, pt-table-sync, and XtraBackup. For heterogeneous migrations the team evaluated Percona tools, DataX, yugong, DB2DB, MySQL Workbench, Kettle, Ispirer, Navicat, and DBImport. After filtering for performance and flexibility, three solutions were selected:
DB2DB – simple stop‑the‑world migration for small datasets
DataX – large‑scale stop‑the‑world migration using JSON configuration
yugong – customizable online migration framework extended to support SQL Server as both source and target
Consistency Check Tool
After ETL, a verification step ensures that source and target data match. The team examined Percona's pt-table-checksum, which is designed for MySQL master‑slave replication, and adapted its ideas. Yugong provides four modes (CHECK, FULL, INC, AUTO); the CHECK mode was used as a CDC‑style consistency verifier that reads primary‑key ranges via JDBC and compares batches.
Rollback Strategy
When a migration succeeds but later reveals critical bugs, the system must revert changes made after the cut‑over. The naive approach of instrumenting DAO layers was rejected due to invasiveness. Instead, the team leveraged MySQL binlog (row‑based) to capture post‑migration changes, parsed them with Alibaba's Canal, and replayed them to SQL Server using a custom Yugong writer. This creates a reversible ETL pipeline similar to the online migration flow.
Operational Practices
Before each migration the team performed full‑scale dry‑runs in a replica environment matching production hardware and data volume. Detailed runbooks and incident‑response plans were prepared. Metrics (memory, CPU, I/O, transaction rates) were monitored, and a set of health indicators defined for CDC performance testing.
Change Data Capture (CDC) in SQL Server
CDC captures row‑level changes from the transaction log into change tables (e.g., cdc.dbo_fruits_CT). Enabling CDC involves:
-- enable CDC for the database
sys.sp_cdc_enable_db;
-- enable CDC for a specific table
EXEC sys.sp_cdc_enable_table @source_schema = N'dbo', @source_name = N'fruits', @role_name = NULL;
-- list CDC‑enabled tables
SELECT name, is_cdc_enabled FROM sys.databases WHERE is_cdc_enabled = 1;The change table contains metadata columns ( __$start_lsn, __$end_lsn, __$operation, etc.) plus the original data columns. Sample query to retrieve changes between two timestamps:
DECLARE @begin_time datetime, @end_time datetime, @begin_lsn binary(10), @end_lsn binary(10);
SET @begin_time = '2017-09-11 14:03:00.000';
SET @end_time = '2017-09-11 14:10:00.000';
SELECT @begin_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than', @begin_time);
SELECT @end_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', @end_time);
SELECT * FROM cdc.fn_cdc_get_all_changes_dbo_fruits(@begin_lsn, @end_lsn, 'all');The __$operation column indicates the type of DML (1 = delete, 2 = insert, 3 = pre‑update, 4 = post‑update), allowing reconstruction of the exact sequence of changes.
CDC Tuning & Performance Testing
Key CDC parameters:
maxscans – maximum scan attempts per job (default 10)
maxtrans – maximum transactions scanned per batch (default 500)
pollinginterval – seconds between scans (default 5 s)
Performance tests (30 concurrent writers inserting 1 M rows) showed that CDC can sustain ~16 k TPS with negligible CPU/IO impact. Recommended optimizations include adjusting the three parameters, reducing immediate post‑insert scans, limiting tracked columns, disabling net‑changes when not needed, and cleaning up old CDC data during low‑load periods.
Online Migration with Yugong
Yugong, originally built for Oracle→MySQL, was extended to consume SQL Server CDC records and write them to MySQL. The framework abstracts source, target, and SQL templates, allowing the team to plug in a custom SQL Server reader and a MySQL writer. The source code for the SQL Server‑enabled Yugong version has been open‑sourced.
Rollback via Canal & Redis Queue
MySQL binlog (row‑based) is captured by Canal, which streams changes to a Redis queue. This decouples consumption from the source and enables replay to SQL Server if a rollback is required. The pipeline is:
Best‑Practice Checklist
The team codified a step‑by‑step checklist covering environment preparation, data‑size replication, dry‑run rehearsals, monitoring setup, and post‑migration validation. This checklist has been applied to 28 night‑time migrations covering 42 systems and nearly 100 billion rows, all completed successfully on the first attempt.
References
MySQL :: Guide to Migrating from Microsoft SQL Server to MySQL: https://www.mysql.com/it/why-mysql/white-papers/guide-to-migrating-from-sql-server-to-mysql/
mysqldump documentation: https://dev.mysql.com/doc/refman/5.7/en/mysqldump.html
mysqlimport documentation: https://dev.mysql.com/doc/refman/5.7/en/mysqlimport.html
pt-table-sync (Percona): https://www.percona.com/doc/percona-toolkit/LATEST/pt-table-sync.html
XtraBackup (Percona): https://www.percona.com/software/mysql-database/percona-xtrabackup
Database migration and synchronization tools: https://www.convert-in.com/
DataX (Alibaba): https://github.com/alibaba/DataX
yugong (Alibaba): https://github.com/alibaba/yugong
MySQL Workbench: https://www.mysql.com/cn/products/workbench/
Kettle (Data Integration): https://community.hds.com/docs/DOC-1009855
Ispirer migration tool: https://www.ispirer.cn/products/sql-server-to-mysql-migration
DB2DB migration tool: http://www.szmesoft.com/DB2DB
Navicat Premium: https://www.navicat.com/en/products/navicat-premium
DBImport (community tool): http://www.cnblogs.com/cyq1162/p/5637978.html
SQLAdvisor (Meituan‑Dianping): https://github.com/Meituan-Dianping/SQLAdvisor
Oracle Materialized View concepts: https://docs.oracle.com/cd/B10500_01/server.920/a96567/repmview.htm
SQL Server CDC performance tuning: https://docs.microsoft.com/en-us/previous-versions/sql/sql-server-2008/dd266396(v=sql.100)
Alibaba Yugong project: https://github.com/alibaba/yugong
Alibaba Canal project: https://github.com/alibaba/canal
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
