Ensuring Data Integrity in Multi‑Cloud Migration: MySQL, File & Object Storage Strategies
This article outlines practical techniques for multi‑cloud data migration, covering challenges, resource categories, migration phases, and detailed synchronization and verification methods for MySQL databases, file systems, and object storage using tools such as UDTS, pt‑table‑checksum, sync_diff_inspector and US3SYNC.
With the rise of multi‑cloud deployments, enterprises face critical challenges in migrating and synchronizing data across clouds, especially regarding data integrity, limited migration windows, and complex application dependencies.
Common Challenges
Data integrity and consistency
Time‑sensitive migration windows
Application dependencies and call relationships
Resource Types Involved
Three major categories of resources need attention during a cross‑cloud move:
Network services (EIP, VPC, load balancers, NAT gateways) – typically stateless and can be aligned manually.
Compute instances – migrated using UCloud Server Migration Center (USMC) with minute‑level incremental sync.
Storage services (databases, file storage, object storage) – migrated with the UDTS data‑transfer tool, which is the focus of this article.
Migration Phases
The process is divided into three stages: data synchronization, data cleaning (removing test‑generated dirty data), and cut‑over. The synchronization stage is the core, requiring both data copy to the new platform and validation that applications run correctly on the target.
MySQL Synchronization
Two main approaches are used:
Traditional MySQL master‑slave: capture binlog position, dump data up to that point, restore a standby, then replay binlog for incremental changes.
UDTS tool: performs an initial bulk copy followed by binlog‑based incremental sync, relying on the same consistency mechanisms as MySQL replication.
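The traditional workflow above depends on knowing the exact binlog position at which the dump was taken. The following is a minimal sketch, assuming the dump was produced by mysqldump with --single-transaction and --master-data=2 and therefore contains a commented-out CHANGE MASTER TO line (newer MySQL releases may emit a different statement). The function names and the sample header are hypothetical, and this is not how UDTS is implemented.

```python
import re
from typing import Optional, Tuple

# Matches the coordinate line that mysqldump writes when run with
# --master-data=2 (a commented-out CHANGE MASTER TO statement).
_COORD_RE = re.compile(
    r"CHANGE MASTER TO MASTER_LOG_FILE='(?P<file>[^']+)',\s*MASTER_LOG_POS=(?P<pos>\d+)"
)


def parse_binlog_coordinates(dump_header: str) -> Optional[Tuple[str, int]]:
    """Return (binlog_file, position) recorded in a dump, or None if absent."""
    match = _COORD_RE.search(dump_header)
    if match is None:
        return None
    return match.group("file"), int(match.group("pos"))


def build_change_master(host: str, user: str, log_file: str, log_pos: int) -> str:
    """Build the statement that points the standby at the recorded position.

    The password is deliberately left out; supply it through a secure channel.
    """
    return (
        f"CHANGE MASTER TO MASTER_HOST='{host}', MASTER_USER='{user}', "
        f"MASTER_LOG_FILE='{log_file}', MASTER_LOG_POS={log_pos};"
    )


# Hypothetical header line, shaped like the one found near the top of a dump.
sample_header = "-- CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000003', MASTER_LOG_POS=154;\n"
coords = parse_binlog_coordinates(sample_header)
if coords is not None:
    print(build_change_master("10.0.0.1", "repl", coords[0], coords[1]))
```

Once the standby has been restored from the dump, running the generated statement followed by START SLAVE lets it replay the incremental changes from that position onward.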
Case example: A company with dozens of MySQL instances found the manual dump‑import‑configure‑replication workflow cumbersome, so they adopted UDTS, allowing DBAs to provide source and target credentials and let the tool handle the sync.
Data Consistency Details
InnoDB’s transactional engine and MVCC allow a consistent snapshot to be read without blocking writers, whereas MyISAM has no transactions, so a consistent copy requires table read locks that block writes. UDTS leverages InnoDB‑style transaction isolation to maintain consistency during migration.
Data Verification
Beyond sync mechanisms, verification ensures true consistency:
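pt‑table‑checksum: creates a checksum table on the master (percona.checksums by default), sets the session binlog format to STATEMENT, and computes chunk‑wise checksums on the master. Because the checksum queries replicate as statements, each replica re‑executes them against its own data, and the master and replica results are then compared. It requires a master‑slave relationship and indexed tables.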
sync_diff_inspector: an open‑source TiDB tool that compares two MySQL instances without requiring replication or indexes, supports different database/table names, and pinpoints mismatched rows via binary search.
Both tools can be slow on large datasets (e.g., a 500 GB database took ~28 hours). In practice, simple row‑count or max‑ID checks are used during cut‑over windows.
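The two ideas behind these tools, chunk‑wise checksums and binary search within a mismatched chunk, can be sketched in a few lines. This is an illustration under simplifying assumptions, not the implementation of either tool: tables are modelled as in‑memory dicts keyed by an integer primary key, rows_in_range stands in for a ranged SELECT, and CRC32 is used for brevity even though it can collide. All names are hypothetical.

```python
import zlib
from typing import Dict, List, Tuple

Row = Tuple[int, str]  # (primary key, serialized column values)


def checksum(rows: List[Row]) -> int:
    """CRC32 accumulated over the rows in the order given."""
    crc = 0
    for pk, payload in rows:
        crc = zlib.crc32(f"{pk}:{payload}\n".encode("utf-8"), crc)
    return crc


def rows_in_range(table: Dict[int, str], lo: int, hi: int) -> List[Row]:
    """Rows whose key falls in [lo, hi), ordered by key (stand-in for a SELECT)."""
    return [(pk, table[pk]) for pk in sorted(table) if lo <= pk < hi]


def bisect_chunk(src: Dict[int, str], dst: Dict[int, str], lo: int, hi: int) -> List[int]:
    """Narrow a key range down to the individual keys that differ."""
    if checksum(rows_in_range(src, lo, hi)) == checksum(rows_in_range(dst, lo, hi)):
        return []
    if hi - lo == 1:
        return [lo]  # a single key is left, and it differs or is missing
    mid = (lo + hi) // 2
    return bisect_chunk(src, dst, lo, mid) + bisect_chunk(src, dst, mid, hi)


def diff_keys(src: Dict[int, str], dst: Dict[int, str], chunk_size: int = 1000) -> List[int]:
    """Compare two tables chunk by chunk and return the mismatched keys."""
    keys = set(src) | set(dst)
    if not keys:
        return []
    start, end = min(keys), max(keys) + 1
    mismatched: List[int] = []
    for lo in range(start, end, chunk_size):
        hi = min(lo + chunk_size, end)
        mismatched.extend(bisect_chunk(src, dst, lo, hi))
    return mismatched


source_table = {1: "a", 2: "b", 3: "c", 4: "d", 5: "e"}
target_table = {1: "a", 2: "B", 3: "c", 5: "e"}  # row 2 changed, row 4 missing
print(diff_keys(source_table, target_table, chunk_size=2))
```

Matching chunks are dismissed with a single comparison, so the cost of pinpointing rows is paid only where the data actually diverges.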
File Storage Synchronization
File storage migration relies on rsync‑like copying and MD5 checksums. Detecting changes is challenging, especially for NFS where inotify does not propagate across hosts. A typical approach records file‑level events in business logs, performs bulk rsync, then replays logged changes during cut‑over.
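Replaying logged changes at cut‑over is easier if the log is first collapsed into a final set of actions, so a file that was written and later deleted is not copied at all. Below is a minimal sketch, assuming the business log has already been parsed into (operation, path) pairs in chronological order. The two operation names and the function name are hypothetical, not part of any tool mentioned here.

```python
from typing import Iterable, Set, Tuple

Event = Tuple[str, str]  # (operation, path); operation is "write" or "delete"


def plan_replay(events: Iterable[Event]) -> Tuple[Set[str], Set[str]]:
    """Collapse an ordered event log into (paths to copy, paths to delete).

    Only the last event per path matters: a later delete cancels an earlier
    write, and a later write cancels an earlier delete.
    """
    to_copy: Set[str] = set()
    to_delete: Set[str] = set()
    for op, path in events:
        if op == "write":
            to_copy.add(path)
            to_delete.discard(path)
        elif op == "delete":
            to_delete.add(path)
            to_copy.discard(path)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return to_copy, to_delete


log = [
    ("write", "a.txt"),
    ("write", "b.txt"),
    ("delete", "a.txt"),
    ("write", "c.txt"),
    ("delete", "c.txt"),
    ("write", "c.txt"),
]
copy_set, delete_set = plan_replay(log)
print(sorted(copy_set), sorted(delete_set))
```

The copy set can then be handed to the same rsync‑like tool used for the bulk pass, and the delete set applied on the target.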
FastDFS provides a binlog‑like mechanism to track file operations (create, append, delete) across replicas, enabling incremental sync.
File Verification
Static storage can suffer silent errors (e.g., bad sectors). Full‑link verification (generating checksums before upload, storing them with the object, periodically recomputing, and verifying on read) mitigates this risk. For bulk checksum generation and validation, commands such as the following are used (the manifest file itself is excluded from the listing):
find ./ -type f ! -name my.md5 -print0 | xargs -0 md5sum > ./my.md5
md5sum -c my.md5
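Where the check needs to run inside an application rather than a shell, the same generate‑then‑verify idea can be sketched in Python. This is an illustrative equivalent, not a replacement for md5sum: the manifest is kept as an in‑memory dict of relative path to hex digest, and the function names are hypothetical.

```python
import hashlib
import os
from typing import Dict, List


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Hex MD5 of a file, read in blocks so large files do not fill memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(root: str) -> Dict[str, str]:
    """Map every file under root (by relative path) to its MD5 digest."""
    manifest: Dict[str, str] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_md5(full)
    return manifest


def verify_manifest(root: str, manifest: Dict[str, str]) -> List[str]:
    """Return the relative paths that are missing or whose digest changed."""
    bad: List[str] = []
    for rel, expected in sorted(manifest.items()):
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or file_md5(full) != expected:
            bad.append(rel)
    return bad
```

A typical use is to call build_manifest on the source tree before migration, persist the result, and call verify_manifest against the target tree after the copy; an empty list means every recorded file is present and unchanged.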
Object Storage Synchronization
Object storage sync sits between database and file storage in complexity. Using the US3SYNC tool, data from any S3‑compatible source can be migrated to UCloud US3. During the initial sync, avoid enabling mirror back‑to‑source on the target bucket before the data has been copied; otherwise requests for objects that do not yet exist on the target are fetched from the source, generating massive back‑origin traffic.
US3SYNC works with Redis to track processed keys, enabling resumable transfers after failures. It also supports a “compare mode” that checks source object metadata (e.g., HEAD size) and only transfers missing or differing objects, reducing memory pressure when dealing with billions of keys.
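The difference between the two modes can be illustrated with a small sketch. It does not reflect US3SYNC's actual code or options: buckets are modelled as dicts, the done set stands in for the Redis key set, and object size is the only metadata compared, so a change that keeps the size identical would be missed. All names are hypothetical.

```python
from typing import Dict, List, Set


def sync_objects(
    source: Dict[str, bytes],
    target: Dict[str, bytes],
    done: Set[str],
    compare_mode: bool = False,
) -> List[str]:
    """Copy objects from source to target and return the keys transferred.

    Default mode: skip keys already recorded in `done`, then record each
    new transfer there so an interrupted run can resume.
    Compare mode: keep no per-key state; skip a key only when the target
    already holds an object of the same size.
    """
    transferred: List[str] = []
    for key in sorted(source):
        if compare_mode:
            if key in target and len(target[key]) == len(source[key]):
                continue
        elif key in done:
            continue
        target[key] = source[key]
        if not compare_mode:
            done.add(key)
        transferred.append(key)
    return transferred


src_bucket = {"a": b"1", "b": b"22", "c": b"333"}

# Resuming after a failure: "a" was already transferred and recorded.
dst_bucket = {"a": b"1"}
processed = {"a"}
print(sync_objects(src_bucket, dst_bucket, processed))

# Compare mode: "b" exists on the target with a different size, "c" is missing.
stale_bucket = {"a": b"1", "b": b"2"}
print(sync_objects(src_bucket, stale_bucket, set(), compare_mode=True))
```

The trade‑off is visible in the signature: tracking processed keys costs memory proportional to the number of keys, while compare mode costs one metadata lookup per object instead.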
Object Storage Verification
Most object stores expose an ETag header derived from the file content. By reproducing the source’s ETag calculation and comparing it with the target’s returned ETag (or US3SYNC’s own calculation), integrity can be confirmed.
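As an illustration of reproducing a source ETag, the sketch below follows the convention commonly observed on S3‑style stores: a plain MD5 hex digest for a single‑request upload, and for a multipart upload the MD5 of the concatenated binary part digests followed by a dash and the part count. This is an assumption about the source store, not a documented guarantee; encrypted objects and other providers (including the target) may derive ETags differently. The function name and part size are hypothetical.

```python
import hashlib


def s3_style_etag(data: bytes, part_size: int) -> str:
    """Compute an S3-style ETag for `data`.

    Assumes objects no larger than `part_size` were uploaded in a single
    request, and larger ones as multipart uploads with fixed-size parts.
    """
    if len(data) <= part_size:
        return hashlib.md5(data).hexdigest()
    part_digests = [
        hashlib.md5(data[offset:offset + part_size]).digest()
        for offset in range(0, len(data), part_size)
    ]
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"


print(s3_style_etag(b"hello", part_size=8))
print(s3_style_etag(b"a" * 10, part_size=4))
```

The part size must match the one used for the original upload; a different part size produces a different ETag for identical content, which is a common cause of false mismatches.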
Conclusion
Multi‑cloud deployment is now the norm, and UCloud’s experience with USMC and UDTS demonstrates that, with careful planning and the right tools, massive data can be migrated within tight windows while preserving integrity and consistency. This article covered MySQL, file, and object storage synchronization; the next installment will discuss data cleaning and cut‑over techniques.
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
