Practical Guide to Large‑Scale Data Migration Using Sharding‑Proxy
This article walks through migrating a large volume of billing data into 32 sharded databases with Sharding‑Proxy. It covers the background and objectives, four candidate migration solutions, proxy installation and configuration, debugging, the migration workflow, data validation, and common issues with their resolutions.
Background & Objectives : Rapid growth of billing data requires splitting the legacy database into 32 shards, ensuring smooth migration with minimal downtime and supporting rollback for individual databases.
Solution Options : Four approaches are compared – (1) open‑source Sharding‑Proxy, (2) Hive middleware, (3) semi‑self‑developed program (DTS + JDQ), and (4) fully self‑developed program. A table evaluates forward/reverse support, high‑availability, performance, and lists pros and cons for each.
Proxy Overview : Sharding‑Proxy is a transparent database proxy supporting MySQL and PostgreSQL protocols. Its architecture consists of a NIO‑based frontend, the Sharding‑Core for SQL parsing/routing, and a backend using HikariCP for real database connections.
Installation : Download the appropriate version (e.g., 4.1.1) from the official site, unzip, place the MySQL driver JAR (mysql‑connector‑java‑5.1.44.jar) into the lib directory, and start the proxy with ./start.sh in the bin folder.
Configuration : Key configuration files include server.yaml (governance, user permissions, data source parameters), config‑sharding.yaml (schema‑to‑datasource mapping, sharding rules), and logback.xml (logging). Example snippets (Spring properties defining a master‑slave group with two slaves):
spring.shardingsphere.datasource.names=defaultmaster,slave0,slave1
spring.shardingsphere.sharding.default-data-source-name=groupname1
spring.shardingsphere.sharding.master-slave-rules.groupname1.master-data-source-name=defaultmaster
spring.shardingsphere.sharding.master-slave-rules.groupname1.slave-data-source-names[0]=slave0
spring.shardingsphere.sharding.master-slave-rules.groupname1.slave-data-source-names[1]=slave1
spring.shardingsphere.sharding.master-slave-rules.groupname1.load-balance-algorithm-type=round_robin
Debugging & Testing : Connect to the proxy using Navicat or the MySQL client, verify that queries without a sharding key scan all shards, while queries with the sharding key are routed to the correct physical database.
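The routing behavior checked during debugging can be illustrated with a minimal Python sketch. It assumes a simple modulo rule over a numeric sharding key and hypothetical database names of the form billing_db_N; the article does not state the actual sharding algorithm or naming, so treat both as placeholders rather than the real configuration.

```python
from typing import List, Optional

SHARD_COUNT = 32  # the article's target of 32 sharded databases


def shard_names() -> List[str]:
    """Hypothetical physical database names, one per shard."""
    return [f"billing_db_{i}" for i in range(SHARD_COUNT)]


def route(sharding_key: Optional[int]) -> List[str]:
    """Return the databases a query would touch.

    Assumption: modulo sharding on a numeric key. Without a key the
    query has to be broadcast to every shard; with a key it lands on
    exactly one physical database.
    """
    if sharding_key is None:
        return shard_names()
    return [f"billing_db_{sharding_key % SHARD_COUNT}"]


if __name__ == "__main__":
    print(len(route(None)))  # number of shards scanned without a key
    print(route(35))         # single target when the key is present
```

Comparing the proxy's logged SQL against a model like this is one way to confirm that keyed queries hit a single shard and keyless queries fan out to all of them.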
Data Migration Steps : (1) Deploy Sharding‑Proxy in production, (2) Create DTS migration tasks for each shard and start synchronization, (3) Perform data integrity checks – full data comparison, time‑slice sampling, random row verification, and full‑volume validation.
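The integrity checks in step (3) can be sketched roughly as follows. This is an illustrative outline, not the tooling used in the article: rows are assumed to be tuples whose first element is the primary key, and the helper names compare_rows and sample_keys are hypothetical.

```python
import hashlib
import random
from typing import Dict, Iterable, List, Tuple

Row = Tuple  # assumed layout: (primary_key, col1, col2, ...)


def row_digest(row: Row) -> str:
    """Hash every column so two rows can be compared by one value."""
    joined = "\x1f".join(repr(value) for value in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def compare_rows(source: Iterable[Row], target: Iterable[Row]) -> Dict[str, List]:
    """Full comparison: report missing, unexpected, and differing rows."""
    src = {row[0]: row_digest(row) for row in source}
    dst = {row[0]: row_digest(row) for row in target}
    return {
        "missing_in_target": sorted(k for k in src if k not in dst),
        "unexpected_in_target": sorted(k for k in dst if k not in src),
        "mismatched": sorted(k for k in src if k in dst and src[k] != dst[k]),
    }


def sample_keys(keys: Iterable, n: int, seed: int = 0) -> List:
    """Random row verification: pick up to n primary keys to re-check."""
    pool = sorted(keys)
    return random.Random(seed).sample(pool, min(n, len(pool)))


if __name__ == "__main__":
    legacy = [(1, "a", 10), (2, "b", 20), (3, "c", 30)]
    sharded = [(1, "a", 10), (2, "b", 21), (4, "d", 40)]
    print(compare_rows(legacy, sharded))
```

For time-slice sampling, the same compare_rows call can be fed rows selected by a creation-time window from both the legacy database and the proxy, which keeps each comparison small enough to run while synchronization is still in progress.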
Common Issues & Solutions : Empty sharding keys are unsupported: set a default key value, or upgrade to Sharding‑Proxy 5.x, which allows key updates. For update‑on‑sharding‑key errors, either upgrade the proxy version or adjust DTS to exclude the sharding key from updates.
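Both workarounds can be pictured with a small Python sketch. It is not DTS configuration or proxy code: the default value, the table and column names, and the helper functions are all hypothetical, and the sketch only shows the idea of filling in an empty sharding key and keeping that key out of the SET clause of a replayed UPDATE.

```python
from typing import Any, Dict, List, Tuple

DEFAULT_SHARDING_VALUE = 0  # hypothetical default for rows with no key


def normalize_sharding_key(row: Dict[str, Any], sharding_column: str) -> Dict[str, Any]:
    """Return a copy of the row with an empty sharding key replaced."""
    fixed = dict(row)
    if fixed.get(sharding_column) in (None, ""):
        fixed[sharding_column] = DEFAULT_SHARDING_VALUE
    return fixed


def build_update(table: str, row: Dict[str, Any], pk_column: str,
                 sharding_column: str) -> Tuple[str, List[Any]]:
    """Build an UPDATE that never assigns the sharding key.

    The key appears only in the WHERE clause, so the statement can be
    routed to one shard without triggering an update-on-key error.
    """
    set_columns = [c for c in row if c not in (pk_column, sharding_column)]
    if not set_columns:
        raise ValueError("nothing to update besides key columns")
    assignments = ", ".join(f"{c} = %s" for c in set_columns)
    sql = (f"UPDATE {table} SET {assignments} "
           f"WHERE {pk_column} = %s AND {sharding_column} = %s")
    params = [row[c] for c in set_columns] + [row[pk_column], row[sharding_column]]
    return sql, params


if __name__ == "__main__":
    change = {"id": 7, "user_id": None, "amount": 120, "status": "PAID"}
    print(build_update("bill", normalize_sharding_key(change, "user_id"), "id", "user_id"))
```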
Multi‑Master/Slave Example : Spring configuration for multiple slaves demonstrates how to define master‑slave groups and load‑balance using round‑robin.
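What the round_robin setting implies for read traffic can be sketched in Python. This is a simplified model and not ShardingSphere's implementation: the data source names defaultmaster, slave0, and slave1 come from the properties above, while the class, the function, and the naive "starts with SELECT" read check are assumptions made for illustration.

```python
import itertools
from typing import Iterable


class RoundRobinSlaves:
    """Hand out slave data sources in a fixed rotating order."""

    def __init__(self, slaves: Iterable[str]) -> None:
        names = list(slaves)
        if not names:
            raise ValueError("at least one slave is required")
        self._cycle = itertools.cycle(names)

    def next_slave(self) -> str:
        return next(self._cycle)


def pick_data_source(sql: str, balancer: RoundRobinSlaves,
                     master: str = "defaultmaster") -> str:
    """Send reads to the next slave and everything else to the master.

    Assumption: a statement is a read if it starts with SELECT.
    """
    is_read = sql.lstrip().lower().startswith("select")
    return balancer.next_slave() if is_read else master


if __name__ == "__main__":
    balancer = RoundRobinSlaves(["slave0", "slave1"])
    for statement in ("SELECT 1", "SELECT 2", "INSERT INTO t VALUES (1)", "SELECT 3"):
        print(statement, "->", pick_data_source(statement, balancer))
```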
References : Official ShardingSphere documentation, download links, MySQL connector page, Logback site, and several technical blog posts are provided for further reading.
JD Tech
Official JD technology sharing platform. All the cutting‑edge JD tech, innovative insights, and open‑source solutions you’re looking for, all in one place.
