How to Seamlessly Migrate a Live System to Sharding with Dual‑Write and Diff
This article explains how Qunar’s ticket ancillary service upgraded from a single-database architecture to a sharded one without downtime, detailing dual-write, transaction handling, mapping-key routing, diff verification, the challenges encountered, and a component-based solution that makes future migrations reusable.
Background
Sharding (splitting databases and tables) is a common optimization for large‑scale internet applications. Middleware such as sharding‑jdbc or MyCAT can handle most sharding needs, but the real difficulty lies in upgrading an existing monolithic data layer to a sharded one without downtime.
Original Problem
Qunar’s ticket ancillary business built its own sharding middleware qdb, which only solves the “how to shard” problem. Directly switching to qdb raises three critical issues:
How to roll back if data errors occur during migration, especially when new data may have been written to the new shards.
How to handle SQL that does not contain the sharding key, because rewriting all legacy SQL is impractical.
How to verify that the system after sharding is functionally equivalent to the pre‑sharding system.
First Smooth Migration Practice
The first migration adopted three techniques that correspond to the three problems above:
Dual‑write: write to both the old single‑database and the new sharded database.
Special transaction handling to keep the two data sources consistent.
Diff: compare the data in the old and new databases to confirm that the two are equivalent.
Pre‑knowledge: MyBatis and MyBatis‑Spring
Understanding MyBatis’s architecture is essential. The framework consists of three layers:
Interface layer: the SqlSession API and Mapper interfaces.
Data‑processing layer: parameter handling, SQL parsing, execution, and result mapping.
Infrastructure layer: connection management, transaction handling, configuration loading, and caching.
MyBatis‑Spring registers Mapper interfaces via @MapperScan, creates MapperFactoryBean instances, and injects a SqlSessionTemplate that delegates to the underlying SqlSessionFactory.
Dual‑Write Implementation
During the upgrade the original single‑database remains active while a sharded database is built according to business‑line and month‑based rules. All write operations are intercepted by a MyBatis plugin that rewrites the SQL and routes it to both data sources. Reads are performed against the old database until the router confirms that read‑and‑write results are equivalent, after which the old database can be decommissioned.
The plugin also integrates with Qunar’s configuration center (qconfig) to roll out the switch gradually.
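The dual‑write flow with a gradual switch can be sketched in plain Java. This is only an illustration, not the article's MyBatis plugin: FakeStore, DualWriter, and the percentage‑based rollout rule are hypothetical names and assumptions, and the in‑memory maps stand in for the two datasources.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntSupplier;

// Hypothetical stand-in for a datasource: an in-memory key/value table.
class FakeStore {
    final Map<String, String> rows = new HashMap<>();
    void write(String key, String value) { rows.put(key, value); }
    String read(String key) { return rows.get(key); }
}

// Writes always go to the old database; the sharded database receives the
// same write only for keys that fall inside the current rollout percentage.
class DualWriter {
    private final FakeStore oldDb;
    private final FakeStore shardedDb;
    private final IntSupplier rolloutPercent; // assumed to come from a config center, 0..100

    DualWriter(FakeStore oldDb, FakeStore shardedDb, IntSupplier rolloutPercent) {
        this.oldDb = oldDb;
        this.shardedDb = shardedDb;
        this.rolloutPercent = rolloutPercent;
    }

    void write(String key, String value) {
        oldDb.write(key, value);
        if (inRollout(key)) {
            shardedDb.write(key, value);
        }
    }

    // Reads stay on the old database until equivalence has been confirmed.
    String read(String key) {
        return oldDb.read(key);
    }

    private boolean inRollout(String key) {
        // floorMod keeps the bucket in 0..99 even for negative hash codes.
        return Math.floorMod(key.hashCode(), 100) < rolloutPercent.getAsInt();
    }
}
```

Because the rollout value is read on every write, raising it in the configuration takes effect without a restart, which is the property a gradual switch needs.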
Key Technical Challenges
Two major problems arise:
How to switch the target datasource inside MyBatis.
How to duplicate the SQL execution in the new datasource and replace the original result.
The solution involves:
Obtaining a new SqlSession from the sharded datasource when a transaction is active.
Setting autoCommit=false on the new connection and manually managing reference counts to avoid connection leaks.
Creating a separate Statement for the sharded execution because each Connection owns its own Statement.
Isolating the sharded execution in a child context so that any changes to parameters or thread‑local variables do not affect the parent context.
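The child‑context isolation in the last point can be illustrated with a thread‑local that is saved before the sharded execution and restored afterwards. This is a minimal sketch under assumptions: RoutingContext and runInChild are hypothetical names, and the real component tracks more state than a single string.

```java
import java.util.function.Supplier;

// Holds the datasource target for the current thread.
class RoutingContext {
    private static final ThreadLocal<String> TARGET = ThreadLocal.withInitial(() -> "old");

    static String current() {
        return TARGET.get();
    }

    // Runs body with a temporary target and restores the parent's value
    // afterwards, even if body throws.
    static <T> T runInChild(String target, Supplier<T> body) {
        String parent = TARGET.get();
        TARGET.set(target);
        try {
            return body.get();
        } finally {
            TARGET.set(parent);
        }
    }
}
```

The finally block is the important part: a failure in the sharded execution must not leave the thread pointing at the sharded datasource when control returns to the original statement.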
Mapping‑Key Strategy
When a query lacks the sharding key, a “mapping key” is used. The mapping key is looked up in a dedicated mapping table, which yields the actual sharding key and therefore the physical table location. Example: a coupon code (couponId) maps to an order ID (orderId), allowing the query to use only couponId. An indirect mapping based on order creation time is also supported.
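The couponId to orderId lookup can be sketched as follows. The names are hypothetical, the in‑memory map stands in for the mapping table, and the modulo rule is an assumption used only to keep the example short; the article's real rules are based on business line and month.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class MappingKeyRouter {
    // Stand-in for the dedicated mapping table: mapping key -> sharding key.
    private final Map<String, Long> couponToOrder = new HashMap<>();
    private final int shardCount;

    MappingKeyRouter(int shardCount) {
        this.shardCount = shardCount;
    }

    // Maintained on the write path, when the sharding key is known.
    void recordMapping(String couponId, long orderId) {
        couponToOrder.put(couponId, orderId);
    }

    // Assumed routing rule for illustration: orderId modulo shard count.
    String tableForOrder(long orderId) {
        return "coupon_" + Math.floorMod(orderId, (long) shardCount);
    }

    // A query that only carries couponId is resolved through the mapping.
    Optional<String> tableForCoupon(String couponId) {
        Long orderId = couponToOrder.get(couponId);
        return orderId == null ? Optional.empty() : Optional.of(tableForOrder(orderId));
    }
}
```

An empty result means the mapping is missing, in which case the caller has to decide between failing the query and falling back to a full scan.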
For full‑table scans (e.g., cleaning expired coupons) the component provides an API that iterates over every physical database and table, invoking a user‑supplied callback for each batch.
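A callback‑driven scan of this kind might look like the sketch below. ShardScanner and forEachBatch are hypothetical names, not the component's actual API, and each physical table is modelled as an in‑memory list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

class ShardScanner {
    // shards maps a physical table name to its rows. The callback receives
    // the table name and one batch of at most batchSize rows at a time.
    static <T> void forEachBatch(Map<String, List<T>> shards,
                                 int batchSize,
                                 BiConsumer<String, List<T>> callback) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        for (Map.Entry<String, List<T>> shard : shards.entrySet()) {
            List<T> rows = shard.getValue();
            for (int from = 0; from < rows.size(); from += batchSize) {
                int to = Math.min(from + batchSize, rows.size());
                // Copy the slice so the callback cannot mutate the shard's list.
                callback.accept(shard.getKey(), new ArrayList<>(rows.subList(from, to)));
            }
        }
    }
}
```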
Diff and Transaction
To prove equivalence, a diff is performed. Offline diff compares all historical data; online (real‑time) diff checks the result of each write operation. When the diff converges to zero within 24 hours, the two sides are considered equivalent.
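The core of such a diff is a key‑by‑key comparison of the two sides. The sketch below assumes each side has already been loaded into a map from primary key to a serialized row; RowDiff is a hypothetical name and the article does not describe its comparison code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

class RowDiff {
    // Returns the keys whose rows differ, including rows present on one side only.
    // Note: a key that maps to null is treated the same as a missing key.
    static List<String> diff(Map<String, String> oldRows, Map<String, String> newRows) {
        Set<String> keys = new TreeSet<>(oldRows.keySet());
        keys.addAll(newRows.keySet());
        List<String> mismatches = new ArrayList<>();
        for (String key : keys) {
            if (!Objects.equals(oldRows.get(key), newRows.get(key))) {
                mismatches.add(key);
            }
        }
        return mismatches;
    }
}
```

An empty result for a batch corresponds to the diff being zero for that batch; the offline and online variants differ in which rows are fed in, not in the comparison itself.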
Because both the old and new databases must be updated in a single logical transaction, a “distributed‑like” transaction is implemented: the primary Spring transaction manages the old datasource, while a manual transaction wrapper handles the sharded datasource, synchronising commit/rollback via Spring’s TransactionSynchronization callbacks.
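The coordination pattern can be shown without Spring. In the sketch below, PrimaryTx plays the role of the Spring‑managed transaction and its completion callbacks stand in for TransactionSynchronization; both classes are hypothetical simplifications, not the article's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Stand-in for the primary transaction on the old datasource.
class PrimaryTx {
    private final List<Consumer<Boolean>> completions = new ArrayList<>();

    // Registered callbacks receive true on commit and false on rollback.
    void onCompletion(Consumer<Boolean> callback) { completions.add(callback); }

    void commit() { completions.forEach(c -> c.accept(true)); }

    void rollback() { completions.forEach(c -> c.accept(false)); }
}

// Manually managed transaction on the sharded datasource.
class ShardTx {
    final List<String> pending = new ArrayList<>();
    final List<String> committed = new ArrayList<>();

    void write(String row) { pending.add(row); }

    void commit() { committed.addAll(pending); pending.clear(); }

    void rollback() { pending.clear(); }

    // Follow the outcome of the primary transaction.
    void bindTo(PrimaryTx primary) {
        primary.onCompletion(ok -> { if (ok) commit(); else rollback(); });
    }
}
```

This is not a two‑phase commit: if the sharded commit fails after the primary has committed, the two sides diverge, which is one reason a scheme like this is paired with a diff.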
New Issues After First Migration
Testing is cumbersome because the sharding environment must be started for unit tests.
Maintenance overhead is high due to many annotations and configuration entries.
Limited reusability; the sharding rules are hard‑coded and difficult to change (e.g., switching from monthly to weekly sharding).
Component‑Based Migration Solution
The second iteration wraps the entire migration logic into a reusable component consisting of three layers:
Access layer: stable APIs, Spring starters, and annotations.
Core layer: routing logic, lifecycle management, and plugin points.
Storage layer: actual SQL execution and transaction coordination (MyBatis + datasource).
The component lives in two Maven modules: qmall_db_core (core + storage) and qmall_db_shell (access). Configuration is driven by a sharding.properties file placed in resources. An excerpt of the file is shown below.
# Database prefix (for qdb compatibility)
db.prefix=qmall_supply_
# Database index per business line
db.index.qmall.flight={dbIndex: 0}
db.index.qmall.inter={dbIndex: 1}
db.index.qmall.ticket={dbIndex: 2}
db.index.qmall.hermes={dbIndex: 3}
# Sharding key configuration
sharding.user_info=[{shardingKey:'last_name',intervalMonth:2,hashCount:0,startTime:'2020-11-01'}, ...]
sharding.supply_order=[{shardingKey:'supply_order_id',intervalMonth:1,hashCount:2,startTime:'2022-11-01',hashGroupReg:'20[0-9]{2}(0[1-9]|1[0-2])[0-9]{6}'}]
# Mapping‑key configuration (higher priority wins)
table.supply_order=[{mapKey:'business_order_id',type:'one2many',priority:1,maintain:'auto_manual'}]
table.user_info=[{mapKey:'id',type:'one2one',priority:1}, {mapKey:'phone',type:'one2one',priority:1,maintain:'auto_manual'}]
By adjusting the sharding.properties file, a single table can be turned into a sharded table without code changes. Different environments (local, test, production) can supply different property files, allowing seamless local testing with a single database and production with full sharding.
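To make the sharding parameters concrete, the sketch below shows one plausible reading of intervalMonth, hashCount, and startTime. The semantics are assumed, not taken from the component: months since startTime are grouped into buckets of intervalMonth, and when hashCount is greater than zero each bucket is further split by the key modulo hashCount. MonthSharding and the suffix format are hypothetical.

```java
import java.time.YearMonth;
import java.time.temporal.ChronoUnit;

class MonthSharding {
    // Returns a table suffix such as "202212_1" under the assumed semantics.
    static String suffix(YearMonth start, int intervalMonth, int hashCount,
                         YearMonth created, long key) {
        long months = ChronoUnit.MONTHS.between(start, created);
        if (months < 0) {
            throw new IllegalArgumentException("created is before startTime");
        }
        // Round down to the first month of the bucket that contains 'created'.
        YearMonth bucketStart = start.plusMonths((months / intervalMonth) * intervalMonth);
        String base = String.format("%04d%02d", bucketStart.getYear(), bucketStart.getMonthValue());
        // hashCount of 0 is read here as "no hash split".
        return hashCount > 0 ? base + "_" + Math.floorMod(key, (long) hashCount) : base;
    }
}
```

Under this reading, changing intervalMonth in the property file changes how many months share a table, with no change to the calling code.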
Summary
The article presents two migration designs. The first applies dual‑write and diff to safely migrate a long‑running system, using a mapping‑key mechanism to avoid massive SQL rewrites. The second packages the same ideas into a component that isolates the sharding logic, simplifies configuration, and supports DDD‑driven micro‑service refactoring.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
