How Ctrip Migrated a Decade‑Old Hotel Order System from SQL Server to MySQL

Ctrip’s hotel order platform evolved from a single SQL Server instance to a multi‑IDC, sharded architecture, and this article details the middleware‑driven migration, unified caching, data‑chain redesign, consistency safeguards, and operational practices that enabled seamless, high‑performance order processing at millions of transactions per day.

dbaplus Community
dbaplus Community
dbaplus Community
How Ctrip Migrated a Decade‑Old Hotel Order System from SQL Server to MySQL

Since the first hotel order was recorded in 1999, Ctrip’s order storage has progressed from a single SQL Server database to a multi‑IDC, sharded solution, but the legacy design eventually hit capacity, performance, and cost limits. To support daily tens of millions of orders with high availability, low latency, and controllable costs, the team built a unified middleware layer that abstracts data access, introduces a global hot‑cache, and enables a transparent migration from SQL Server to MySQL.

Architecture Overview

The new middleware sits between applications and the databases, providing a unified data‑chain service. It decouples applications from direct DB connections, allowing the underlying storage to shift from SQL Server to MySQL without code changes. A diagram (Fig 1.1) illustrates the layered architecture: application → middleware → cache → DB (SQL Server/MySQL).

Key Application Scenarios

Sub‑second order synchronization : New orders are written to the DB via middleware, which simultaneously builds a cache entry. After the DB commit, an order message is pushed to downstream systems, an Elasticsearch index is created for search, and the data is archived to the data warehouse (T+1).

Automatic order dispatch & workbench : After a guest places an order, the middleware quickly forwards order details to merchants for confirmation and to internal workbenches for manual handling.

Online query & analytics : High‑QPS online queries hit the hot cache, while long‑running analytics use a Hive data‑warehouse pipeline.

System Upgrade Practices

The migration required hot‑migration with zero‑downtime. Middleware abstracts the data source, enabling dual‑write (SQL Server + MySQL) during the transition. The team implemented fine‑grained traffic control at the CRUD level, supporting whole‑DB, shard‑level, table‑level, or even individual operation routing.

1. Distributed Order Cache

Before caching, each service connected directly to the DB, causing duplicated queries and a single‑point‑of‑failure risk. The new cache stores hot order data in Redis; on a cache miss, the DB is queried and the result is written back to Redis. To reduce Redis key count, a domain‑based hash structure stores all related tables for an order under a single key (order‑id → {domain: value}), compressing values with protostuff to save space and network traffic.

2. Lossless Migration Technique

Dual‑write is performed with SQL Server as the primary writer and MySQL as the secondary. A synchronization job continuously compares recent changes and repairs inconsistencies, writing corrections to Elasticsearch for audit. The middleware’s configuration system can route reads/writes per table, allowing gradual migration from dual‑write to MySQL‑only reads and writes.

3. Data‑Warehouse Integration

The team follows a five‑layer ODS‑DIM‑EDW‑CDM‑ADM model. During migration, the ODS layer’s source is switched from SQL Server to MySQL. Consistency checks involve online jobs that compare row hashes and offline scripts that generate per‑table comparison jobs, handling date‑precision differences and XML‑type incompatibilities.

Core Challenges & Solutions

Fine‑grained traffic monitoring : Every CRUD operation is logged to Elasticsearch and visualized in Kibana/Grafana. DBTrace shows which DB executed each SQL, ensuring each migration step only proceeds after the previous one is verified.

Dual‑write consistency : Inconsistencies arise from missed writes, network glitches, or GC pauses. The team uses a job that treats SQL Server as the source of truth, repairs MySQL, and records diffs in ES. A fail‑over mechanism repeatedly retries until consistency is achieved.

Cache consistency : The team adopted a "write DB then delete cache" strategy with delayed double‑delete for sensitive tables. Optimistic locks and CUD markers prevent stale reads; queries bypass the cache when a lock or marker is present.

Database feature differences : Auto‑increment keys are synchronized by writing the generated SQL Server ID back to MySQL. Date‑time fields are truncated to seconds for comparison. XML fields unsupported in MySQL 5.7 are compressed and stored as VARCHAR, with length checks before insertion.

Alerting & Reliability Practices

Real‑time alerts monitor data‑drift between the two databases, order volume per shard, and cache health. A unified alert platform (Sitemon) duplicates alert rules for both data sources, enabling gray‑scale validation. The team also conducts regular "Wandering Earth" disaster‑recovery drills, simulating Redis failures, DB outages, and network partitions, and has added automatic middleware degradation based on observed latency spikes.

Future Plans

Develop a manual cache control console for NOC operators to isolate faulty Redis shards.

Implement automatic middleware degradation when cache‑related metrics exceed thresholds.

Integrate the middleware with a Service Mesh to simplify cloud migration and improve observability.

Overall, the migration showcases a comprehensive approach to modernizing a legacy order system, balancing performance, cost, and reliability while ensuring data integrity throughout the transition.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

middlewareorder processingmysqlSQL Server
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.