
Migrating AutoHome Community from SQL Server to TiDB: Architecture, Testing, and Lessons Learned

This article details the AutoHome community's migration from a monolithic SQL Server database to the distributed TiDB platform, covering the performance bottlenecks that prompted the change, the evaluation of candidate databases, extensive OLTP/OLAP testing, the full‑ and incremental‑sync migration strategy, rollback mechanisms, and the resulting operational improvements.

HomeTech

The AutoHome community, launched in 2005, has accumulated billions of posts and replies. It serves tens of millions of daily active users and handles over 1 billion API calls per day. The legacy SQL Server storage could no longer scale, so a new database solution was needed.

The team identified two critical bottlenecks. First, the existing sharding design made it difficult to implement new features that required logical data aggregation. Second, limited disk capacity on the servers forced frequent, costly hardware upgrades. The team needed a solution that could scale online without interrupting service.

In late 2018, a virtual architecture group evaluated three popular distributed databases (TiDB, Ignite, and CockroachDB) and selected TiDB for its MySQL compatibility, strong Raft‑based consistency, active community, and close collaboration with the TiDB team.

TiDB’s HTAP capabilities address the identified pain points: horizontal scalability, support for billions of rows, high availability via Raft, and built‑in OLAP support (with TiSpark for advanced analytics).

Testing supported the choice. In OLTP tests with 20 million rows and 500 concurrent threads, 99% of responses completed in under 16 ms. In a 50 GB TPC‑H benchmark, TiDB outperformed MySQL by a large margin. Detailed query latency results are listed in the table.

The migration plan consists of a full‑sync phase using the open‑source Yugong tool (forked to support SQL Server) and an incremental sync phase leveraging SQL Server Change Data Capture (CDC). Yugong’s ETL pipeline was extended with a custom extractor that discovers tables automatically, a RowDataMergeTranslator that discards original auto‑increment keys, and an applier that writes directly to TiDB.

Key code snippets:

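Yugong (upstream): https://github.com/alibaba/yugong
Yugong (fork): https://github.com/alswl/yugong

-- Query online databases
SELECT name FROM sys.databases WITH (nolock) WHERE state_desc = 'ONLINE'
-- Query tables with CDC enabled (%s is the database name)
SELECT name FROM %s.sys.tables t WITH (nolock) JOIN %s.[cdc].[change_tables] ct WITH (nolock) ON t.object_id = ct.source_object_id

// Discard the configured key column (the original auto-increment key)
record.removeColumnByName(config.getDiscardKey());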
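
The extract, translate, and apply stages described above can be pictured with a minimal Python sketch. This is not Yugong's actual Java API: `extract`, `drop_discard_key`, and `apply_rows` are hypothetical names, rows are plain in-memory dicts, and a list stands in for TiDB.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


def extract(source: Iterable[Row]) -> Iterator[Row]:
    """Extractor stand-in: yield a copy of each source row."""
    for row in source:
        yield dict(row)  # copy so later stages never mutate the source


def drop_discard_key(rows: Iterable[Row], discard_key: str) -> Iterator[Row]:
    """Translator stand-in: remove the original auto-increment column."""
    for row in rows:
        row.pop(discard_key, None)
        yield row


def apply_rows(rows: Iterable[Row], target: List[Row]) -> int:
    """Applier stand-in: append rows to the target and return the count."""
    written = 0
    for row in rows:
        target.append(row)
        written += 1
    return written


# Hypothetical sample data; column names are illustrative only.
source_rows = [
    {"id": 1, "topic_id": 100, "body": "first"},
    {"id": 2, "topic_id": 100, "body": "second"},
]
target_rows: List[Row] = []
written = apply_rows(drop_discard_key(extract(source_rows), "id"), target_rows)
```

Chaining generators keeps only one row in flight at a time, which matters when the source holds billions of rows; a real applier would batch its writes instead of appending one row at a time.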

During incremental sync, CDC logs are read, transformed into messages, and published to Kafka. Consumers apply the changes to TiDB and also perform delayed validation by re‑reading the written rows; if discrepancies are found, retries and alerts are triggered. To avoid CPU overload, only the most active tables are polled frequently.
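
The apply-then-validate loop described above might look roughly like the following Python sketch. Everything here is an assumption made for illustration: the change format (`op`, `key`, `row`), the function names, and the use of a dict in place of TiDB. The sketch also re-reads the row immediately, whereas the validation described in the article is delayed.

```python
from typing import Callable, Dict, Optional

Row = Dict[str, object]
Change = Dict[str, object]  # hypothetical shape: {"op", "key", "row"}


def apply_change(store: Dict[object, Row], change: Change) -> None:
    """Apply one simplified CDC change to the target store."""
    if change["op"] == "delete":
        store.pop(change["key"], None)
    else:  # "insert" and "update" are both treated as upserts
        store[change["key"]] = dict(change["row"])


def expected_row(change: Change) -> Optional[Row]:
    """Row the target should hold once the change has been applied."""
    return None if change["op"] == "delete" else change["row"]


def apply_with_validation(
    store: Dict[object, Row],
    change: Change,
    max_retries: int = 3,
    alert: Callable[[str], None] = print,
) -> bool:
    """Apply a change, re-read the row, retry on mismatch, then alert."""
    for _ in range(max_retries):
        apply_change(store, change)
        if store.get(change["key"]) == expected_row(change):
            return True
    alert(f"validation failed for key {change['key']!r} "
          f"after {max_retries} attempts")
    return False
```

Treating inserts and updates as upserts keeps the consumer idempotent, so replaying the same Kafka message after a retry does not corrupt the target.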

Rollback is handled via TiDB’s binlog (Pump and Drainer) streamed to Kafka, allowing data to be replayed back to SQL Server if TiDB encounters unrecoverable issues.
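
To illustrate the replay step, here is a minimal Python sketch that turns a change event into a parameterized statement for SQL Server. The event shape (`table`, `op`, `key`, `row`) is a made-up simplification, not the actual message format that Drainer writes to Kafka, and `to_statement` is a hypothetical helper. The statement is only built, not executed.

```python
from typing import Dict, List, Tuple

Event = Dict[str, object]  # hypothetical shape: {"table", "op", "key", "row"}


def quote(name: str) -> str:
    """Bracket-quote a SQL Server identifier, escaping closing brackets."""
    return "[" + name.replace("]", "]]") + "]"


def to_statement(event: Event) -> Tuple[str, List[object]]:
    """Build a parameterized statement (qmark style) for one change event."""
    table = quote(event["table"])
    key = event["key"]
    row = event.get("row") or {}
    where = " AND ".join(f"{quote(col)} = ?" for col in key)

    if event["op"] == "insert":
        cols = ", ".join(quote(col) for col in row)
        marks = ", ".join("?" for _ in row)
        return (f"INSERT INTO {table} ({cols}) VALUES ({marks})",
                list(row.values()))
    if event["op"] == "update":
        sets = ", ".join(f"{quote(col)} = ?" for col in row)
        return (f"UPDATE {table} SET {sets} WHERE {where}",
                list(row.values()) + list(key.values()))
    if event["op"] == "delete":
        return f"DELETE FROM {table} WHERE {where}", list(key.values())
    raise ValueError(f"unsupported operation: {event['op']!r}")
```

Keeping values out of the SQL text and passing them as parameters avoids quoting problems with user-generated post content.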

Post‑migration, the team updated data‑access layers to support MySQL syntax, removed stored procedures, rewrote incompatible SQL, and rebuilt indexes based on TiDB’s architecture, achieving TP99 response times of 12 ms and TP99.9 of 62 ms at billion‑row scale.
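
For readers unfamiliar with the TP99 and TP99.9 figures quoted above, the following Python sketch computes such percentiles from latency samples using the nearest-rank method. The article does not say which percentile definition the team's monitoring uses, so this method is an assumption, and `percentile` is a hypothetical helper.

```python
import math
from fractions import Fraction
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Fraction avoids floating-point rounding when computing the rank.
    rank = math.ceil(Fraction(str(pct)) * len(ordered) / 100)
    return ordered[rank - 1]
```

With 1,000 samples, TP99 is the 990th smallest latency and TP99.9 is the 999th, which is why TP99.9 is much more sensitive to a handful of slow requests.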

Additional ecosystem work includes establishing TiDB development, operation, and release standards, building a real‑time slow‑SQL analysis tool (TiSlowSQL), creating monitoring solutions, and conducting regular training and knowledge sharing.

In summary, the migration to TiDB has delivered seamless online scaling, cost‑effective storage expansion, and a stable performance profile, while providing reusable migration experience for other teams and contributing back to the open‑source community.
