Using sync-diff-inspector for MySQL/TiDB Data Validation and Repair
This article introduces the open‑source sync‑diff‑inspector tool, explains how to download and install it, and provides detailed configuration and command‑line examples for schema‑level, table‑level, and range‑based data validation between MySQL instances, including automatic SQL repair generation and common troubleshooting tips.
1. Introduction to sync-diff-inspector
sync-diff-inspector is an open‑source data verification tool from PingCAP that compares the structure and data of two MySQL/TiDB datasets, generates SQL statements to fix inconsistencies, and supports cross‑database, sharding, and TiDB cluster scenarios.
Compare table structures and data
Generate repair SQL when data differs
Validate data across different database or table names
Support sharding and TiDB master‑slave clusters
Validate data pulled from TiDB DM configurations
Limitations include the need for offline validation (no writes during comparison), handling of floating‑point columns, and potential repair issues for tables without primary keys.
2. Installing sync-diff-inspector
#sync-diff-inspector is bundled in the TiDB toolkit; download and extract it
wget https://download.pingcap.org/tidb-community-toolkit-v6.4.0-linux-amd64.tar.gz
tar zxvf tidb-community-toolkit-v6.4.0-linux-amd64.tar.gz
ls -lh tidb-community-toolkit-v6.4.0-linux-amd64 | grep sync_diff_inspector
./sync_diff_inspector -V3. Usage Examples
3.1 Common configuration file
cat config.toml
check-thread-count = 4
export-fix-sql = true
check-struct-only = false
[data-sources]
[data-sources.mysql1]
host = "10.186.65.57"
port = 3306
user = "sgy"
password = "admin"
route-rules = ["rule1"]
[data-sources.mysql2]
host = "10.186.65.89"
port = 3309
user = "sgy"
password = "admin"3.2 Schema‑based validation
Validate a single schema:
# routes configuration for a single schema
[routes]
[routes.rule1]
schema-pattern = "sbtest"
target-schema = "sbtest"
[task]
output-dir = "./output"
source-instances = ["mysql1"]
target-instance = "mysql2"
target-check-tables = ["sbtest.*"]
./sync_diff_inspector --config=./config.tomlValidate multiple schemas by adding additional rules (rule2, etc.) and adjusting target-check-tables accordingly.
3.3 Table‑based validation
Validate a single table:
# routes for a specific table
[routes]
[routes.rule1]
schema-pattern = "sbtest"
target-schema = "sbtest"
[task]
output-dir = "./output"
source-instances = ["mysql1"]
target-instance = "mysql2"
target-check-tables = ["sbtest.sbtest1"]
./sync_diff_inspector --config=./config.tomlValidate multiple tables by listing them in target-check-tables or using wildcard patterns.
3.4 Range‑based validation
# additional table‑config for range
[table-configs.config1]
target-tables = ["sbtest.sbtest1"]
range = "id > 10 AND id < 100"Run the tool with the same command; only rows matching the range are compared.
3.5 Data repair
After a mismatch is detected, the tool can output SQL statements to synchronize the target table. Example repair SQL snippets are shown below:
-- table: sbtest.sbtest1
-- range in sequence: Full
REPLACE INTO `sbtest`.`sbtest1`(`id`,`k`,`c`,`pad`) VALUES (11,50148,'...','...');
... (additional REPLACE statements) ...Import the generated .sql file into the target MySQL instance, then re‑run the validation to confirm equality.
4. Common Issues
After modifying the configuration file, delete the previous output-dir to avoid checkpoint conflicts.
Use utf8mb4 character set; TiDB does not support MySQL 8.0 utf8mb3 , which leads to initialization errors.
For more details, refer to the official sync‑diff‑inspector documentation: https://docs.pingcap.com/zh/tidb/stable/sync-diff-inspector-overview
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.