Implementing a MySQL Data Consistency Check Tool Based on pt-table-checksum
This article describes the design, implementation, and usage of a custom MySQL data consistency checking tool that extends pt-table-checksum, handling composite primary keys, network throttling, scheduled execution, and detailed logging for both successful and failed verification scenarios.
Requirement introduction: two MySQL clusters synced via MQ experienced primary‑key conflicts, prompting a need to verify data consistency.
Background: The author initially planned to use pt‑table‑checksum via DSN but discovered limitations, leading to a custom implementation that iterates tables by data blocks and compares checksums.
Testing DSN: A master‑slave environment was set up using MySQL 5.7.26 and Percona‑toolkit‑3.2.1. Example command shows successful checksum execution on a replica:
$ ./bin/pt-table-checksum h='192.168.1.1',u='mydba',p='test123456',P=6666 --nocheck-replication-filters --replicate=test.checksums --no-check-binlog-format -d dbatest1 --recursion-method dsn=t=percona_schema.dsnsNon‑master‑slave case: Stopping replication caused pt‑table‑checksum to fail, highlighting the need for a more flexible tool.
Development challenges: Handling composite primary keys, determining data‑block boundaries, and generating appropriate WHERE clauses for integer and string keys. The solution gathers primary‑key columns, detects their data types, and builds dynamic query strings, e.g.:
pri_name=($(mysql_comm -NBe "select COLUMN_NAME from information_schema.columns where table_name='table' and table_schema='db' and COLUMN_KEY='PRI';"))
for tmp in ${pri_name[@]}; do
__="select DATA_TYPE from information_schema.columns where table_schema='db' and table_name='table' and COLUMN_KEY='PRI' and COLUMN_NAME='${tmp}'"
pri_type["${tmp}"]="$(mysql_comm -NBe "${__}" 2>/dev/null)"
doneTool enhancements: Removed FOR UPDATE locking, added table‑structure verification, parallel table processing, network monitoring with flow‑threshold pausing, scheduled execution windows, support for arbitrary MySQL‑compatible nodes, timeout and self‑kill mechanisms, and logging of inconsistent blocks.
Usage guide: Installation steps, configuration file editing, directory layout, command‑line options for database/table selection, concurrency, time windows, and execution flags. Example commands illustrate starting network monitoring and running the checksum script:
bash manager.sh -a start -t eth0 -n 30 bash start.sh -d dba -t dbatest1 -f trueTest cases: Demonstrations of successful and failed checks, log formats, directory contents (info.log, list, md5, pri, res), and interpretation of diff and skip logs.
Conclusion: The tool extends pt‑table‑checksum functionality for broader scenarios but is intended for learning and should be thoroughly tested before production use.
Zhuanzhuan Tech
A platform for Zhuanzhuan R&D and industry peers to learn and exchange technology, regularly sharing frontline experience and cutting‑edge topics. We welcome practical discussions and sharing; contact waterystone with any questions.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.