Introducing and Evaluating the DBLE Split Command for Accelerated Data Import
This article explains the DBLE split feature that partitions large mysqldump files according to sharding configuration, provides command syntax and usage examples, and presents a performance test showing that split‑based import can be up to 18 times faster than direct DBLE import while preserving data correctness.
The DBLE split function acts as an import accelerator by dividing a large mysqldump file into multiple shard-specific dump sub-files based on the sharding configuration defined in sharding.xml. After splitting, each sub-file can be imported directly into its corresponding backend MySQL instance, after which DBLE metadata is reloaded.
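The core idea can be pictured in a few lines. The following Python snippet is an illustrative toy, not DBLE's implementation: it routes each row of a sharded table to a per-shard bucket with a simple modulo rule on the sharding column, the way a hash-based sharding rule in sharding.xml would (the shard count and function names here are assumptions for illustration).

```python
# Toy illustration of shard-splitting dump rows (not DBLE's actual code).
# Rows are routed to per-shard buckets by a modulo rule on the sharding
# column, mirroring a simple hash rule configured in sharding.xml.

SHARD_COUNT = 2  # assumed number of backend shards


def route(sharding_key: int) -> int:
    """Pick a shard index for a row, as a modulo sharding rule would."""
    return sharding_key % SHARD_COUNT


def split_rows(rows):
    """Group (id, value) rows into per-shard lists, one list per sub-file."""
    shards = {i: [] for i in range(SHARD_COUNT)}
    for row in rows:
        shards[route(row[0])].append(row)
    return shards


if __name__ == "__main__":
    rows = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
    # Even ids land in shard 0, odd ids in shard 1.
    print(split_rows(rows))
```

In the real tool, each bucket is written out as a shard-specific dump sub-file ready to be loaded into its backend MySQL instance.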
Basic Usage
To use the split command, log into the DBLE management port (9066) and execute the following syntax:
mysql > split src dest [-sschema] [-r500] [-w512] [-l10000] [--ignore] [-t2]

Parameters:
src : original dump file name
dest : directory for generated dump files
-s : default logical database name when the dump lacks schema statements
-r : read queue size (default 500)
-w : write queue size (default 512, must be a power of two)
-l : maximum number of VALUES per INSERT for sharded tables (default 4000)
--ignore : ignore duplicate rows on insert
-t : thread pool size for concurrent processing
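To make the -l option concrete, here is a small hedged Python sketch (the function name and row format are ours, not DBLE's) showing how rows can be batched into multi-row INSERT statements with at most l value tuples each:

```python
# Illustrative only: batch rows into multi-row INSERTs with at most
# `limit` VALUES tuples per statement, which is what -l bounds.


def batch_inserts(table, rows, limit=4000):
    """Yield INSERT statements containing at most `limit` value tuples."""
    for start in range(0, len(rows), limit):
        chunk = rows[start:start + limit]
        values = ", ".join(f"({r[0]}, '{r[1]}')" for r in chunk)
        yield f"INSERT INTO {table} VALUES {values};"


if __name__ == "__main__":
    rows = [(i, f"v{i}") for i in range(5)]
    # With limit=2, five rows produce three INSERT statements.
    for stmt in batch_inserts("t1", rows, limit=2):
        print(stmt)
```

A smaller -l produces more, shorter INSERT statements; a larger value reduces statement overhead at the cost of bigger packets on import.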
Example commands:
mysql > split /path-to-mysqldump-file/mysqldump.sql /tmp/dump-dir-name;
mysql > split /path-to-mysqldump-file/mysqldump.sql /tmp/dump-dir-name -sdatabase1;
mysql > split /path-to-mysqldump-file/mysqldump.sql /tmp/dump-dir-name -sdatabase1 -r600;
mysql > split /path-to-mysqldump-file/mysqldump.sql /tmp/dump-dir-name -sdatabase1 -r600 -w1024;
mysql > split /path-to-mysqldump-file/mysqldump.sql /tmp/dump-dir-name -sdatabase1 -r600 -w1024 -l10000;
mysql > split /path-to-mysqldump-file/mysqldump.sql /tmp/dump-dir-name -sdatabase1 -r600 -w1024 -l10000 --ignore;
mysql > split /path-to-mysqldump-file/mysqldump.sql /tmp/dump-dir-name -sdatabase1 -r600 -w1024 -l10000 --ignore -t4;

Performance Evaluation
A three‑group experiment compared import times and data integrity:
Control Group 1: Direct MySQL import without DBLE – 13,181 s.
Control Group 2: Direct DBLE import without splitting – 50,883 s.
Experiment Group: Split + concurrent import to backend MySQL – 2,751 s total (912 s to split + 1,839 s to import).
All groups produced identical row counts for the 10 benchmark tables, and checksum values matched between Control Group 2 and the Experiment Group, confirming data correctness.
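The headline speedup ratios follow directly from the measured wall times; a quick arithmetic check:

```python
# Verify the quoted speedup ratios from the measured wall times.
direct_mysql = 13181  # s, Control Group 1: direct MySQL import
direct_dble = 50883   # s, Control Group 2: direct DBLE import
split_total = 2751    # s, Experiment Group: 912 s split + 1,839 s import

print(round(direct_mysql / split_total, 1))  # ≈ 4.8, i.e. ~5x faster
print(round(direct_dble / split_total, 1))   # ≈ 18.5, i.e. ~18x faster
```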
Results
Import speed: Split‑based import is ~5× faster than direct MySQL import and ~18× faster than direct DBLE import, achieving roughly 98 GB/h.
Data integrity: No differences in row counts or checksums, demonstrating that the split process does not lose data.
Conclusion
When the split command runs on a capable machine and the backend MySQL servers have sufficient capacity, import performance can be improved dramatically. Strategies such as increasing the shard count, choosing a sharding algorithm that distributes data evenly, or transferring the dump sub-files to the backend hosts before importing can boost speed further. Limitations include the lack of support for explicitly configured child tables and views, and potential issues with global sequence tables.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise-grade MySQL open-source tools and services, releases a premium open-source component each year on October 24 ("1024"), and continuously operates and maintains them.