MySQL Billion‑Row Data Migration: Export, Import, and Performance Optimization
This article documents a step‑by‑step MySQL data migration case involving 140 million rows, covering export via SELECT INTO OUTFILE, import with LOAD DATA INFILE, and a series of performance tweaks such as index removal, session variable tuning, engine selection, and monitoring to achieve sub‑10‑minute load times.
The author records a MySQL data migration of 140 million rows (≈4.3 GB) as a learning note, explaining the differences between export and import processes and why large‑scale migrations differ from typical BI report loads.
Export methods include using client tools like Navicat (≈2.5 M rows/min) and MySQL's native
SELECT ... FROM TABLE_A -- can add WHERE clause<br/>INTO OUTFILE "/path/to/file" -- export file location<br/>FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' -- field delimiter and optional quoting<br/>LINES TERMINATED BY '
'; -- line break. The author notes that field delimiters and quoting can be customized.
Import methods focus on MySQL's native
load data infile '/data/files/T_CUST_INFO.txt' -- default server‑side file<br/>ignore into table t_dq_user -- allow duplicate rows<br/>fields terminated by ',' -- comma delimiter<br/>lines terminated by '
'(CustID,DeviceNo,logintype); -- map columns. While LOAD DATA is fast, improper configuration can still cause long runtimes.
The full LOAD DATA syntax is shown, including options such as LOW_PRIORITY, CONCURRENT, LOCAL, REPLACE / IGNORE, character set, field/line delimiters, and optional IGNORE lines.
Performance bottlenecks and solutions :
Indexes on all three columns caused extreme slowdown; removing indexes before loading and recreating them afterward dramatically improved speed.
Engine choice matters: MyISAM can stream data to slaves faster than InnoDB, though InnoDB is the default and generally recommended.
For MyISAM, increase memory buffers with SET SESSION BULK_INSERT_BUFFER_SIZE = 256217728; and SET SESSION MYISAM_SORT_BUFFER_SIZE = 256217728;.
Disable unique checks during load with SET UNIQUE_CHECKS=0; and re‑enable afterward.
Transfer large files to the server via FTP to avoid network latency during load.
Monitor server CPU with top; a healthy load shows MySQL using ~110% CPU (full core), whereas a stalled load may only use ~10%.
Beware of MySQL's max transaction size (default ~150 M rows); large loads may hit this limit and cause instance restarts.
In summary, the most impactful optimization was dropping indexes before the import, which turned a multi‑hour operation into a 7‑minute load, emphasizing the importance of index management, engine settings, and system monitoring for massive data migrations.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
