How to Achieve 570,000 Writes per Second in MySQL with TokuDB
This article explains how to load over 20 billion rows into MySQL using the XeLabs‑compiled TokuDB engine, detailing configuration tweaks, table schema, bulk‑loader commands, benchmark results that reach 574 k rows per second, and practical tips for handling auto‑increment keys.
Requirement
Load >2 billion rows from a big‑data platform into MySQL for next‑day reporting, requiring very high write throughput.
Re‑analysis and storage‑engine choice
InnoDB can sustain 100‑150 k inserts /s when data fits in memory, but performance drops sharply when data exceeds memory. XeLabs‑compiled TokuDB was evaluated for better write characteristics.
XeLabs TokuDB overview
Project repository: https://github.com/XeLabs/tokudb
Built‑in jemalloc memory allocator
Additional TokuDB performance metrics
Supports XtraBackup for backups
Integrates ZSTD compression
Enables binlog_group_commit
Test table and configuration
loose_tokudb_cache_size=4G
loose_tokudb_directio=ON
loose_tokudb_fsync_log_period=1000
tokudb_commit_sync=0 CREATE TABLE user_summary (
user_id BIGINT(20) UNSIGNED NOT NULL COMMENT '用户id/手机号',
weight VARCHAR(5) DEFAULT NULL COMMENT '和码体重(KG)',
level VARCHAR(20) DEFAULT NULL COMMENT '重量级',
beat_rate VARCHAR(12) DEFAULT NULL COMMENT '击败率',
level_num INT(10) DEFAULT NULL COMMENT '同吨位人数',
UNIQUE KEY u_user_id (user_id)
) ENGINE=TokuDB DEFAULT CHARSET=utf8;Data loading and benchmark
LOAD DATA INFILE '/u01/work/134-136.txt'
INTO TABLE user_summary
(user_id, weight, level, beat_rate, level_num);200 million rows were loaded in 5 min 48.30 s → 574 217 rows/s. Source file 8.5 GB, resulting TokuDB files 3.5 GB (~40 % compression). Extrapolation suggests ~58 minutes for 2 billion rows, meeting the requirement. In a comparable InnoDB test on similar hardware, the same workload took 3–4 × longer.
Effect of auto‑increment primary key
When an AUTO_INCREMENT column is added, the bulk‑loader cannot be used; loading 200 million rows took 22 min 43 s, dramatically slower.
Recommendations
If an AUTO_INCREMENT column is present, consider removing it and using a UNIQUE index to retain bulk‑loader performance.
Bulk‑loader works on empty tables or when the AUTO_INCREMENT column already contains values.
When maximum speed is required, compression settings may be relaxed.
Test environment
CentOS 7, XeLabs‑compiled TokuDB (GitHub link above), 8 CPU cores, 8 GB RAM, 500 GB high‑performance SSD.
Reference for TokuDB bulk loader: https://github.com/percona/PerconaFT/wiki/TokuFT-Bulk-Loader
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
