How to Load 2 Billion Rows into MySQL Fast with TokuDB – 570k Rows/s Benchmark

This article describes a real‑world test of loading more than 2 billion records from a big‑data platform into MySQL using XeLabs TokuDB, covering the configuration, the performance results, and practical tips for reaching up to 570k rows per second on a cloud instance.

Java Backend Technology

Requirement

A friend needed to load more than 2 billion rows from a big‑data platform into MySQL for next‑day business reporting.

Implementation Analysis

For a single MySQL table, insert throughput can reach 100k‑150k rows/s when the data fits in memory, but in many projects the data set is larger than the available memory. XeLabs TokuDB was therefore tested as an alternative.
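
As a rough back-of-the-envelope check, the sketch below estimates how long 2 billion rows would take at the quoted 100k‑150k rows/s. It assumes a constant insert rate and exactly 2 billion rows, both simplifications; the function name is illustrative and not from the original test.

```python
def load_time_hours(total_rows: int, rows_per_second: float) -> float:
    """Estimate wall-clock load time in hours at a constant insert rate."""
    if rows_per_second <= 0:
        raise ValueError("rows_per_second must be positive")
    return total_rows / rows_per_second / 3600


TOTAL_ROWS = 2_000_000_000  # assumption: exactly 2 billion rows

for rate in (100_000, 150_000):
    hours = load_time_hours(TOTAL_ROWS, rate)
    print(f"{rate:,} rows/s -> {hours:.1f} hours")
```

Under these assumptions the load would take roughly 3.7 to 5.6 hours, and that is the favorable case where the data fits in memory.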

XeLabs TokuDB Overview

Project address: https://github.com/XeLabs/tokudb

Built‑in jemalloc memory allocator

Additional TokuDB performance metrics

Supports Xtrabackup backup

Integrates ZSTD compression algorithm

Supports TokuDB binlog_group_commit feature

Test Table

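The original article showed the following as screenshots, which are not preserved here: the TokuDB core configuration, the table schema, the LOAD DATA statement used to load the data, and the calculated write speed.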

File size comparison: the original file was 8.5 GB and the TokuDB file was 3.5 GB (≈40% of the original). Loading 2 billion rows completed in about 58 minutes, which met the requirement. In comparable tests, TokuDB loaded data 3‑4× faster than InnoDB.
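
The write speed and size ratio can be re-derived from the figures reported above. This is only a sketch of the arithmetic: it treats "2 billion rows" and "about 58 minutes" as exact values, which is an assumption.

```python
TOTAL_ROWS = 2_000_000_000  # "2 billion rows", taken as an exact count (assumption)
LOAD_MINUTES = 58           # "about 58 minutes", taken as exact (assumption)
SOURCE_GB = 8.5             # size of the original data file
TOKUDB_GB = 3.5             # size of the resulting TokuDB file

# Average throughput over the whole load
rows_per_second = TOTAL_ROWS / (LOAD_MINUTES * 60)

# On-disk size relative to the source file
size_ratio = TOKUDB_GB / SOURCE_GB

print(f"throughput: {rows_per_second:,.0f} rows/s")
print(f"TokuDB file is {size_ratio:.0%} of the original")
```

The result is roughly 575k rows/s and a size ratio of about 41%, in line with the 570k rows/s and ≈40% figures quoted in the article.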

Figure (not preserved): file size comparison between the original file and the TokuDB file.

Test Conclusions

In a cloud environment with 8 CPU cores, 8 GB of RAM, and a 500 GB high‑speed cloud disk, TokuDB consistently reached about 570k rows per second.

If the table has an auto‑increment primary key, TokuDB's bulk loader cannot be used and the load falls back to what is effectively single‑row inserts, which is much slower. If the auto‑increment column already has values in the source data, consider removing the auto‑increment attribute and using a unique index instead, which reduces overhead and improves speed. Compression may also be less effective during bulk loading.
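
One reading of that advice is to redefine the column without AUTO_INCREMENT and swap the primary key for a unique index before loading. The sketch below only builds such a statement as a string. The table name `user_summary`, the column `id`, the `BIGINT` type, and the helper itself are hypothetical, since the original schema is not preserved, so adapt and test on a copy first.

```python
def drop_auto_increment_sql(table: str, column: str, column_type: str = "BIGINT") -> str:
    """Build an ALTER TABLE statement that redefines `column` without
    AUTO_INCREMENT and replaces the primary key with a unique index.

    Only the table and column names are validated; column_type is
    inserted as given and must come from trusted input.
    """
    for name in (table, column):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"ALTER TABLE `{table}` "
        # MODIFY restates the full column definition, so leaving out
        # AUTO_INCREMENT removes that attribute.
        f"MODIFY `{column}` {column_type} NOT NULL, "
        f"DROP PRIMARY KEY, "
        f"ADD UNIQUE KEY `uk_{column}` (`{column}`);"
    )


# Hypothetical table and column names, for illustration only
print(drop_auto_increment_sql("user_summary", "id"))
```

Whether the bulk loader is then actually used depends on the TokuDB version and the table state, so the load speed should be measured again after such a change.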

Reference for TokuDB Bulk Loader: https://github.com/percona/PerconaFT/wiki/TokuFT-Bulk-Loader

Test Environment

Tests were performed on CentOS 7. The XeLabs TokuDB build was obtained from Baidu Cloud (link: https://pan.baidu.com/s/1qYRyH3I).

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
