How to Import 1 Billion Records into MySQL Efficiently: Design, Sharding, and Performance Tips
This article explains how to import 1 billion 1 KB log records stored in HDFS/S3 into MySQL by analyzing single‑table limits, proposing sharding into multiple tables, choosing the right storage engine, using batch inserts, controlling concurrency, and ensuring ordered, reliable processing with Redis and Redisson.
Problem Statement and Constraints
We need to import 1 billion rows (≈1 KB each) of unstructured log data stored in HDFS or S3. The data is already split into roughly 100 files; each file is internally ordered and must be written to MySQL while preserving order and avoiding duplicates.
Each row is 1 KB.
Data is unstructured log that must be parsed before insertion.
Source resides in HDFS or S3.
Files are split into 100 parts with sequential suffixes.
Import must be ordered and preferably deduplicated.
Target database is MySQL.
Can a Single MySQL Table Hold 1 Billion Rows?
Not efficiently. A practical limit for a single InnoDB table is around 20 million rows (≈2 × 10⁷). The limit comes from the B+‑tree index depth: with a 16 KB leaf page and a 1 KB row, each leaf holds 16 rows; a non‑leaf page (also 16 KB) can hold about 1 170 child pointers (16 KB / 14 B per entry, assuming an 8 B bigint key plus a 6 B page pointer). A three‑level tree therefore addresses about 1 170 × 1 170 × 16 ≈ 21.9 million rows; beyond three index levels, every lookup costs an extra page read and performance degrades sharply.
Therefore we design the schema around 10 million rows per table ("1 KW", i.e., 千万 in the original), giving 1 billion / 10 million = 100 tables for the full dataset.
Efficient Writing Strategies
Single‑row inserts are slow. Use batch inserts (e.g., 100 rows per batch). InnoDB guarantees atomicity for a batch within a transaction, so either all rows are written or none.
Retry failed batches; after N retries, fall back to single‑row inserts, log the failures, and discard if necessary.
Write rows in primary‑key order; avoid non‑primary indexes during the load. Create secondary indexes after the bulk load completes.
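A minimal sketch of batched insertion over JDBC, with a pure helper that splits rows into fixed‑size batches (table and column names such as log_table, id, and content are hypothetical; batch size and retry policy would come from configuration):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class BatchInserter {

    // Split rows into fixed-size batches (pure helper, easy to unit test).
    public static <T> List<List<T>> partition(List<T> rows, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += batchSize) {
            batches.add(rows.subList(i, Math.min(i + batchSize, rows.size())));
        }
        return batches;
    }

    // Insert one batch inside a single transaction: all rows commit or none.
    // "log_table", "id" and "content" are illustrative names.
    public static void insertBatch(Connection conn, List<String[]> batch) throws SQLException {
        String sql = "INSERT INTO log_table (id, content) VALUES (?, ?)";
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (String[] row : batch) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch();
            }
            ps.executeBatch();
            conn.commit();
        } catch (SQLException e) {
            conn.rollback(); // caller retries, then falls back to single-row inserts
            throw e;
        }
    }
}
```

On a batch failure the transaction is rolled back, so the caller can safely retry the whole batch without creating partial writes.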
Should We Write Concurrently to the Same Table?
Do not. Concurrent writes to the same table break ordering and cause index contention. Instead, increase the batch size to achieve comparable throughput with a single writer per table.
Choosing the MySQL Storage Engine
MyISAM offers higher raw insert speed but lacks transaction support, making it unsuitable for reliable bulk loads. InnoDB with innodb_flush_log_at_trx_commit set to 0 or 2 can approach MyISAM performance while still providing ACID guarantees (with a possible 1‑second data loss window on a crash).
Do We Need Database Sharding?
Single‑instance MySQL typically caps at ~5 K TPS. With SSDs the limit is higher; HDDs suffer from head contention when multiple tables are written concurrently. Therefore we design a flexible sharding scheme that can configure the number of databases and the number of tables written in parallel.
Configurable number of databases.
Configurable number of tables per database (e.g., limit to one table per DB on HDD).
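A minimal sketch of such a configuration object (all names are hypothetical; the actual values would be tuned per deployment and storage type):

```java
public class ShardingConfig {
    public final int databaseCount;      // number of MySQL databases
    public final int tablesPerDatabase;  // tables written in parallel per DB

    public ShardingConfig(int databaseCount, int tablesPerDatabase) {
        if (databaseCount <= 0 || tablesPerDatabase <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        this.databaseCount = databaseCount;
        this.tablesPerDatabase = tablesPerDatabase;
    }

    // On HDD, restrict each database to one concurrently written table
    // to avoid disk-head contention.
    public static ShardingConfig forHdd(int databaseCount) {
        return new ShardingConfig(databaseCount, 1);
    }

    // On SSD, several tables per database can be written in parallel.
    public static ShardingConfig forSsd(int databaseCount, int tablesPerDatabase) {
        return new ShardingConfig(databaseCount, tablesPerDatabase);
    }
}
```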
Ensuring Ordered Writes
Use a composite primary key built from the file suffix and the line number (e.g., index_90.txt → database_9.table_0). This guarantees that rows from the same file go to the same table and that the global order can be reconstructed from the table and file identifiers.
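With 100 files (suffixes 00–99) mapped onto 10 databases × 10 tables, the routing from the example (index_90.txt → database_9.table_0) can be sketched as below; the digit‑based layout and the key format are illustrative assumptions:

```java
public class ShardRouter {

    // Map a file suffix (0–99) to "database_<d>.table_<t>":
    // the tens digit selects the database, the ones digit the table.
    public static String route(int fileSuffix) {
        if (fileSuffix < 0 || fileSuffix > 99) {
            throw new IllegalArgumentException("suffix out of range: " + fileSuffix);
        }
        int db = fileSuffix / 10;
        int table = fileSuffix % 10;
        return "database_" + db + ".table_" + table;
    }

    // Composite primary key: file suffix + line number. Sorting on this key
    // reconstructs the global order, and re-inserting a line is idempotent.
    public static String primaryKey(int fileSuffix, long lineNumber) {
        return String.format("%02d_%010d", fileSuffix, lineNumber);
    }
}
```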
Fast File Reading
Reading a 10 GB file cannot be done in one shot. Options include:
Files.readAllBytes – loads the entire file into memory (not feasible for 10 GB).
FileReader + BufferedReader – buffered line‑by‑line reading.
Scanner – line‑by‑line reading.
Java NIO FileChannel with a fixed‑size ByteBuffer.
Benchmarks on macOS show FileChannel is fastest, but it reads fixed‑size byte chunks and therefore does not preserve line boundaries. For simplicity and acceptable performance (≈30 s for a 10 GB file), BufferedReader is chosen.
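A minimal sketch of the chosen BufferedReader approach, flushing every batchSize lines to a sink (the sink stands in for the batch insert; all names are illustrative):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchedLineReader {

    // Read lines and hand them to the sink in batches of batchSize.
    // Returns the total number of lines read.
    public static long readInBatches(BufferedReader reader, int batchSize,
                                     Consumer<List<String>> sink) throws IOException {
        List<String> batch = new ArrayList<>(batchSize);
        long total = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            batch.add(line);
            total++;
            if (batch.size() == batchSize) {
                sink.accept(new ArrayList<>(batch)); // flush a full batch
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(batch); // flush the final partial batch
        }
        return total;
    }
}
```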
The FileChannel timing benchmark (reading in 1 MB chunks):

File file = new File("/xxx.zip");
long now = System.currentTimeMillis();
try (FileInputStream fileInputStream = new FileInputStream(file);
     FileChannel fileChannel = fileInputStream.getChannel()) {
    int capacity = 1 * 1024 * 1024; // 1 MB buffer
    ByteBuffer byteBuffer = ByteBuffer.allocate(capacity);
    long size = 0;
    int read;
    // read() returns the number of bytes actually read, or -1 at end of file
    while ((read = fileChannel.read(byteBuffer)) != -1) {
        size += read;
        byteBuffer.clear(); // reset the buffer for the next read
    }
    System.out.println("file size:" + size);
} catch (IOException e) { // FileNotFoundException is a subclass of IOException
    e.printStackTrace();
}
System.out.println("Time:" + (System.currentTimeMillis() - now));

Task Reliability and Progress Tracking
If a reader crashes or the DB becomes unavailable, we must resume without duplication. The primary key (file suffix + line number) provides idempotency.
Use Redis to store the current offset for each task with INCRBY KEY_NAME INCR_AMOUNT, e.g., INCRBY task_offset_5 100. On batch failure, retry; after repeated failures, fall back to single‑row inserts and update Redis accordingly. Optionally, consume the MySQL binlog to keep Redis in sync with what was actually committed.
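The resume logic can be sketched with an in‑memory map standing in for Redis (in production, INCRBY on task_offset_<n> replaces the map; all class and method names here are illustrative):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OffsetTracker {

    // Stand-in for Redis: key "task_offset_<taskId>" -> lines already committed.
    private final Map<String, Long> store = new HashMap<>();

    public static String offsetKey(int taskId) {
        return "task_offset_" + taskId;
    }

    public long getOffset(int taskId) {
        return store.getOrDefault(offsetKey(taskId), 0L);
    }

    // Equivalent of INCRBY, called after a batch commits.
    public long advance(int taskId, long committedLines) {
        return store.merge(offsetKey(taskId), committedLines, Long::sum);
    }

    // On restart, skip lines the offset says were already written; the
    // (file suffix, line number) primary key makes any duplicate inserts
    // fail harmlessly even if the offset lags behind the database.
    public List<String> remaining(int taskId, List<String> lines) {
        int done = (int) Math.min(getOffset(taskId), lines.size());
        return lines.subList(done, lines.size());
    }
}
```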
Coordinating Task Concurrency
Each node polls a task table and attempts to acquire a Redisson semaphore keyed by the database ID. The semaphore limits the number of concurrent writes per DB.
Config config = new Config(); // hypothetical single-server setup
config.useSingleServer().setAddress("redis://127.0.0.1:6379");
RedissonClient redissonClient = Redisson.create(config);
RSemaphore rSemaphore = redissonClient.getSemaphore("semaphore");
// Set concurrency to 1
rSemaphore.trySetPermits(1);
rSemaphore.tryAcquire(); // non-blocking acquire

If a semaphore is not released because a node crashes, a timeout can be set, but Redisson does not support lease renewal for semaphores. As an alternative, elect a master node via ZooKeeper + Curator to assign tasks and hold a distributed lock that can be renewed.
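The per‑database limit can be illustrated with a JDK java.util.concurrent.Semaphore standing in for the distributed Redisson RSemaphore (same acquire/release contract, but process‑local; class and method names are illustrative):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

public class DbWriteLimiter {

    // One semaphore per database ID; permits = max concurrent writers per DB.
    private final ConcurrentHashMap<String, Semaphore> semaphores = new ConcurrentHashMap<>();
    private final int permitsPerDb;

    public DbWriteLimiter(int permitsPerDb) {
        this.permitsPerDb = permitsPerDb;
    }

    // Non-blocking attempt to become a writer for the given database.
    public boolean tryAcquire(String dbId) {
        return semaphores
                .computeIfAbsent(dbId, id -> new Semaphore(permitsPerDb))
                .tryAcquire();
    }

    // Release the permit once the task finishes (or fails).
    public void release(String dbId) {
        Semaphore s = semaphores.get(dbId);
        if (s != null) {
            s.release();
        }
    }
}
```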
Overall Design Summary
Clarify interview constraints before designing the solution.
Split the 1 billion rows into multiple databases/tables (≈100 tables) to avoid single‑table limits.
Use batch inserts with configurable size; test thresholds (100, 1 000, 10 000) to find the optimal value.
Prefer InnoDB with innodb_flush_log_at_trx_commit set to 0/2 for speed, but fall back to MyISAM if the cluster forbids changing the setting.
Limit per‑database concurrent writes; adjust dynamically based on SSD/HDD characteristics.
Track task progress in Redis to guarantee idempotent retries.
Combine reading and writing in the same task to avoid complex Kafka pipelines.
Use a master node (Zookeeper+Curator) to coordinate task assignment and avoid semaphore timeout issues.
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn