Optimizing Database Insert Operations and Batch Insertion Strategies
The article explains how inserts work in relational databases, why fixed‑size pages and batch transactions dramatically reduce I/O and lock contention, and provides practical MyBatis guidelines—using the foreach tag, ExecutorType.BATCH, and an appropriate batchSize while estimating memory and disk limits to choose an optimal batch size and commit only once.
In the era of massive data, choosing the right amount of data to insert in each batch is crucial for database performance.
The article first explains the fundamentals of how inserts work: data is first written to an in‑memory cache, then flushed to disk, and transaction logs (WAL) guarantee durability.
It discusses why databases use fixed‑size pages (e.g., 4KB, 8KB) and how page‑level I/O reduction improves throughput.
Single‑row inserts incur a transaction overhead for each row, while batch inserts group many rows into a single transaction, dramatically reducing I/O and lock contention.
Key factors for determining an optimal batch size include disk I/O capacity, available memory, transaction size, and lock strategy.
To estimate the maximum number of rows that can be inserted, the article walks through a sample record schema (int, varchar, date, float) and calculates average and maximum record sizes, then shows memory‑based and disk‑based capacity calculations.
Practical MyBatis recommendations are provided: use the <foreach> tag in mapper XML for bulk inserts, enable ExecutorType.BATCH in the SqlSession, and set an appropriate batchSize to avoid Out‑Of‑Memory errors.
<insert id='insertMultiple' parameterType='list'>
INSERT INTO tableName (column1, column2, ...)
VALUES
<foreach collection='list' item='record' separator=','>
(#{record.column1}, #{record.column2}, ...)
</foreach>
</insert>Example of opening a batch session:
SqlSession session = sqlSessionFactory.openSession(ExecutorType.BATCH);Finally, the article advises to minimize session commits during bulk operations and to commit once after all data is inserted.
These guidelines help developers design efficient, stable bulk‑insert pipelines for relational databases.
Java Tech Enthusiast
Sharing computer programming language knowledge, focusing on Java fundamentals, data structures, related tools, Spring Cloud, IntelliJ IDEA... Book giveaways, red‑packet rewards and other perks await!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.