How to Cut MyBatis Batch Insert Time from 5 Minutes to 3 Seconds: Three Key Optimizations
The article walks through turning a naïve MyBatis loop that took five minutes to insert 100,000 rows into a high‑performance solution that finishes in three seconds by applying batch SQL, JDBC batch mode with rewriteBatchedStatements, and multithreaded parallel execution, while highlighting pitfalls and best‑practice configurations.
Initial implementation (5 minutes)
The original code iterated over 100,000 User objects and called userMapper.insert(user) for each record, causing a network request, SQL parsing, and transaction commit per row. Assuming 3 ms per insert, the total time is about 300 seconds (5 minutes).
for (User user : userList) {
userMapper.insert(user);
}First optimization: batch SQL (30 seconds)
Replace the per‑row inserts with a single batch INSERT statement generated by MyBatis <foreach>. The list is split into batches of 1,000 rows to avoid overly long SQL.
<!-- Mapper.xml -->
<insert id="batchInsert">
INSERT INTO user (name, age, email) VALUES
<foreach collection="list" item="item" separator=",">
(#{item.name}, #{item.age}, #{item.email})
</foreach>
</insert> int batchSize = 1000;
for (int i = 0; i < userList.size(); i += batchSize) {
int end = Math.min(i + batchSize, userList.size());
List<User> batch = userList.subList(i, end);
userMapper.batchInsert(batch);
}Principle: a single SQL statement inserts many rows, reducing network round‑trips.
Second optimization: JDBC batch (8 seconds)
Enable MySQL driver’s rewriteBatchedStatements to merge multiple INSERT s into one, and use MyBatis’s batch executor.
Modify the JDBC URL
jdbc:mysql://localhost:3306/test?rewriteBatchedStatements=trueUse MyBatis batch executor
@Autowired
private SqlSessionFactory sqlSessionFactory;
public void batchInsertWithExecutor(List<User> userList) {
try (SqlSession sqlSession = sqlSessionFactory.openSession(ExecutorType.BATCH)) {
UserMapper mapper = sqlSession.getMapper(UserMapper.class);
int batchSize = 1000;
for (int i = 0; i < userList.size(); i++) {
mapper.insert(userList.get(i));
if ((i + 1) % batchSize == 0) {
sqlSession.flushStatements();
sqlSession.clearCache();
}
}
sqlSession.flushStatements();
sqlSession.commit();
}
}Principle: ExecutorType.BATCH caches statements; rewriteBatchedStatements=true merges them into a single INSERT .
Third optimization: multithreaded parallelism (3 seconds)
Further speed up by splitting the data across multiple threads (e.g., 4 threads) and executing batch inserts concurrently.
public void parallelBatchInsert(List<User> userList) {
int threadCount = 4; // adjust to connection‑pool size
int batchSize = userList.size() / threadCount;
ExecutorService executor = Executors.newFixedThreadPool(threadCount);
List<Future<?>> futures = new ArrayList<>();
for (int i = 0; i < threadCount; i++) {
int start = i * batchSize;
int end = (i == threadCount - 1) ? userList.size() : (i + 1) * batchSize;
List<User> subList = userList.subList(start, end);
futures.add(executor.submit(() -> batchInsertWithExecutor(subList)));
}
for (Future<?> f : futures) {
try { f.get(); } catch (Exception e) { throw new RuntimeException(e); }
}
executor.shutdown();
}Performance comparison
Loop single insert – 300 s (baseline)
Batch SQL – 30 s (10× faster)
JDBC batch – 8 s (37× faster)
Multithreaded parallel – 3 s (100× faster)
Pitfalls encountered
Foreach‑generated SQL can exceed max_allowed_packet; split into batches of 500‑1000 rows. rewriteBatchedStatements may not take effect if the URL parameter is missing, ExecutorType.BATCH is not used, or the MySQL driver is outdated.
When rewriteBatchedStatements=true, auto‑generated key retrieval can be unreliable; upgrade the driver to MySQL 8.0.17+.
Loading all 100 k rows at once may cause OOM; use pagination (e.g., pageSize = 10000) and batch insert each page.
Final implementation
@Service
public class BatchInsertService {
@Autowired
private SqlSessionFactory sqlSessionFactory;
public void highPerformanceBatchInsert(List<User> userList) {
if (userList == null || userList.isEmpty()) return;
int threadCount = Math.min(4, Runtime.getRuntime().availableProcessors());
int batchSize = (int) Math.ceil((double) userList.size() / threadCount);
ExecutorService executor = Executors.newFixedThreadPool(threadCount);
CountDownLatch latch = new CountDownLatch(threadCount);
for (int i = 0; i < threadCount; i++) {
int start = i * batchSize;
int end = Math.min((i + 1) * batchSize, userList.size());
if (start >= userList.size()) { latch.countDown(); continue; }
List<User> subList = new ArrayList<>(userList.subList(start, end));
executor.submit(() -> {
try { doBatchInsert(subList); } finally { latch.countDown(); }
});
}
try { latch.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
executor.shutdown();
}
private void doBatchInsert(List<User> userList) {
try (SqlSession sqlSession = sqlSessionFactory.openSession(ExecutorType.BATCH, false)) {
UserMapper mapper = sqlSession.getMapper(UserMapper.class);
for (int i = 0; i < userList.size(); i++) {
mapper.insert(userList.get(i));
if ((i + 1) % 1000 == 0) {
sqlSession.flushStatements();
sqlSession.clearCache();
}
}
sqlSession.flushStatements();
sqlSession.commit();
}
}
}Core principle: Reduce network round‑trips, reduce transaction count, and parallelize processing.
Reference
Source: https://blog.csdn.net/zhangxianhau/article/details/156057385
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
