Boost MyBatis-Plus Batch Inserts by 2000% with rewriteBatchedStatements and ID Pre‑Generation
By configuring MySQL’s rewriteBatchedStatements=true, pre‑generating primary keys, adjusting MyBatis executor settings, and employing asynchronous multithreaded batch inserts, this guide demonstrates how to accelerate MyBatis‑Plus saveBatch operations from seconds to milliseconds, achieving up to a 2000% performance gain in large‑scale data scenarios.
Problem Overview
Bulk inserting large numbers of rows in high‑traffic Java applications (e.g., exam systems with exam, question, option tables) often suffers from excessive round‑trips, foreign‑key lookups, and limited use of JDBC batch capabilities.
MyBatis‑Plus saveBatch Method
boolean saveBatch(Collection<T> entityList);
boolean saveBatch(Collection<T> entityList, int batchSize); entityListis any collection of entities; batchSize controls how many records are sent per batch. Without a size, all records are sent at once, which may cause memory pressure.
Limitations of the Default Implementation
SQL statements are often sent one by one, so the driver cannot merge them.
Auto‑generated primary keys require a separate query per row.
Foreign‑key relationships must be resolved after each insert.
No support for conditional inserts or upserts.
JDBC Batch Mechanism and rewriteBatchedStatements=true
JDBC allows a PreparedStatement to collect many statements and execute them together:
// Create PreparedStatement
PreparedStatement pstmt = conn.prepareStatement(sql);
for (Data d : dataList) {
pstmt.addBatch();
}
// Execute batch
pstmt.executeBatch();MySQL’s driver, when rewriteBatchedStatements=true (supported from driver 5.1.13), rewrites multiple identical INSERTs into a single multi‑value INSERT, reducing network trips and parsing overhead.
-- Without rewrite
INSERT INTO question (exam_id, content) VALUES (?, ?);
INSERT INTO question (exam_id, content) VALUES (?, ?);
INSERT INTO question (exam_id, content) VALUES (?, ?);
-- With rewrite
INSERT INTO question (exam_id, content) VALUES (?, ?), (?, ?), (?, ?);Pre‑Generating Primary Keys
When child rows (e.g., option) reference a parent row ( question), waiting for the database to generate the parent ID forces a sequential workflow. Using a distributed ID generator (e.g., zzidc) eliminates this round‑trip.
<!-- Maven dependency for zzidc -->
<dependency>
<groupId>com.bj58.zhuanzhuan.idc</groupId>
<artifactId>contract</artifactId>
<version>${com.bj58.zhuanzhuan.idc.version}</version>
</dependency> public Boolean createExamPaper(HeroExamRequest<ExamPaperRequest> request) {
// Build question with pre‑generated ID
Question q = new Question();
q.setId(questionId);
q.setExamId(examId);
// Build option referencing the same ID
Option o = new Option();
o.setQuestionId(questionId);
// ... persist ...
}Comprehensive Optimization Example
Configuration
Add the parameter to the JDBC URL, e.g.:
jdbc:mysql://localhost:3306/db_name?rewriteBatchedStatements=trueService Implementation
@Service
public class ExamServiceImpl implements ExamService {
@Autowired private ExamMapper examMapper;
@Autowired private QuestionService questionService;
@Autowired private OptionService optionService;
private static final int BATCH_SIZE = 2000;
@Transactional(rollbackFor = Exception.class)
public void createExam(Exam exam, int questionCount, int optionCountPerQuestion) {
long examId = zzidc.nextId();
exam.setId(examId);
examMapper.insert(exam);
List<Question> questionList = new ArrayList<>();
List<Option> optionList = new ArrayList<>();
for (int i = 0; i < questionCount; i++) {
long questionId = zzidc.nextId();
Question q = new Question();
q.setId(questionId);
q.setExamId(examId);
q.setContent("Question " + i);
questionList.add(q);
for (int j = 0; j < optionCountPerQuestion; j++) {
Option o = new Option();
o.setQuestionId(questionId);
o.setContent("Option " + j);
optionList.add(o);
}
}
questionService.saveBatch(questionList, BATCH_SIZE);
optionService.saveBatch(optionList, BATCH_SIZE);
}
}Performance Test
Unoptimized (row‑by‑row): ~4023 ms
Only saveBatch (no rewrite, no pre‑ID): ~2744 ms (≈30 % faster)
Full optimization (rewrite + pre‑ID + batch): ~149 ms (≈2700 % faster)
Concurrent Insertion with Spring @Async
Thread‑Pool Configuration
@Configuration
@EnableAsync
public class AsyncConfig {
@Bean(name = "taskExecutor")
public Executor taskExecutor() {
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(4);
executor.setMaxPoolSize(8);
executor.setQueueCapacity(100);
executor.setThreadNamePrefix("AsyncExecutor-");
executor.initialize();
return executor;
}
}Async Batch Method
@Service
public class QuestionServiceImpl implements QuestionService {
@Autowired private QuestionMapper questionMapper;
private static final int BATCH_SIZE = 2000;
@Async("taskExecutor")
@Transactional(rollbackFor = Exception.class)
public CompletableFuture<Void> saveBatchAsync(List<Question> batch) {
saveBatch(batch, BATCH_SIZE);
return CompletableFuture.completedFuture(null);
}
}Calling the Async Methods
public void createExam(Exam exam, int qCount, int optPerQ) {
// ...prepare data as in the previous example...
List<List<Question>> qBatches = Lists.partition(questionList, BATCH_SIZE);
List<List<Option>> oBatches = Lists.partition(optionList, BATCH_SIZE);
List<CompletableFuture<Void>> futures = new ArrayList<>();
for (List<Question> batch : qBatches) {
futures.add(questionService.saveBatchAsync(batch));
}
for (List<Option> batch : oBatches) {
futures.add(optionService.saveBatchAsync(batch));
}
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
}Key Points
Each async call gets its own SqlSession and transaction, ensuring thread safety.
Collect CompletableFuture results to guarantee that all batches finish before proceeding.
Database‑Level Tuning
Connection‑Pool Adjustment
Increase pool size to match the level of concurrency.
Balance min/max settings to avoid idle resources or exhaustion.
MyBatis Executor
Set the default executor type to BATCH in mybatis-config.xml (requires manual sqlSession.flushStatements() handling):
<configuration>
<settings>
<setting name="defaultExecutorType" value="BATCH"/>
</settings>
</configuration>Monitoring & Tuning
Async Task Monitoring
@Async("taskExecutor")
@Transactional(rollbackFor = Exception.class)
public CompletableFuture<Void> saveBatchAsync(List<Question> list) {
long start = System.currentTimeMillis();
saveBatch(list, BATCH_SIZE);
long end = System.currentTimeMillis();
logger.info("Inserted batch of {} questions in {} ms", list.size(), (end - start));
return CompletableFuture.completedFuture(null);
}Thread‑Pool Parameter Tuning
Set corePoolSize and maxPoolSize based on CPU cores and DB capacity.
Configure queueCapacity to prevent OOM under heavy load.
Best‑Practice Summary
Enable rewriteBatchedStatements=true to let the MySQL driver merge INSERT statements.
Pre‑generate primary keys (e.g., with zzidc) to break foreign‑key dependencies.
Use Spring @Async with a dedicated thread pool for safe concurrent batch inserts.
Adjust the database connection pool and set MyBatis executor to BATCH for maximum throughput.
Log and monitor async tasks to detect and resolve bottlenecks early.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Architect
Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
