Performance Evaluation of Inserting Billion‑Scale Data into MySQL Using Java: MyBatis vs JDBC vs Batch Processing
This article presents a detailed performance test of inserting large volumes of data into MySQL with Java. It compares three strategies: MyBatis insertion without transactions, direct JDBC with and without transactions, and JDBC batch processing with and without transactions. It reports the measured timings and closes with practical recommendations for high‑throughput data loading.
The author, a senior architect, investigates how to efficiently insert large volumes of data into MySQL for query‑performance testing, generating random person records (ID, name, gender, age, email, phone, address) and inserting up to hundreds of millions of rows.
Test data generation: Random data is created using utility methods (RandomValue.getChineseName(), getNum(), getEmail(), getTel(), getRoad()) and stored in a MySQL table named person.
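The article does not show the source of RandomValue, so the following is only a minimal sketch of what such a generator might look like. The class name RandomPersonSketch, the name and road lists, and the "@example.com" domain are all hypothetical; the value ranges (age 1 to 100, email local part of 4 to 15 characters) follow the arguments used in the article's calls.

```java
import java.util.Random;

/** Hypothetical stand-in for the article's RandomValue helper, whose source is not shown. */
final class RandomPersonSketch {

    /** One generated row; the fields mirror the columns listed in the article. */
    static final class Person {
        final long id;
        final String name;
        final String sex;
        final int age;
        final String email;
        final String tel;
        final String address;

        Person(long id, String name, String sex, int age,
               String email, String tel, String address) {
            this.id = id;
            this.name = name;
            this.sex = sex;
            this.age = age;
            this.email = email;
            this.tel = tel;
            this.address = address;
        }
    }

    // Small illustrative pools; a real generator would use much larger ones.
    private static final String[] SURNAMES = {"Li", "Wang", "Zhang", "Liu", "Chen"};
    private static final String[] GIVEN_NAMES = {"Wei", "Fang", "Min", "Jing", "Lei"};
    private static final String[] ROADS = {"Renmin Road", "Jiefang Road", "Zhongshan Road"};

    /** Returns a uniformly distributed int in [lo, hi], both ends inclusive. */
    static int between(Random rnd, int lo, int hi) {
        return lo + rnd.nextInt(hi - lo + 1);
    }

    /** Returns a string of the given length made of random lowercase letters. */
    static String letters(Random rnd, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + rnd.nextInt(26)));
        }
        return sb.toString();
    }

    /** Builds one random person with the given id. */
    static Person next(long id, Random rnd) {
        String name = SURNAMES[rnd.nextInt(SURNAMES.length)] + " "
                + GIVEN_NAMES[rnd.nextInt(GIVEN_NAMES.length)];
        String sex = rnd.nextBoolean() ? "M" : "F";
        int age = between(rnd, 1, 100);
        String email = letters(rnd, between(rnd, 4, 15)) + "@example.com";
        StringBuilder tel = new StringBuilder("1"); // 11 digits, leading 1
        for (int i = 0; i < 10; i++) {
            tel.append(rnd.nextInt(10));
        }
        String address = ROADS[rnd.nextInt(ROADS.length)] + " No. " + between(rnd, 1, 999);
        return new Person(id, name, sex, age, email, tel.toString(), address);
    }
}
```

Passing in a seeded Random (for example new Random(42)) makes a test run repeatable, which helps when comparing insertion strategies on identical data.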
Three insertion strategies:
MyBatis lightweight framework insertion (no transaction)
Direct JDBC handling (with and without transaction)
JDBC batch processing (with and without transaction)
1. MyBatis lightweight insertion (no transaction)
Code example (no transaction):
    import org.springframework.context.ApplicationContext;
    import org.springframework.context.support.ClassPathXmlApplicationContext;

    private long begin = 33112001; // start id
    private long end = begin + 100000; // rows per batch
    private String url = "jdbc:mysql://localhost:3306/bigdata?useServerPrepStmts=false&rewriteBatchedStatements=true&useUnicode=true&characterEncoding=UTF-8";
    private String user = "root";
    private String password = "0203";

    @Test
    public void insertBigData2() {
        // Load the Spring context and obtain the MyBatis mapper bean
        ApplicationContext context = new ClassPathXmlApplicationContext("applicationContext.xml");
        PersonMapper pMapper = (PersonMapper) context.getBean("personMapper");
        Person person = new Person(); // a single instance, reused for every row
        long bTime = System.currentTimeMillis();
        for (int i = 0; i < 5000000; i++) {
            person.setId(i);
            person.setName(RandomValue.getChineseName());
            person.setSex(RandomValue.name_sex);
            person.setAge(RandomValue.getNum(1, 100));
            person.setEmail(RandomValue.getEmail(4, 15));
            person.setTel(RandomValue.getTel());
            person.setAddress(RandomValue.getRoad());
            pMapper.insert(person); // one INSERT per call, no explicit transaction
            begin++;
        }
        long eTime = System.currentTimeMillis();
        System.out.println("Time to insert 5 million rows (ms): " + (eTime - bTime));
    }

Result: inserting 10,000 rows took about 28.6 seconds. Note that the loop above is written for 5 million rows, while the reported timing covers 10,000 rows.
2. Direct JDBC handling
Two variants are tested: without transaction and with transaction (auto‑commit disabled). Example of the transactional version:
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    private long begin = 33112001; // start id
    private long end = begin + 10000; // 10,000 rows per commit, matching the loop below
    private String url = "jdbc:mysql://localhost:3306/bigdata?useServerPrepStmts=false&rewriteBatchedStatements=true&useUnicode=true&characterEncoding=UTF-8";
    private String user = "root";
    private String password = "0203";

    @Test
    public void insertBigData3() {
        Connection conn = null;
        PreparedStatement pstm = null;
        try {
            // Connector/J 5.x driver class; Connector/J 8+ uses com.mysql.cj.jdbc.Driver
            Class.forName("com.mysql.jdbc.Driver");
            conn = DriverManager.getConnection(url, user, password);
            conn.setAutoCommit(false); // open an explicit transaction
            String sql = "INSERT INTO person VALUES (?,?,?,?,?,?,?)";
            pstm = conn.prepareStatement(sql);
            long bTime1 = System.currentTimeMillis();
            // 10 rounds of 10,000 rows each, one commit per round
            for (int i = 0; i < 10; i++) {
                long bTime = System.currentTimeMillis();
                while (begin < end) {
                    pstm.setLong(1, begin);
                    pstm.setString(2, RandomValue.getChineseName());
                    pstm.setString(3, RandomValue.name_sex);
                    pstm.setInt(4, RandomValue.getNum(1, 100));
                    pstm.setString(5, RandomValue.getEmail(4, 15));
                    pstm.setString(6, RandomValue.getTel());
                    pstm.setString(7, RandomValue.getRoad());
                    pstm.execute(); // one round trip per row
                    begin++;
                }
                conn.commit();
                end += 10000;
                long eTime = System.currentTimeMillis();
                System.out.println("Inserted 10,000 rows, time (ms): " + (eTime - bTime));
            }
            long eTime1 = System.currentTimeMillis();
            System.out.println("Total time to insert 100,000 rows (ms): " + (eTime1 - bTime1));
        } catch (SQLException | ClassNotFoundException e) {
            e.printStackTrace();
        } finally {
            // Release JDBC resources
            try { if (pstm != null) pstm.close(); } catch (SQLException ignored) { }
            try { if (conn != null) conn.close(); } catch (SQLException ignored) { }
        }
    }

Results:
Without transaction: ~21.2 seconds per 10,000 rows.
With transaction: ~3.9 seconds per 10,000 rows.
3. JDBC batch processing
Key points: add rewriteBatchedStatements=true to the JDBC URL so that MySQL Connector/J can rewrite batched inserts into multi‑row statements, and create the PreparedStatement once, outside the loop. Sample code:
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    private long begin = 33112001; // start id
    private long end = begin + 100000; // 100,000 rows per batch
    private String url = "jdbc:mysql://localhost:3306/bigdata?useServerPrepStmts=false&rewriteBatchedStatements=true&useUnicode=true&characterEncoding=UTF-8";
    private String user = "root";
    private String password = "0203";

    @Test
    public void insertBigData() {
        Connection conn = null;
        PreparedStatement pstm = null;
        try {
            // Connector/J 5.x driver class; Connector/J 8+ uses com.mysql.cj.jdbc.Driver
            Class.forName("com.mysql.jdbc.Driver");
            conn = DriverManager.getConnection(url, user, password);
            // conn.setAutoCommit(false); // optional: enable for the transactional variant
            String sql = "INSERT INTO person VALUES (?,?,?,?,?,?,?)";
            pstm = conn.prepareStatement(sql); // prepared once, reused for every row
            long bTime1 = System.currentTimeMillis();
            // 10 rounds of 100,000 rows each
            for (int i = 0; i < 10; i++) {
                long bTime = System.currentTimeMillis();
                while (begin < end) {
                    pstm.setLong(1, begin);
                    pstm.setString(2, RandomValue.getChineseName());
                    pstm.setString(3, RandomValue.name_sex);
                    pstm.setInt(4, RandomValue.getNum(1, 100));
                    pstm.setString(5, RandomValue.getEmail(4, 15));
                    pstm.setString(6, RandomValue.getTel());
                    pstm.setString(7, RandomValue.getRoad());
                    pstm.addBatch(); // queue the row instead of sending it
                    begin++;
                }
                pstm.executeBatch(); // send the queued rows in one go
                // conn.commit(); // if transaction is enabled
                end += 100000;
                long eTime = System.currentTimeMillis();
                System.out.println("Inserted 100,000 rows, time (ms): " + (eTime - bTime));
            }
            long eTime1 = System.currentTimeMillis();
            System.out.println("Total time to insert 1 million rows (ms): " + (eTime1 - bTime1));
        } catch (SQLException | ClassNotFoundException e) {
            e.printStackTrace();
        } finally {
            // Release JDBC resources
            try { if (pstm != null) pstm.close(); } catch (SQLException ignored) { }
            try { if (conn != null) conn.close(); } catch (SQLException ignored) { }
        }
    }

Results:
Batch without transaction: ~2.1 seconds per 100,000 rows.
Batch with transaction: ~1.9 seconds per 100,000 rows.
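To make the effect of rewriteBatchedStatements=true more concrete: with this flag, Connector/J can combine a batch of single‑row INSERT ... VALUES statements into multi‑row INSERT statements, so far fewer statements cross the network. The sketch below only illustrates the shape of such a statement. It is not the driver's implementation, and the class and method names are hypothetical.

```java
/** Illustrative only: shows the shape of a multi-row INSERT, not how Connector/J builds it. */
final class MultiRowInsertSketch {

    /**
     * Builds a statement such as "INSERT INTO t VALUES (?,?),(?,?)"
     * with one placeholder group per row.
     */
    static String multiRowInsert(String table, int columns, int rows) {
        if (columns < 1 || rows < 1) {
            throw new IllegalArgumentException("columns and rows must be positive");
        }
        // One placeholder group, e.g. "(?,?,?)"
        StringBuilder group = new StringBuilder("(");
        for (int c = 0; c < columns; c++) {
            if (c > 0) {
                group.append(',');
            }
            group.append('?');
        }
        group.append(')');

        // Repeat the group once per row, separated by commas
        StringBuilder sql = new StringBuilder("INSERT INTO ").append(table).append(" VALUES ");
        for (int r = 0; r < rows; r++) {
            if (r > 0) {
                sql.append(',');
            }
            sql.append(group);
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        // Two rows of the seven-column person table
        System.out.println(multiRowInsert("person", 7, 2));
    }
}
```

Without the flag, executeBatch() still works, but each queued row is typically sent as its own statement, which is why the URL parameter matters for the timings above.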
Overall Conclusions
When inserting a very large number of individual rows, combining JDBC batch processing with transactions yields the highest throughput. In the author's experiments, batch + transaction inserted 100 million rows in about 174 seconds, outperforming the other methods.
Additional observations:
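MyBatis without a transaction was the slowest method: 28.6 seconds per 10,000 rows, against 2.1 seconds per 100,000 rows for JDBC batch without a transaction, which is roughly 136× slower per row.
Direct JDBC without a transaction was about 5× slower than with one (21.2 vs 3.9 seconds per 10,000 rows).
Batch processing narrowed the gap between transactional and non‑transactional modes (1.9 vs 2.1 seconds per 100,000 rows).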
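Because the measurements above use different row counts (10,000 vs 100,000), they are easier to compare once converted to rows per second. The following small sketch does that arithmetic with the timings reported in this article; the class and method names are hypothetical.

```java
/** Converts the article's reported timings into rows per second for comparison. */
final class ThroughputSketch {

    /** Rows inserted per second for a run of the given size and duration. */
    static double rowsPerSecond(long rows, double seconds) {
        return rows / seconds;
    }

    public static void main(String[] args) {
        String[] labels = {
            "MyBatis, no transaction",
            "JDBC, no transaction",
            "JDBC, with transaction",
            "JDBC batch, no transaction",
            "JDBC batch, with transaction"
        };
        long[] rows = {10_000, 10_000, 10_000, 100_000, 100_000};
        double[] seconds = {28.6, 21.2, 3.9, 2.1, 1.9}; // figures reported in the article

        for (int i = 0; i < labels.length; i++) {
            System.out.printf("%-30s %.0f rows/s%n",
                    labels[i], rowsPerSecond(rows[i], seconds[i]));
        }
    }
}
```

With these inputs, MyBatis comes out at roughly 350 rows per second and JDBC batch with a transaction at roughly 52,600 rows per second.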
Recommendation: for large‑scale data loading, enable both batch mode and explicit transaction management.
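One way to combine both recommendations is sketched below: rows are queued with addBatch(), flushed with executeBatch() every chunkSize rows, and each chunk is committed as one transaction. This is a minimal sketch rather than the author's code. The class name, the load method, its parameters, and the placeholder column values are assumptions made for illustration; only standard java.sql calls are used.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical helper combining JDBC batching with one commit per chunk. */
final class BatchLoaderSketch {

    /**
     * Inserts ids in [startId, startId + total) into person, flushing and
     * committing every chunkSize rows. Returns the number of commits issued.
     */
    static int load(Connection conn, long startId, long total, int chunkSize)
            throws SQLException {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // explicit transaction management
        int commits = 0;
        String sql = "INSERT INTO person VALUES (?,?,?,?,?,?,?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            long endId = startId + total;
            int pending = 0;
            for (long id = startId; id < endId; id++) {
                ps.setLong(1, id);
                // Placeholder values; a real run would use generated test data
                ps.setString(2, "name-" + id);
                ps.setString(3, "M");
                ps.setInt(4, 30);
                ps.setString(5, "user" + id + "@example.com");
                ps.setString(6, "10000000000");
                ps.setString(7, "Example Road No. 1");
                ps.addBatch();
                if (++pending == chunkSize) {
                    ps.executeBatch();
                    conn.commit();
                    commits++;
                    pending = 0;
                }
            }
            if (pending > 0) { // flush the final partial chunk
                ps.executeBatch();
                conn.commit();
                commits++;
            }
        } catch (SQLException e) {
            conn.rollback(); // discard the uncommitted chunk
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
        return commits;
    }
}
```

Committing per chunk rather than once at the very end keeps each transaction bounded in size. The right chunk size depends on the server configuration and is worth measuring rather than assuming.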
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
