Efficient Import and Export of Millions of Records Using POI and EasyExcel in Java
This article explains how to handle massive Excel‑DB import/export tasks in Java by comparing POI workbook types, selecting the right implementation, and leveraging EasyExcel with batch queries, sheet splitting, and JDBC batch inserts to process over three million rows efficiently.
In many projects, data must be imported from Excel into a database or exported from a database to Excel; with millions of rows, a naive implementation causes memory overflow and poor performance.
1. Traditional POI versions and their pros/cons
POI provides three main workbook implementations:
HSSFWorkbook: targets Excel 2003 (.xls) files; a sheet holds at most 65,536 rows, so memory overflow is rarely an issue, but the format cannot hold large datasets.
XSSFWorkbook: supports Excel 2007+ (.xlsx) files with up to 1,048,576 rows per sheet, but keeps the entire workbook in memory, so large datasets can trigger OutOfMemoryError.
SXSSFWorkbook: introduced in POI 3.8; streams rows to a temporary file on disk, keeping only a sliding window of rows in memory, which suits very large files, but it does not support features such as sheet cloning, formula evaluation, or modifying rows that have already been flushed.
2. Choosing the appropriate workbook
Use HSSFWorkbook or XSSFWorkbook when the total row count stays below roughly 70,000 (remembering that .xls sheets cap out at 65,536 rows). For larger datasets without complex styling, SXSSFWorkbook is recommended. If styling, formulas, or header manipulation are required for large data, split the work into multiple XSSFWorkbook batches.
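The selection rule above can be encoded as a small helper. This is an illustrative sketch only; the class name, method, and return labels are this article's convention, not part of POI's API, and the 70,000-row threshold is the rule of thumb stated above.

```java
public class WorkbookChooser {

    // Encodes the rule of thumb above: small data -> in-memory workbooks,
    // large plain data -> streaming SXSSF, large styled data -> batched XSSF.
    public static String choose(int totalRows, boolean needsStylingOrFormulas) {
        if (totalRows < 70_000) {
            return "HSSF/XSSF";      // fits comfortably in memory
        }
        if (!needsStylingOrFormulas) {
            return "SXSSF";          // streaming write keeps memory flat
        }
        return "XSSF-batched";       // split across multiple XSSFWorkbook files
    }

    public static void main(String[] args) {
        System.out.println(choose(50_000, false));    // HSSF/XSSF
        System.out.println(choose(3_000_000, false)); // SXSSF
        System.out.println(choose(3_000_000, true));  // XSSF-batched
    }
}
```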
3. Exporting 3 million rows with EasyExcel
EasyExcel simplifies large‑scale export. The main steps are:
Batch query the database (e.g., 200,000 rows per query).
Write each batch to a sheet; when a sheet reaches 1,000,000 rows, start a new sheet.
Calculate the number of sheets and write loops accordingly.
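The sheet and batch counts in the steps above are plain ceiling arithmetic. A minimal sketch using the article's constants (1,000,000 rows per sheet, 200,000 rows per paged query); the class and method names are illustrative:

```java
public class SheetMath {
    static final int PER_SHEET_ROW_COUNT = 1_000_000; // rows per sheet
    static final int PER_WRITE_ROW_COUNT = 200_000;   // rows per paged query

    // Number of sheets needed for totalCount rows (ceiling division)
    static int sheetNum(int totalCount) {
        return (totalCount + PER_SHEET_ROW_COUNT - 1) / PER_SHEET_ROW_COUNT;
    }

    // Paged writes needed to fill one full sheet
    static int writesPerFullSheet() {
        return PER_SHEET_ROW_COUNT / PER_WRITE_ROW_COUNT;
    }

    // Paged writes needed for the (possibly partial) last sheet
    static int writesForLastSheet(int totalCount) {
        int remainder = totalCount % PER_SHEET_ROW_COUNT;
        if (remainder == 0) return writesPerFullSheet();
        return (remainder + PER_WRITE_ROW_COUNT - 1) / PER_WRITE_ROW_COUNT;
    }

    public static void main(String[] args) {
        // 3,000,000 rows -> 3 sheets, 5 writes per sheet, 5 on the last sheet
        System.out.println(sheetNum(3_000_000));           // 3
        System.out.println(writesPerFullSheet());          // 5
        System.out.println(writesForLastSheet(3_000_000)); // 5
    }
}
```

For 3,000,000 rows this yields 3 sheets of 5 writes each; for 3,000,001 rows it yields a fourth sheet with a single partial write.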
Key export code:
public void dataExport300w(HttpServletResponse response) {
    OutputStream outputStream = null;
    try {
        long startTime = System.currentTimeMillis();
        System.out.println("Export start time: " + startTime);

        // Set response headers before writing any data to the stream
        String fileName = new String("excel100w".getBytes(), "UTF-8");
        response.setContentType("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
        response.setCharacterEncoding("utf-8");
        response.setHeader("Content-Disposition", "attachment;filename=" + new String(fileName.getBytes("gb2312"), "ISO-8859-1") + ".xlsx");

        outputStream = response.getOutputStream();
        ExcelWriter writer = new ExcelWriter(outputStream, ExcelTypeEnum.XLSX);

        Table table = new Table(1);
        List<List<String>> titles = new ArrayList<>();
        titles.add(Arrays.asList("onlineseqid"));
        titles.add(Arrays.asList("businessid"));
        // ... other column titles ...
        table.setHead(titles);

        int count = 3000001;
        Integer totalCount = actResultLogMapper.findActResultLogByCondations(count);
        Integer sheetDataRows = ExcelConstants.PER_SHEET_ROW_COUNT; // 1,000,000 rows per sheet
        Integer writeDataRows = ExcelConstants.PER_WRITE_ROW_COUNT; // 200,000 rows per paged query
        // Number of sheets (ceiling division)
        Integer sheetNum = totalCount % sheetDataRows == 0 ? (totalCount / sheetDataRows) : (totalCount / sheetDataRows + 1);
        // Paged writes per full sheet
        Integer oneSheetWriteCount = sheetDataRows / writeDataRows;
        // Paged writes for the (possibly partial) last sheet
        Integer lastSheetWriteCount = totalCount % sheetDataRows == 0
                ? oneSheetWriteCount
                : (totalCount % sheetDataRows % writeDataRows == 0
                        ? (totalCount % sheetDataRows / writeDataRows)
                        : (totalCount % sheetDataRows / writeDataRows + 1));

        List<List<Object>> dataList = new ArrayList<>();
        for (int i = 0; i < sheetNum; i++) {
            Sheet sheet = new Sheet(i, 0);
            sheet.setSheetName("TestSheet" + i);
            for (int j = 0; j < (i != sheetNum - 1 ? oneSheetWriteCount : lastSheetWriteCount); j++) {
                dataList.clear();
                // Page number is global across sheets
                PageHelper.startPage(j + 1 + oneSheetWriteCount * i, writeDataRows);
                List<ActResultLog> resultList = actResultLogMapper.findByPage100w();
                if (!CollectionUtils.isEmpty(resultList)) {
                    resultList.forEach(item -> {
                        dataList.add(Arrays.asList(item.getOnlineseqid(), item.getBusinessid(), item.getBecifno(),
                                item.getIvisresult(), item.getCreatedby(), Calendar.getInstance().getTime().toString(),
                                item.getUpdateby(), Calendar.getInstance().getTime().toString(), item.getRisklevel()));
                    });
                }
                writer.write0(dataList, sheet, table);
            }
        }
        writer.finish();
        outputStream.flush();
        long endTime = System.currentTimeMillis();
        System.out.println("Export end time: " + endTime + "ms");
        System.out.println("Export took: " + (endTime - startTime) / 1000 + "s");
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        if (outputStream != null) {
            try { outputStream.close(); } catch (Exception e) { e.printStackTrace(); }
        }
    }
}

The test exported 3 million rows in about 2 minutes 15 seconds, producing a 163 MB file.
4. Test environment and database
The database used was Oracle 19c (MySQL would give similar results for < 100 million rows). The test machine specifications are shown in the attached screenshots.
Sample DDL:
create table ACT_RESULT_LOG (
onlineseqid VARCHAR2(32),
businessid VARCHAR2(32),
becifno VARCHAR2(32),
ivisresult VARCHAR2(32),
createdby VARCHAR2(32),
createddate DATE,
updateby VARCHAR2(32),
updateddate DATE,
risklevel VARCHAR2(32)
) tablespace STUDY_KAY pctfree 10 initrans 1 maxtrans 255;

5. Importing 3 million rows with EasyExcel
Import is performed by reading the Excel file in batches (e.g., 200,000 rows) and inserting each batch into the database using JDBC batch statements wrapped in a transaction.
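The buffering pattern described above — accumulate rows, flush every N — can be isolated as a plain-Java sketch, mirroring what the EasyExcel listener does with the database service. The `BatchBuffer` class and its flush callback are illustrative, not EasyExcel API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative buffer: collects rows and hands a full batch to a sink,
// the same shape as the listener's invoke()/doAfterAllAnalysed() pair.
public class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> buffer = new ArrayList<>();

    public BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Called once per row, like the listener's invoke()
    public void add(T row) {
        buffer.add(row);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Must be called once after the last row, like doAfterAllAnalysed()
    public void flush() {
        if (!buffer.isEmpty()) {
            sink.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    public static void main(String[] args) {
        BatchBuffer<String> demo = new BatchBuffer<>(2, b -> System.out.println("flushed " + b.size() + " rows"));
        demo.add("a");
        demo.add("b"); // prints "flushed 2 rows"
        demo.add("c");
        demo.flush(); // prints "flushed 1 rows"
    }
}
```

With a batch size of 200,000, the sink fires 15 times for 3 million rows, which keeps heap usage bounded regardless of file size.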
Key listener code:
@Test
public void import2DBFromExcel10wTest() {
    String fileName = "D:\\StudyWorkspace\\JavaWorkspace\\java_project_workspace\\idea_projects\\SpringBootProjects\\easyexcel\\exportFile\\excel300w.xlsx";
    long startReadTime = System.currentTimeMillis();
    System.out.println("------Start reading the Excel sheets (including DB import): " + startReadTime + "ms------");
    EasyExcel.read(fileName, new EasyExceGeneralDatalListener(actResultLogService2)).doReadAll();
    long endReadTime = System.currentTimeMillis();
    System.out.println("------Finished reading the Excel sheets (including DB import): " + endReadTime + "ms------");
}

public class EasyExceGeneralDatalListener extends AnalysisEventListener<Map<Integer, String>> {
    private ActResultLogService2 actResultLogService2;
    // Rows buffered in memory before each batch insert
    private List<Map<Integer, String>> dataList = new ArrayList<>();

    public EasyExceGeneralDatalListener() {}

    public EasyExceGeneralDatalListener(ActResultLogService2 actResultLogService2) {
        this.actResultLogService2 = actResultLogService2;
    }

    @Override
    public void invoke(Map<Integer, String> data, AnalysisContext context) {
        dataList.add(data);
        if (dataList.size() >= ExcelConstants.GENERAL_ONCE_SAVE_TO_DB_ROWS) {
            saveData();
        }
    }

    private void saveData() {
        actResultLogService2.import2DBFromExcel10w(dataList);
        dataList.clear();
    }

    @Override
    public void doAfterAllAnalysed(AnalysisContext context) {
        // Flush any rows remaining after the last full batch
        saveData();
    }
}

JDBC utility class (simplified):
public class JDBCDruidUtils {
    private static DataSource dataSource;

    static {
        Properties pro = new Properties();
        try {
            pro.load(JDBCDruidUtils.class.getClassLoader().getResourceAsStream("druid.properties"));
            dataSource = DruidDataSourceFactory.createDataSource(pro);
        } catch (Exception e) { e.printStackTrace(); }
    }

    public static Connection getConnection() throws SQLException { return dataSource.getConnection(); }

    public static void close(Connection connection, Statement statement) {
        // Close the statement before the connection
        if (statement != null) try { statement.close(); } catch (SQLException e) { e.printStackTrace(); }
        if (connection != null) try { connection.close(); } catch (SQLException e) { e.printStackTrace(); }
    }

    public static void close(Connection connection, Statement statement, ResultSet resultSet) {
        if (resultSet != null) try { resultSet.close(); } catch (SQLException e) { e.printStackTrace(); }
        close(connection, statement);
    }

    public static DataSource getDataSource() { return dataSource; }
}

Service method that performs the batch insert:
@Override
public Map<String, Object> import2DBFromExcel10w(List<Map<Integer, String>> dataList) {
    HashMap<String, Object> result = new HashMap<>();
    if (dataList.isEmpty()) { result.put("empty", "0000"); return result; }
    Connection conn = null;
    PreparedStatement ps = null;
    try {
        long startTime = System.currentTimeMillis();
        System.out.println(dataList.size() + " rows, DB import start time: " + startTime + "ms");
        conn = JDBCDruidUtils.getConnection();
        conn.setAutoCommit(false); // one transaction per batch
        String sql = "insert into ACT_RESULT_LOG (onlineseqid,businessid,becifno,ivisresult,createdby,createddate,updateby,updateddate,risklevel) values (?,?,?,?,?,?,?,?,?)";
        ps = conn.prepareStatement(sql);
        for (int i = 0; i < dataList.size(); i++) {
            Map<Integer, String> item = dataList.get(i);
            ps.setString(1, item.get(0));
            ps.setString(2, item.get(1));
            ps.setString(3, item.get(2));
            ps.setString(4, item.get(3));
            ps.setString(5, item.get(4));
            ps.setTimestamp(6, new Timestamp(System.currentTimeMillis()));
            ps.setString(7, item.get(6));
            ps.setTimestamp(8, new Timestamp(System.currentTimeMillis()));
            ps.setString(9, item.get(8));
            ps.addBatch();
        }
        ps.executeBatch();
        conn.commit();
        long endTime = System.currentTimeMillis();
        System.out.println(dataList.size() + " rows, DB import end time: " + endTime + "ms");
        System.out.println(dataList.size() + " rows, import took: " + (endTime - startTime) + "ms");
        result.put("success", "1111");
    } catch (Exception e) {
        result.put("exception", "0000");
        e.printStackTrace();
    } finally {
        JDBCDruidUtils.close(conn, ps);
    }
    return result;
}

Performance results:
Reading 3 million rows with EasyExcel took about 82.9 seconds.
Inserting the same amount using JDBC batch + transaction took about 8.2 seconds.
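One caveat when reproducing these numbers on MySQL: Connector/J only collapses a JDBC batch into multi-row INSERT statements when `rewriteBatchedStatements=true` is set on the connection URL. A sketch of the `druid.properties` file the utility class loads; the property keys follow Druid's DruidDataSourceFactory conventions, and the host, schema, and credentials are placeholders:

```properties
# druid.properties (sketch; host/schema/credentials are placeholders)
driverClassName=com.mysql.cj.jdbc.Driver
# Without rewriteBatchedStatements=true the driver sends each addBatch() row
# as a separate statement and batch inserts are dramatically slower.
url=jdbc:mysql://localhost:3306/study?rewriteBatchedStatements=true&useSSL=false
username=root
password=secret
maxActive=20
```

Oracle's JDBC driver batches prepared statements natively, so no equivalent flag was needed in this test.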
Log excerpt (truncated):
------Start reading the Excel sheets (including DB import): 1623127873630ms------
200000 rows, DB import start time: 1623127880632ms
200000 rows, DB import end time: 1623127881513ms
200000 rows, import took: 881ms
... (repeated for each batch) ...
------Finished reading the Excel sheets (including DB import): 1623127964725ms------

6. Conclusion
Using EasyExcel together with batched database operations and proper sheet management makes it feasible to import and export millions of rows within acceptable time frames, providing a practical approach to large-scale data handling in Java backend systems.