Efficient Import and Export of Millions of Records Using Apache POI and EasyExcel
This article explains how to handle massive Excel import and export tasks in Java by comparing traditional POI implementations, selecting the appropriate Workbook type based on data volume, and leveraging Alibaba's EasyExcel library together with batch JDBC operations to process over three million rows with minimal memory usage and high performance.
Foreword
In many projects we need to import data from Excel into a database and export data from a database to Excel. The author encountered a situation where millions of rows had to be processed, and decided to solve the problem rather than avoid it.
1. Comparison of Traditional POI Versions
Apache POI provides three main Workbook implementations:
HSSFWorkbook : works with the old .xls format (Excel 2003 and earlier) and is limited to 65,536 rows per sheet. At these sizes memory is rarely a problem.
XSSFWorkbook : supports the .xlsx format (Excel 2007 and later) and can handle up to 1,048,576 rows per sheet, but keeps the entire workbook in memory, so large datasets can trigger OutOfMemoryError.
SXSSFWorkbook : introduced in POI 3.8, streams rows to temporary files so only a sliding window is held in memory, which keeps memory usage low for very large files; the trade-off is that some features are unavailable, such as sheet cloning, formula evaluation, and modifying rows that have already been flushed to disk.
When to Use Which
If the data volume is below 70,000 rows, HSSFWorkbook or XSSFWorkbook is sufficient.
If the volume exceeds 70,000 rows and no Excel styling or formulas are required, use SXSSFWorkbook .
If the volume exceeds 70,000 rows and you need to manipulate headers, styles, or formulas, use XSSFWorkbook together with batch writing.
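The selection rules above can be condensed into a small helper. This is an illustrative sketch, not a POI API: the class and method names are hypothetical, and the 70,000-row threshold is the rule of thumb from the text.

```java
// Hypothetical helper condensing the selection rules above;
// the thresholds mirror the article's guidance, not anything in POI itself.
public class WorkbookChooser {
    static final int XLS_MAX_ROWS = 65_536;        // hard limit of the legacy .xls format
    static final int SAFE_IN_MEMORY_ROWS = 70_000; // rule-of-thumb threshold from the text

    static String chooseWorkbook(int rows, boolean needsStylesOrFormulas) {
        if (rows <= XLS_MAX_ROWS) {
            return "HSSFWorkbook";  // small dataset, fits the legacy .xls format
        }
        if (rows <= SAFE_IN_MEMORY_ROWS || needsStylesOrFormulas) {
            return "XSSFWorkbook";  // full feature set, but everything stays in memory
        }
        return "SXSSFWorkbook";     // streaming write, low memory footprint
    }
}
```

The `needsStylesOrFormulas` flag captures the third rule: beyond the safe threshold, choosing XSSFWorkbook is only justified when SXSSFWorkbook's missing features are actually needed, and should then be combined with batch writing.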
2. Million‑Row Import/Export Solution
The author chose Alibaba's EasyExcel to simplify the implementation. The solution consists of three parts:
Batch query the database (e.g., 200,000 rows per batch) to avoid loading the whole result set into memory.
Write each batch to a separate sheet (each sheet holds up to 1,000,000 rows).
For import, read the Excel file in batches (e.g., 200,000 rows) and insert the data into the database using JDBC batch statements inside a manual transaction.
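The partitioning in the first two steps reduces to a bit of ceiling arithmetic: how many sheets are needed, and how many 200,000-row writes go into a full sheet versus the last, possibly partial, sheet. A minimal sketch with the constants from this article (class and method names are illustrative):

```java
// Sheet/batch partition arithmetic for the export: 1,000,000 rows per sheet,
// written in chunks of 200,000 rows per database query.
public class ExportPartition {
    static final int PER_SHEET_ROW_COUNT = 1_000_000;
    static final int PER_WRITE_ROW_COUNT = 200_000;

    // Number of sheets needed for totalRows (ceiling division).
    static int sheetCount(int totalRows) {
        return (totalRows + PER_SHEET_ROW_COUNT - 1) / PER_SHEET_ROW_COUNT;
    }

    // Number of 200,000-row writes that fill one full sheet (5).
    static int writesPerFullSheet() {
        return PER_SHEET_ROW_COUNT / PER_WRITE_ROW_COUNT;
    }

    // Number of writes needed for the final sheet, which may be partial.
    static int writesForLastSheet(int totalRows) {
        int lastSheetRows = totalRows % PER_SHEET_ROW_COUNT;
        if (lastSheetRows == 0) {
            return writesPerFullSheet(); // last sheet is exactly full
        }
        return (lastSheetRows + PER_WRITE_ROW_COUNT - 1) / PER_WRITE_ROW_COUNT;
    }
}
```

For the 3,000,001 rows used in the export example this yields 4 sheets, with 5 writes for each full sheet and a single write for the 1-row final sheet.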
EasyExcel Export Example
public void dataExport300w(HttpServletResponse response) {
OutputStream outputStream = null;
try {
long startTime = System.currentTimeMillis();
outputStream = response.getOutputStream();
ExcelWriter writer = new ExcelWriter(outputStream, ExcelTypeEnum.XLSX);
String fileName = new String("excel100w".getBytes(), "UTF-8");
Table table = new Table(1);
List<List<String>> titles = new ArrayList<>();
titles.add(Arrays.asList("onlineseqid"));
titles.add(Arrays.asList("businessid"));
// ... other column titles omitted for brevity ...
table.setHead(titles);
int count = 3000001;
Integer totalCount = actResultLogMapper.findActResultLogByCondations(count);
Integer sheetDataRows = ExcelConstants.PER_SHEET_ROW_COUNT; // 1,000,000
Integer writeDataRows = ExcelConstants.PER_WRITE_ROW_COUNT; // 200,000
Integer sheetNum = totalCount % sheetDataRows == 0 ? totalCount / sheetDataRows : totalCount / sheetDataRows + 1;
Integer oneSheetWriteCount = sheetDataRows / writeDataRows;
Integer lastSheetWriteCount = totalCount % sheetDataRows == 0 ? oneSheetWriteCount : (totalCount % sheetDataRows % writeDataRows == 0 ? (totalCount % sheetDataRows / writeDataRows) : (totalCount % sheetDataRows / writeDataRows + 1));
List<List<String>> dataList = new ArrayList<>();
for (int i = 0; i < sheetNum; i++) {
Sheet sheet = new Sheet(i, 0);
sheet.setSheetName("TestSheet" + i);
for (int j = 0; j < (i != sheetNum - 1 ? oneSheetWriteCount : lastSheetWriteCount); j++) {
dataList.clear();
PageHelper.startPage(j + 1 + oneSheetWriteCount * i, writeDataRows);
List<ActResultLog> resultList = actResultLogMapper.findByPage100w();
if (!CollectionUtils.isEmpty(resultList)) {
resultList.forEach(item -> dataList.add(Arrays.asList(item.getOnlineseqid(), item.getBusinessid(), item.getBecifno(), item.getIvisresult(), item.getCreatedby(), Calendar.getInstance().getTime().toString(), item.getUpdateby(), Calendar.getInstance().getTime().toString(), item.getRisklevel())));
}
writer.write0(dataList, sheet, table);
}
}
// note: in a real servlet, response headers must be set before any data is written to the output stream
response.setHeader("Content-Disposition", "attachment;filename=" + new String(fileName.getBytes("gb2312"), "ISO-8859-1") + ".xlsx");
response.setContentType("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
response.setCharacterEncoding("utf-8");
writer.finish();
outputStream.flush();
long endTime = System.currentTimeMillis();
System.out.println("Export time:" + (endTime - startTime) / 1000 + " seconds");
} catch (Exception e) {
e.printStackTrace();
} finally {
if (outputStream != null) {
try { outputStream.close(); } catch (Exception e) { e.printStackTrace(); }
}
}
}
EasyExcel Import Example
public void import2DBFromExcel10w(List<Map<Integer, String>> dataList) {
if (dataList.isEmpty()) return;
Connection conn = null;
PreparedStatement ps = null;
try {
long start = System.currentTimeMillis();
conn = JDBCDruidUtils.getConnection();
conn.setAutoCommit(false);
String sql = "insert into ACT_RESULT_LOG (onlineseqid,businessid,becifno,ivisresult,createdby,createddate,updateby,updateddate,risklevel) values (?,?,?,?,?,?,?,?,?)";
ps = conn.prepareStatement(sql);
for (Map<Integer, String> item : dataList) {
ps.setString(1, item.get(0));
ps.setString(2, item.get(1));
ps.setString(3, item.get(2));
ps.setString(4, item.get(3));
ps.setString(5, item.get(4));
ps.setTimestamp(6, new Timestamp(System.currentTimeMillis()));
ps.setString(7, item.get(6));
ps.setTimestamp(8, new Timestamp(System.currentTimeMillis()));
ps.setString(9, item.get(8));
ps.addBatch();
}
ps.executeBatch();
conn.commit();
long end = System.currentTimeMillis();
System.out.println("Import time:" + (end - start) + " ms");
} catch (Exception e) {
e.printStackTrace();
} finally {
JDBCDruidUtils.close(conn, ps);
}
}
Test Results
Exporting 3,000,000 rows to Excel took about 2 minutes 15 seconds using EasyExcel with SXSSF‑style streaming. Importing the same amount of data with batch JDBC and manual transaction took roughly 8.2 seconds, while reading the Excel file itself required about 82.9 seconds.
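The 82.9-second read phase is driven by EasyExcel handing parsed rows to a listener that accumulates them and flushes a full batch to the importer. The accumulate-and-flush pattern behind it can be sketched in plain Java; `BatchingListener` and its `Consumer` callback are illustrative names, not EasyExcel's actual API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Plain-Java sketch of the accumulate-and-flush pattern an EasyExcel read
// listener follows: buffer rows, hand off each full batch, clear, repeat.
public class BatchingListener<T> {
    private final int batchSize;
    private final Consumer<List<T>> flushTarget; // e.g. the JDBC batch insert above
    private final List<T> buffer = new ArrayList<>();

    public BatchingListener(int batchSize, Consumer<List<T>> flushTarget) {
        this.batchSize = batchSize;
        this.flushTarget = flushTarget;
    }

    // Called once per parsed row (EasyExcel calls this per-row hook "invoke").
    public void onRow(T row) {
        buffer.add(row);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Called after the last row so a final partial batch is not lost.
    public void onAllRowsRead() {
        if (!buffer.isEmpty()) {
            flush();
        }
    }

    private void flush() {
        flushTarget.accept(new ArrayList<>(buffer)); // hand off a copy
        buffer.clear();                              // reuse the buffer, keeping memory flat
    }
}
```

With a batch size of 200,000 this is why memory stays flat during import: at most one batch of rows is ever buffered, regardless of how large the file is.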
Conclusion
The combination of EasyExcel for streaming write/read and JDBC batch operations for database interaction provides a practical solution for handling million-scale data import/export in Java back-ends, keeping memory consumption low and performance acceptable.