Export Millions of Records to Excel Efficiently with SpringBoot, EasyExcel, and Async
This article explains how to build a high‑performance, million‑row Excel export feature in Java using SpringBoot, EasyExcel, asynchronous job or MQ processing, pagination, multi‑sheet handling, OSS storage, and WebSocket notifications to avoid timeouts, OOM errors, and improve user experience.
Introduction
Recently I implemented a MySQL million‑level Excel export feature that is now live. The function allows users to click a "Export All" button on the UI and download all product data, but exporting hundreds of thousands or millions of rows presents many challenges.
Synchronously exporting data can cause interface timeouts.
Loading all data into memory may trigger OOM.
Large SQL queries become very slow.
Records with the same product code need to be grouped together.
When using asynchronous processing, the user must be notified of the export result.
If the generated Excel file is too large, the user may be unable to open it.
1. Asynchronous Processing
Because synchronous export easily times out, the first design choice is to use asynchronous processing. Common async approaches include using a thread, thread pool, job, or message queue (MQ).
1.1 Using Job
A job table records each export task. When the user clicks the export button, the backend creates a record with status Pending. A scheduled job scans the table every few minutes, picks up pending tasks, updates the status to Running, performs the export, and finally sets the status to Completed or Failed. Adding a Running status prevents duplicate execution when a job takes longer than its interval.
1.2 Using MQ
Alternatively, the export request can send a message to an MQ server. A dedicated consumer processes the message and generates the Excel file. MQ provides better real‑time performance and built‑in retry and dead‑letter mechanisms (e.g., RocketMQ).
2. Using EasyExcel
Traditional frameworks like Apache POI or JXL consume a lot of memory when handling large files. EasyExcel rewrites POI’s 07‑format parser, loading data row by row from disk, reducing memory usage from ~100 MB to a few megabytes for a 3 MB Excel file.
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>easyexcel</artifactId>
<version>3.0.2</version>
</dependency>Reading data:
@Test
public void simpleRead() {
String fileName = TestFileUtil.getPath() + "demo" + File.separator + "demo.xlsx";
// Specify the data class and listener, then read the first sheet
EasyExcel.read(fileName, DemoData.class, new DemoDataListener())
.sheet()
.doRead();
}Writing data:
@Test
public void simpleWrite() {
String fileName = TestFileUtil.getPath() + "write" + System.currentTimeMillis() + ".xlsx";
// Write data to the first sheet named "Template"
EasyExcel.write(fileName, DemoData.class)
.sheet("Template")
.doWrite(data());
}EasyExcel avoids loading the entire file into memory by streaming rows from disk.
3. Pagination Query
Fetching all rows at once is time‑consuming and may cause OOM. Instead, use pagination (e.g., 5,000 rows per page) and MyBatis limit statements: limit #{pageStart}, #{pageSize} Calculate pageStart as (pageNo - 1) * pageSize. The service returns a Page<User> object containing the current list and total count.
4. Multiple Sheets
Excel limits a single sheet to 1,048,576 rows. For data exceeding this, split the output across multiple sheets.
5. Calculating Limit Offsets for Multiple Sheets
When exporting to several sheets, compute the start and end page numbers for each sheet and query the appropriate page range:
ExcelWriter excelWriter = EasyExcelFactory.write(out).build();
int totalPage = searchUserTotalPage(searchModel);
if (totalPage > 0) {
Page<User> page = Page.create(searchModel);
int sheetCount = (totalPage % maxSheetCount == 0) ? totalPage / maxSheetCount : (totalPage / maxSheetCount) + 1;
for (int i = 0; i < sheetCount; i++) {
WriterSheet writeSheet = buildSheet(i, "sheet" + i);
int startPageNo = i * (maxSheetCount / pageSize) + 1;
int endPageNo = (i + 1) * (maxSheetCount / pageSize);
while (page.getPageNo() >= startPageNo && page.getPageNo() <= endPageNo) {
page = searchUser(searchModel);
if (CollectionUtils.isEmpty(page.getList())) {
break;
}
excelWriter.write(page.getList(), writeSheet);
page.setPageNo(page.getPageNo() + 1);
}
}
}6. Uploading the File to OSS
Since the export runs asynchronously, the generated Excel file is first uploaded to an Object Storage Service (OSS). The OSS response provides the file name and access URL, which are stored in a database table. Users can later download the file via the URL, avoiding large files being kept on the application server.
7. Real‑Time Notification via WebSocket
After the file is uploaded, a WebSocket message pushes a notification to the user’s browser, indicating success or failure and providing a download link. The notification record includes title, user, attachment URL, read status, and type, enabling traceability.
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-websocket</artifactId>
</dependency>8. Configurable Total Record Count
The total number of records to export and the page size should be configurable. This allows administrators to limit the export size (e.g., 500,000 rows) to ensure the resulting Excel file remains usable.
9. Ordering by Product Code
To keep rows with the same product code together, add an ORDER BY product_code clause in the SQL. When pagination splits the data, ensure the last incomplete product group is removed to avoid partial records.
Conclusion
Both job‑based and MQ‑based asynchronous export approaches are viable; choose based on latency requirements and system characteristics.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
macrozheng
Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
