How a Massive Excel Import Triggered OOM in Our MQ Consumer—and the Fix
This article walks through a real‑world OOM incident in an MQ consumer caused by large Excel import/export, explains how memory dumps and Prometheus logs pinpointed the culprit, and shows how switching from XSSFWorkbook to SXSSFWorkbook and tuning the thread pool resolved the issue.
Introduction
Hello everyone, I’m Su San and I’d like to share an interesting OOM problem we encountered in an online service.
1. Incident Scene
Our MQ consumer service suffered an OOM crash that brought the service down. We received many memory alarm emails, and operations quickly dumped the memory snapshot and restarted the service to restore availability.
2. Initial Investigation
The dumped memory snapshot was over 3 GB, too large to transfer internally, so we turned to log files. Prometheus metrics showed a sudden memory spike around 2022-09-26 14:16:29. By focusing on logs within five seconds of that timestamp, we quickly identified the Excel import/export feature as the source of the spike.
The workflow is illustrated below:
User uploads an Excel file via the browser, invoking the upload API.
The API stores the file on a file server and sends the file URL to MQ.
The MQ consumer retrieves the Excel data, processes it, and writes results to a new Excel file.
The new Excel is uploaded back to the file server, and a WebSocket message notifies the user.
The user downloads the new Excel.
3. Unable to Open the Dump File
Our colleague tried to open the 3 GB dump with MAT (Memory Analyzer Tool) but failed. After increasing the JVM max heap to 4096 m, the file opened, revealing a JDK version mismatch: the MAT tool was built for SunJDK while our production environment runs OpenJDK.
We needed an OpenJDK‑compatible MAT build.
4. Further Analysis
Using the compatible MAT, we discovered that org.apache.poi.xssf.usermodel.XSSFSheet objects consumed the most memory. The Excel handling relies on Apache POI, which provides three Workbook implementations: HSSFWorkbook: Supports .xls (Excel 2003‑) files, limited to 65 535 rows, safe for memory. XSSFWorkbook: Supports .xlsx (Excel 2003‑2007) files, can handle up to 1.04 million rows but creates many objects in memory, leading to OOM. SXSSFWorkbook: Streaming version of XSSFWorkbook; keeps only a configurable number of rows in memory and writes older rows to disk, preventing OOM.
Key code that caused the issue:
XSSFWorkbook wb = new XSSFWorkbook(new FileInputStream(file));
XSSFSheet sheet = wb.getSheetAt(0);5. How to Solve the Problem
Replacing XSSFWorkbook with SXSSFWorkbook solves the memory overflow:
XSSFWorkbook wb = new XSSFWorkbook(new FileInputStream(file));
SXSSFWorkbook swb = new SXSSFWorkbook(wb, 100);
SXSSFSheet sheet = (SXSSFSheet) swb.createSheet("sheet1");The second argument (100) limits the number of rows kept in memory; the rest are flushed to temporary files. After processing, call:
sheet.flushRows();Also remember to close the workbook to release resources.
When using any Workbook implementation, always invoke close() to avoid hidden OOM risks.
6. Further Thoughts
We considered Alibaba’s EasyExcel, but preserving complex styles required the original POI API. To mitigate OOM under high concurrency, we reduced the MQ consumer thread pool to four threads and suggested using Arthas for live OOM diagnosis.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
