Mastering Efficient File Downloads and Large-Scale Excel Export in Node.js
This article explains how to implement simple and streaming file downloads with Node.js and Koa, handle content disposition and progress display, support resumable downloads, and efficiently generate large Excel reports using ExcelJS with chunked queries and streaming to overcome memory bottlenecks.
Introduction
Recently I optimized a performance bottleneck in a dynamic-table file download endpoint written by junior engineers, and it seemed worth documenting the solution to help others write more professional code.
HTTP File Download
Before diving into specific problems, it is useful to understand some HTTP basics and see how to implement file download with Node.js and Koa.
Simple Download
The simplest case is when the file already exists on the server; the client request can read the file and send it back directly.
<code>import Koa from 'koa';
import Router from 'koa-router';
import fs from 'fs';
import fsp from 'fs/promises';
import { PassThrough } from 'stream';
// Note: __dirname assumes a CommonJS build (or an equivalent shim under ESM).

const app = new Koa();
const router = new Router();

router.get('/download/simple', async (ctx) => {
  // Read the whole file into memory, then send it in one response.
  const file = await fsp.readFile(`${__dirname}/1.txt`, 'utf-8');
  ctx.set({
    'Content-Disposition': `attachment; filename=1.txt`,
  });
  ctx.body = file;
});

app.use(router.routes());
app.listen(80);
</code>
Setting the Content-Disposition header to attachment tells the browser to download the file rather than render it.
Streaming Download
When dealing with large files, reading the whole file into memory is not feasible, so a stream is used.
<code>router.get('/download/stream', async (ctx) => {
  // Stream the file instead of buffering it entirely in memory.
  const file = fs.createReadStream(`${__dirname}/1.txt`);
  ctx.set({
    'Content-Disposition': `attachment; filename=1.txt`,
  });
  ctx.body = file;
});
</code>
Even if the Content-Disposition header is omitted, the download still occurs: for a stream body Koa sets the Content-Type to application/octet-stream, which browsers treat as an opaque binary download.
Progress Display
When the file is very large, correctly setting Content-Length allows the browser to show a progress bar. The following example simulates a slow download by sending the file in four chunks with a 3-second interval between them.
<code>router.get('/download/progress', async (ctx) => {
  const { enable } = ctx.query;
  const buffer = await fsp.readFile(`${__dirname}/1.txt`);
  const stream = new PassThrough();
  const l = buffer.length;
  const count = 4;
  const size = Math.floor(l / count);
  const writeQuarter = (i = 0) => {
    const start = i * size;
    // The last chunk absorbs the remainder when l is not divisible by count.
    const end = i === count - 1 ? l : (i + 1) * size;
    stream.write(buffer.subarray(start, end));
    if (end === l) {
      stream.end();
    } else {
      setTimeout(() => writeQuarter(i + 1), 3000);
    }
  };
  if (enable) { // any non-empty ?enable= value turns Content-Length on
    ctx.set({
      'Content-Length': `${l}`,
    });
  }
  ctx.set({
    'Content-Type': 'text/plain',
    'Content-Disposition': `attachment; filename=1.txt`,
    Connection: 'keep-alive',
  });
  ctx.body = stream;
  writeQuarter();
});
</code>
This uses a PassThrough stream instead of fs.createReadStream, so Koa cannot infer the file size or type. The file is divided into four parts, each sent after a 3-second pause, to simulate a large download. When the enable query parameter is present, Content-Length is set and the browser can display progress and remaining time; otherwise no progress bar is shown.
Resumable Download
Large files may fail to download due to unstable networks. Supporting HTTP range requests enables resumable downloads.
<code>function getStartPos(range = '') {
  // Parse a "Range: bytes=N-" request header; default to 0.
  let startPos = 0;
  if (typeof range === 'string') {
    const matches = /^bytes=([0-9]+)-$/.exec(range);
    if (matches) {
      startPos = Number(matches[1]);
    }
  }
  return startPos;
}

router.get('/download/partial', async (ctx) => {
  const range = ctx.get('range');
  const start = getStartPos(range);
  const stat = await fsp.stat(`${__dirname}/1.txt`);
  const stream = fs.createReadStream(`${__dirname}/1.txt`, {
    start,
    // Read the remainder in roughly four chunks to make the transfer observable.
    highWaterMark: Math.ceil((stat.size - start) / 4),
  });
  stream.on('data', (chunk) => {
    console.log(`Read ${chunk.length} bytes of data.`);
    // Pause between chunks to simulate a slow network.
    stream.pause();
    setTimeout(() => {
      stream.resume();
    }, 3000);
  });
  console.log(`Start Pos: ${start}.`);
  if (start === 0) {
    ctx.status = 200;
    ctx.set({
      'Accept-Ranges': 'bytes',
      'Content-Length': `${stat.size}`,
    });
  } else {
    ctx.status = 206;
    ctx.set({
      'Content-Range': `bytes ${start}-${stat.size - 1}/${stat.size}`,
    });
  }
  ctx.set({
    'Content-Type': 'application/octet-stream',
    'Content-Disposition': `attachment; filename=1.txt`,
    Connection: 'keep-alive',
  });
  ctx.body = stream;
});
</code>
Test with curl:
<code>curl -v http://127.0.0.1/download/partial -o 1.txt</code>
If the server is stopped mid-transfer, the partially downloaded file contains only the bytes received so far. Resuming the download with the -C - flag continues from the last byte:
<code>curl -v http://127.0.0.1/download/partial -o 1.txt -C -</code>
Dynamic Table Export
After covering file download basics, we look at a real problem: reading all records of a database table based on request parameters and exporting them as a spreadsheet.
Bottleneck
<code>// Controller.js
const sequelize = new Sequelize(name, user, password, {
dialect: 'mysql',
host,
port,
});
const model = sequelize.import('/path/to/model');
const { rows } = await model.findAndCountAll({
where: conditions,
attributes: ['f_user_id'],
group: 'f_user_id',
});
const list = await Promise.all(
rows.map((item) => {
const { f_user_id } = item;
const userRows = await model.findAll({
where: { ...conditions, f_user_id },
// ordering, eager loading, ...
});
// formating userRows -> userData
return userData;
})
);
const headers = ['ID', /* ... */];
const sheetData = [headers, ...list];
ctx.attachment(`${sheetName}.xlsx`);
ctx.body = await exportXlsx(sheetName, sheetData);
</code>
<code>const ExcelJS = require('exceljs');
const fs = require('fs');

module.exports = {
  exportXlsx: async (name = 'sheet', data) => {
    // Write all rows through the streaming writer into a temp file,
    // then stream that file back as the response body.
    const tempFilePath = `./xlsx/${Date.now()}.xlsx`;
    const workbook = new ExcelJS.stream.xlsx.WorkbookWriter({ filename: tempFilePath });
    const sheet = workbook.addWorksheet('My Sheet');
    const { length } = data;
    for (let i = 0; i < length; i++) {
      sheet.addRow(data[i]);
    }
    sheet.commit();
    await workbook.commit();
    return fs.createReadStream(tempFilePath);
  },
};
</code>
This typical early-stage implementation has three problems:
It ignores data volume, so memory is exhausted once the record count exceeds roughly 20,000.
It does not use the exceljs streaming API end to end: every row is materialized in memory before anything is written.
Its query logic fires one SQL query per user, all concurrently, with no limit on database load.
Optimization
Chunked Processing
The simplest strategy is to split millions of rows into batches of 10 k and process them sequentially or in parallel.
<code>let total = await model.count(/* ... */);
let page = 0;
const tasks = [];
const size = 10000;

while (total > 0) {
  // Capture the offset per batch; `page` keeps changing after the loop,
  // so computing it inside the task body would use the final value.
  const offset = size * page;
  tasks.push(() => queryModel({
    limit: size,
    offset,
  }));
  page++;
  total -= size;
}

// Run the batches one after another; async.parallelLimit(tasks, n)
// could run them with bounded concurrency instead.
await async.series(tasks);
</code>
Reduce SQL Query Count
Instead of first grouping by f_user_id and then querying each user separately, fetch all users' rows in one query with a single IN clause.
<code>model.findAll({
  where: {
    ...conditions,
    // Sequelize renders an array value as an IN (...) clause.
    f_user_id: rows.map(x => `${x.f_user_id}`),
  },
});
</code>
Streaming Processing
The exceljs library provides a streaming writer that can pipe the workbook directly to the response stream.
<code>const workbook = new Excel.stream.xlsx.WorkbookWriter(options);
const sheet = workbook.addWorksheet('My Sheet');
// ... add and commit rows as they are produced ...
ctx.body = workbook.stream;
</code>
More
Other techniques such as offline generation, caching, and zip‑stream handling of the XLSX format can further improve performance.
Conclusion
File export is a common requirement; implementing it correctly reflects professional competence.
Details such as the ZIP‑stream handling inside ExcelJS are beyond the scope of this article and can be explored separately.
Tencent IMWeb Frontend Team