Mastering Efficient File Downloads and Large-Scale Excel Export in Node.js
This article explains how to implement simple and streaming file downloads with Node.js and Koa, handle content disposition and progress display, support resumable downloads, and efficiently generate large Excel reports using ExcelJS with chunked queries and streaming to overcome memory bottlenecks.
Introduction
Recently I optimized the performance bottleneck of a dynamic table file download interface written by newcomers, and I found it necessary to document the solution to help others write more professional code.
HTTP File Download
Before diving into specific problems, it is useful to understand some HTTP basics and see how to implement file download with Node.js and Koa.
Simple Download
The simplest case is when the file already exists on the server; the client request can read the file and send it back directly.
import Koa from 'koa';
import Router from 'koa-router';
import * as fs from 'fs/promises';
const app = new Koa();
const router = new Router();
router.get('/download/simple', async (ctx) => {
const file = await fs.readFile(`${__dirname}/1.txt`, 'utf-8');
ctx.set({
'Content-Disposition': `attachment; filename=1.txt`,
});
ctx.body = file;
});
app.use(router.routes());
app.listen(80);Setting the Content-Disposition header to attachment tells the browser to download the file.
Streaming Download
When dealing with large files, reading the whole file into memory is not feasible, so a stream is used.
router.get('/download/stream', async (ctx) => {
const file = fs.createReadStream(`${__dirname}/1.txt`);
ctx.set({
'Content-Disposition': `attachment; filename=1.txt`,
});
ctx.body = file;
});If the Content-Disposition header is omitted, the download still occurs because the Content-Type is set to application/octet-stream, which browsers treat as a binary stream.
Progress Display
When the file is very large, correctly setting Content-Length allows the browser to show a progress bar. The following example simulates progress by sending the file in four chunks with a 3‑second interval.
router.get('/download/progress', async (ctx) => {
const { enable } = ctx.query;
const buffer = await fsp.readFile(`${__dirname}/1.txt`);
const stream = new PassThrough();
const l = buffer.length;
const count = 4;
const size = Math.floor(l / count);
const writeQuarter = (i = 0) => {
const start = i * size;
const end = i === count - 1 ? l : (i + 1) * size;
stream.write(buffer.slice(start, end));
if (end === l) {
stream.end();
} else {
setTimeout(() => writeQuarter(i + 1), 3000);
}
};
if (!!enable) {
ctx.set({
'Content-Length': `${l}`,
});
}
ctx.set({
'Content-Type': 'plain/txt',
'Content-Disposition': `attachment; filename=1.txt`,
Connection: 'keep-alive',
});
ctx.body = stream;
writeQuarter();
});This uses a PassThrough stream instead of fs.createReadStream, so Koa does not know the file size or type, and the file is divided into four parts, each sent after a 3‑second pause to simulate a large download.
When the enable query parameter is true, Content-Length is set, allowing the browser to display remaining time; otherwise the progress bar is not shown.
Resumable Download
Large files may fail to download due to unstable networks. Supporting HTTP range requests enables resumable downloads.
function getStartPos(range = '') {
var startPos = 0;
if (typeof range === 'string') {
var matches = /^bytes=([0-9]+)-$/.exec(range);
if (matches) {
startPos = Number(matches[1]);
}
}
return startPos;
}
router.get('/download/partial', async (ctx) => {
const range = ctx.get('range');
const start = getStartPos(range);
const stat = await fsp.stat(`${__dirname}/1.txt`);
const stream = fs.createReadStream(`${__dirname}/1.txt`, {
start,
highWaterMark: Math.ceil((stat.size - start) / 4),
});
stream.on('data', (chunk) => {
console.log(`Readed ${chunk.length} bytes of data.`);
stream.pause();
setTimeout(() => {
stream.resume();
}, 3000);
});
console.log(`Start Pos: ${start}.`);
if (start === 0) {
ctx.status = 200;
ctx.set({
'Accept-Ranges': 'bytes',
'Content-Length': `${stat.size}`,
});
} else {
ctx.status = 206;
ctx.set({
'Content-Range': `bytes ${start}-${stat.size - 1}/${stat.size}`,
});
}
ctx.set({
'Content-Type': 'application/octet-stream',
'Content-Disposition': `attachment; filename=1.txt`,
Connection: 'keep-alive',
});
ctx.body = stream;
});curl -v http://127.0.0.1/download/partial -o 1.txt
If the server is stopped during transfer, the partially downloaded file may contain only a few bytes.
Resuming the download with the -C - flag continues from the last byte.
curl -v http://127.0.0.1/download/partial -o 1.txt -C -
Dynamic Table Export
After covering file download basics, we look at a real problem: reading all records of a database table based on request parameters and exporting them as a spreadsheet.
Bottleneck
// Controller.js
const sequelize = new Sequelize(name, user, password, {
dialect: 'mysql',
host,
port,
});
const model = sequelize.import('/path/to/model');
const { rows } = await model.findAndCountAll({
where: conditions,
attributes: ['f_user_id'],
group: 'f_user_id',
});
const list = await Promise.all(
rows.map((item) => {
const { f_user_id } = item;
const userRows = await model.findAll({
where: { ...conditions, f_user_id },
// ordering, eager loading, ...
});
// formating userRows -> userData
return userData;
})
);
const headers = ['ID', /* ... */];
const sheetData = [headers, ...list];
ctx.attachment(`${sheetName}.xlsx`);
ctx.body = await exportXlsx(sheetName, sheetData); const ExcelJS = require('exceljs');
const fs = require('fs');
module.exports = {
exportXlsx: async (name = 'sheet', data) => {
const tempFilePath = `./xlsx/${Date.now()}.xlsx`;
const workbook = new ExcelJS.stream.xlsx.WorkbookWriter({ filename: tempFilePath });
const sheet = workbook.addWorksheet('My Sheet');
const { length } = data;
for (let i = 0; i < length; i++) {
sheet.addRow(data[i]);
}
sheet.commit();
await workbook.commit();
return fs.createReadStream(tempFilePath);
},
};Typical early‑stage implementations ignore data volume, leading to memory exhaustion when the record count exceeds 20 000, or they use exceljs without its streaming API, and they issue unoptimized ORM queries.
Without considering data size, memory runs out above ~20 k rows.
Using exceljs without its stream API.
Query logic does not consider SQL concurrency.
Optimization
Chunked Processing
The simplest strategy is to split millions of rows into batches of 10 k and process them sequentially or in parallel.
let total = await model.count(/* ... */);
let page = 0;
const tasks = [];
const size = 10000;
while (total > 0) {
tasks.push(() => queryModel({
limit: size,
offset: size * page,
}));
page++;
total -= size;
}
await async.series(tasks);Reduce SQL Query Count
Instead of first grouping by f_user_id and then querying each user separately, use a single IN clause.
model.findAll({
where: {
...conditions,
f_user_id: rows.map(x => `${x.f_user_id}`),
},
});Streaming Processing
The exceljs library provides a streaming writer that can output the workbook directly as a response stream.
const workbook = new Excel.stream.xlsx.WorkbookWriter(options);
const sheet = workbook.addWorksheet('My Sheet');
// ...
ctx.body = workbook.stream;More
Other techniques such as offline generation, caching, and zip‑stream handling of the XLSX format can further improve performance.
Conclusion
File export is a common requirement; implementing it correctly reflects professional competence.
Details such as the ZIP‑stream handling inside ExcelJS are beyond the scope of this article and can be explored separately.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Tencent IMWeb Frontend Team
IMWeb Frontend Community gathering frontend development enthusiasts. Follow us for refined live courses by top experts, cutting‑edge technical posts, and to sharpen your frontend skills.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
