
Large File Chunking and Web Worker Optimization in JavaScript

This article demonstrates how to split large files into 5 MB chunks, compute MD5 hashes, and accelerate processing with Web Workers by dynamically allocating threads based on the browser's hardware concurrency, achieving up to ten‑fold speed improvements.

Rare Earth Juejin Tech Community

Hello everyone! This is a tutorial on large-file chunking combined with Web Workers. While writing it I also learned how to obtain the number of CPU threads from JavaScript, so let's dive straight in.

1. Initialization: Setting Up the Scaffold

The demo starts from a plain HTML page titled 大文件分片 ("Large File Chunking"). Only the DOCTYPE and title survived extraction here, so the markup below is a minimal stand-in; the file input and the module entry script are assumptions:

```html
<!DOCTYPE html>
<title>大文件分片</title>
<input type="file" id="file" />
<script type="module" src="./main.js"></script>
```

2. Reading a Single File Chunk and Encrypting It

We use SparkMD5 for hashing; you can download and import it yourself.

```js
import { createChunks } from './createChunks';

// Size of each slice: 5 MB
const CHUNK_SIZE = 1024 * 1024 * 5;

export async function cutFile(file) {
  // Producing a slice means reading and hashing it, which is
  // time-consuming, so createChunks is asynchronous
  const chunk = await createChunks(file, 1, CHUNK_SIZE);
  // Once the slice is ready we can inspect its information
  console.log(chunk);
}
```

createChunks builds one slice and resolves with its start, end, index, and MD5 hash.

3. Splitting the Entire File into Chunks

After calculating the total number of slices, we loop over them and store each result in an array:

```js
export async function cutFile(file) {
  const result = [];
  // Total number of slices
  const chunks = Math.ceil(file.size / CHUNK_SIZE);
  // Generate each slice in turn
  for (let i = 0; i < chunks; i++) {
    const chunk = await createChunks(file, i, CHUNK_SIZE);
    result.push(chunk);
  }
  return result;
}
```

In a test run, a 500 MB file was divided into 103 slices, taking about 2.3 seconds.

4. Analyzing Optimization Opportunities

For multi-gigabyte uploads, MD5 hashing becomes the bottleneck and blocks the thread for a long time. The key optimization is to keep that work off the main thread, which is exactly what Web Workers are for: according to MDN, a Web Worker runs a script in a background thread separate from the main UI thread, so heavy computation does not freeze the interface.

5. Optimizing with Web Workers

1. Setting Up Workers

First, define the number of worker threads and create them.
```js
// Number of worker threads
const THREAD_COUNT = 4; // 4 workers for now

export function cutFile(file) {
  return new Promise((resolve) => {
    const result = [];
    const chunks = Math.ceil(file.size / CHUNK_SIZE);
    // Number of slices each worker is responsible for
    const workerChunkCount = Math.ceil(chunks / THREAD_COUNT);
    for (let i = 0; i < THREAD_COUNT; i++) {
      const worker = new Worker('./worker.js', { type: 'module' });
      const startIndex = i * workerChunkCount;
      let endIndex = startIndex + workerChunkCount;
      // The last worker must not run past the final slice
      if (endIndex > chunks) endIndex = chunks;
      worker.postMessage({ file, CHUNK_SIZE, startIndex, endIndex });
      worker.onmessage = (e) => {
        // Handled in step 3 below
      };
    }
  });
}
```

2. Calculating the Number of Slices per Worker

Each worker processes Math.ceil(totalChunks / THREAD_COUNT) slices. The start and end indices follow from the loop index, and the end index is clamped to the total so the last worker does not overrun.

3. Receiving Messages from Workers

Declare a counter outside the loop, and in each worker's onmessage handler copy its slices back into their global positions. Once every worker has reported, resolve the promise:

```js
let finishCount = 0; // number of workers that have completed

worker.onmessage = (e) => {
  // Copy this worker's slices into their global positions
  for (let i = startIndex; i < endIndex; i++) {
    result[i] = e.data[i - startIndex];
  }
  worker.terminate();
  finishCount++;
  if (finishCount === THREAD_COUNT) {
    resolve(result);
  }
};
```

4. Worker Script (worker.js)

The worker receives the file, chunk size, and slice range, then creates all of its slices concurrently with Promise.all:

```js
import { createChunks } from './createChunks.js';

onmessage = async (e) => {
  const { file, CHUNK_SIZE, startIndex, endIndex } = e.data;
  const promises = [];
  for (let i = startIndex; i < endIndex; i++) {
    promises.push(createChunks(file, i, CHUNK_SIZE));
  }
  // Wait for every slice in this worker's range, then report back
  const chunks = await Promise.all(promises);
  postMessage(chunks);
};
```

Running the optimized version cuts the processing time from over 2 seconds to about 0.2 seconds, roughly a ten-fold speedup.

6. Obtaining the Number of CPU Threads in JavaScript

Rather than hard-coding THREAD_COUNT, query navigator.hardwareConcurrency for the number of logical processors, falling back to 4 where it is unavailable:

```js
// Number of logical CPU threads, with a fallback of 4
const THREAD_COUNT = navigator.hardwareConcurrency || 4;
console.log('CPU threads:', navigator.hardwareConcurrency);
```

Using the actual hardware concurrency roughly halves the processing time again.

Source code repository: https://gitee.com/tcwty123/large-file-sharding
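Putting the pieces together, the main-thread side can be sketched as below. This is a hedged sketch rather than the repository's exact code: worker.js and createChunks are assumed to behave as described above, and the slice-range arithmetic is pulled out into a small sliceRange helper (a name introduced here for illustration) so it can be checked on its own.

```javascript
// 5 MB per slice, as in the article
const CHUNK_SIZE = 1024 * 1024 * 5;

// Use the real logical-processor count where available; fall back to 4.
const THREAD_COUNT =
  (typeof navigator !== 'undefined' && navigator.hardwareConcurrency) || 4;

// Compute the [startIndex, endIndex) range of slices handled by worker i.
function sliceRange(totalChunks, threadCount, i) {
  const perWorker = Math.ceil(totalChunks / threadCount);
  const startIndex = i * perWorker;
  const endIndex = Math.min(startIndex + perWorker, totalChunks);
  return { startIndex, endIndex };
}

// Split `file` across THREAD_COUNT workers; resolve with all slices in order.
function cutFile(file) {
  return new Promise((resolve) => {
    const result = [];
    const chunks = Math.ceil(file.size / CHUNK_SIZE);
    let finishCount = 0;
    for (let i = 0; i < THREAD_COUNT; i++) {
      const { startIndex, endIndex } = sliceRange(chunks, THREAD_COUNT, i);
      const worker = new Worker('./worker.js', { type: 'module' });
      worker.postMessage({ file, CHUNK_SIZE, startIndex, endIndex });
      worker.onmessage = (e) => {
        // Place this worker's slices at their global indices.
        for (let j = startIndex; j < endIndex; j++) {
          result[j] = e.data[j - startIndex];
        }
        worker.terminate();
        if (++finishCount === THREAD_COUNT) resolve(result);
      };
    }
  });
}
```

For the article's example of 103 slices across 4 workers, sliceRange assigns each of the first three workers 26 slices and the last worker 25 (indices 78 through 102), matching the clamped end index in the original loop.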

Tags: Performance Optimization · JavaScript · Frontend Development · Web Workers · File Chunking
Written by

Rare Earth Juejin Tech Community

Juejin, a tech community that helps developers grow.
