Large File Upload with Chunking, Resume, and RandomAccessFile in Java
This article explains how to handle multi‑gigabyte video uploads by splitting files into chunks, using MD5 for identification, implementing resumable and instant uploads with Spring Boot and Redis, and leveraging Java's RandomAccessFile and memory‑mapped I/O for efficient merging.
Appetizer
Hello everyone, today I'm sharing another hands-on article; I hope you find it useful.
Recently I received a new requirement: uploading a ~2 GB video file. Uploading it to OSS in the test environment took more than ten minutes, and because of company resource constraints I abandoned that approach.
When it comes to large‑file upload, the first thing that comes to mind is the various cloud storage services, to which users often upload their "small movies". These services usually support chunked upload, resumable upload, and instant upload, which reduce the impact of network fluctuations and bandwidth limits and greatly improve the user experience.
Here are a few key concepts:
File chunking : split a large file into smaller pieces, upload/download each piece, then reassemble them into the original file.
Resumable upload : upload each chunk in a separate thread; if a network failure occurs, the upload can continue from the already uploaded part without starting over.
Instant upload : if the file already exists on the server, the system returns the file URI directly.
RandomAccessFile
Normally we use FileInputStream, FileOutputStream, FileReader, FileWriter, and other I/O streams to read and write files. This article introduces RandomAccessFile.
RandomAccessFile directly extends Object and implements the DataInput and DataOutput interfaces. It supports random reading and writing of files, similar to accessing a large byte array in a file system.
The class is based on a "file pointer" (a cursor or index). The pointer can be read with getFilePointer() and set with seek() .
When reading, bytes are taken starting at the current file pointer; when writing beyond the current end, the file itself is extended. RandomAccessFile offers four access modes:
r : read‑only; write operations throw IOException .
rw : read/write; creates the file if it does not exist.
rws : read/write; forces every update of file content or metadata to be written to the storage device.
rwd : read/write; forces only file content updates to be written to the storage device.
In rw mode the data is first written to a buffer; it is flushed to the file only when the buffer is full or when RandomAccessFile.close() is called.
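As a minimal sketch of the pointer-based API described above (the file name and byte values are arbitrary examples, not from the article's source):

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class RandomAccessDemo {
    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("raf-demo", ".bin");
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.write(new byte[]{10, 20, 30, 40}); // pointer is now at offset 4
            raf.seek(2);                           // move the pointer back
            System.out.println(raf.read());        // byte at offset 2 -> 30
            raf.seek(raf.length());                // jump to the end of the file
            raf.write(50);                         // extend the file by one byte
            System.out.println(raf.length());      // -> 5
        }
        file.delete();
    }
}
```

Seeking past the current end and then writing is exactly how the merge code later in the article places each chunk at its own offset.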
API
1. void seek(long pos) : set the file pointer offset for the next read/write operation.
Setting the offset beyond the current file length allows the file to be extended on subsequent writes.
2. native long getFilePointer() : return the current cursor position.
3. native long length() : return the current file length.
4. Read methods (illustrated with images in the original article).
5. Write methods (illustrated with images in the original article).
6. readFully(byte[] b) : fill the provided buffer completely; blocks until the buffer has been filled, and throws EOFException if end of file is reached first.
7. FileChannel getChannel() : obtain the unique FileChannel associated with this file.
8. int skipBytes(int n) : skip over n bytes of input.
Most of RandomAccessFile 's functionality has been superseded by JDK 1.4 NIO's memory‑mapped file support, which maps a file into memory and avoids frequent disk I/O.
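A minimal sketch of that memory-mapped approach, writing through a MappedByteBuffer obtained from the file's channel and reading the bytes back (the file name and contents are arbitrary examples):

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

public class MappedWriteDemo {
    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("mmap-demo", ".bin");
        byte[] data = "hello".getBytes(StandardCharsets.US_ASCII);
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel channel = raf.getChannel()) {
            // map the region [0, data.length) into memory and write through it
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, data.length);
            buffer.put(data);
            buffer.force(); // flush the mapped region to the storage device
        }
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            byte[] back = new byte[(int) raf.length()];
            raf.readFully(back);
            System.out.println(new String(back, StandardCharsets.US_ASCII)); // hello
        }
        file.delete();
    }
}
```

This is the same map-then-put pattern the chunk-merging code later in the article uses, just with a nonzero offset per chunk.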
Main Course
Previously I pasted too much source code, which hurt readability. From now on I will only show key code snippets; the full source can be obtained from the "backend".
File Chunking
Chunking is performed on the front end, using powerful JavaScript libraries or ready‑made components. The chunk size and number are determined, and each chunk receives an index.
To avoid mixing chunks of different files, the file's MD5 value is used for identification and for checking whether the file already exists on the server.
If the file exists, return its URL directly.
If the file does not exist but some chunks have been uploaded, return the list of missing chunk indexes.
If the file does not exist and no chunks have been uploaded, all chunks need to be uploaded.
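The article computes the MD5 in the browser with a JavaScript md5 library; a server could verify the same hash with a streaming sketch like this (the class and method names are illustrative, not from the article's source):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5Util {
    // Stream the file through MessageDigest so multi-gigabyte files
    // are hashed without being loaded fully into memory.
    public static String md5Hex(Path path) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(path)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

Note that MD5 is used here only as a content fingerprint for deduplication, not for security.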
fileReaderInstance.readAsBinaryString(file);
fileReaderInstance.addEventListener("load", (e) => {
  let fileBlob = e.target.result;
  fileMD5 = md5(fileBlob);
  const formData = new FormData();
  formData.append("md5", fileMD5);
  axios
    .post(http + "/fileUpload/checkFileMd5", formData)
    .then((res) => {
      if (res.data.message == "文件已存在") {
        // "file already exists": use its URL directly
        success && success(res);
      } else {
        // file not fully uploaded; res.data.data is null (nothing uploaded yet)
        // or the list of missing chunk indexes
        if (res.data.data) {
          // some chunks were already uploaded: resume with only the missing ones
          chunkArr = res.data.data;
        }
        readChunkMD5();
      }
    })
    .catch((e) => {});
});
Before calling the upload API, the slice method is used to extract the chunk corresponding to the current index.
const getChunkInfo = (file, currentChunk, chunkSize) => {
// get the file segment for the given index
let start = currentChunk * chunkSize;
let end = Math.min(file.size, start + chunkSize);
// extract the chunk
let chunk = file.slice(start, end);
return { start, end, chunk };
};
After obtaining the chunk, the upload API is called to complete the upload.
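The same boundary arithmetic can be mirrored on the Java side (a sketch; the class name, example file size, and chunk size are illustrative assumptions):

```java
public class ChunkMath {
    // Byte range [start, end) covered by the chunk at the given index,
    // mirroring the frontend getChunkInfo helper.
    public static long[] chunkRange(long fileSize, long chunkSize, long index) {
        long start = index * chunkSize;
        long end = Math.min(fileSize, start + chunkSize);
        return new long[]{start, end};
    }

    // Ceiling division: the last chunk may be shorter than chunkSize.
    public static long chunkCount(long fileSize, long chunkSize) {
        return (fileSize + chunkSize - 1) / chunkSize;
    }

    public static void main(String[] args) {
        long fileSize = 10_500_000L;       // an example ~10 MB file
        long chunkSize = 5L * 1024 * 1024; // 5 MB chunks
        System.out.println(chunkCount(fileSize, chunkSize)); // 3
        long[] last = chunkRange(fileSize, chunkSize, 2);
        System.out.println(last[1] - last[0]); // the final chunk is shorter: 14240
    }
}
```

Keeping the arithmetic in `long` avoids the int overflow that `chunkIndex * chunkSize` would hit once offsets pass 2 GB.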
Resumable Upload & Instant Upload
The backend is built with Spring Boot and uses Redis to store the upload status and file path.
If the file is fully uploaded, the API returns the file path; if partially uploaded, it returns the list of missing chunks; if never uploaded, it returns a prompt.
During chunk upload two files are created: the main file and a temporary (.conf) file. The temporary file acts like a byte array with one slot per chunk; when a chunk finishes uploading, its slot is set to 127 (Byte.MAX_VALUE).
Two values are used when checking MD5:
Upload status: true if the file has been fully uploaded, false if only partially uploaded.
File address: the final file path if upload is complete, otherwise the temporary file path.
/**
* Check file MD5
*/
public Result checkFileMd5(String md5) throws IOException {
// file upload status: exists if the file has ever been uploaded
Object processingObj = stringRedisTemplate.opsForHash().get(UploadConstants.FILE_UPLOAD_STATUS, md5);
if (processingObj == null) {
return Result.ok("The file has never been uploaded");
}
boolean processing = Boolean.parseBoolean(processingObj.toString());
// file path (full file or temporary file)
String value = stringRedisTemplate.opsForValue().get(UploadConstants.FILE_MD5_KEY + md5);
if (processing) {
return Result.ok(value, "文件已存在"); // "file already exists": this message text is matched by the frontend
} else {
File confFile = new File(value);
byte[] completeList = FileUtils.readFileToByteArray(confFile);
List<Integer> missChunkList = new LinkedList<>();
for (int i = 0; i < completeList.length; i++) {
if (completeList[i] != Byte.MAX_VALUE) {
// this chunk has not been uploaded yet: record its index
missChunkList.add(i);
}
}
return Result.ok(missChunkList, "The file is partially uploaded");
}
}
After all chunks are uploaded, the file needs to be merged.
Chunk Merging
Chunks sharing the same MD5 are merged by writing each chunk at its correct offset in the target file, much like inserting elements into an array. The merge is performed using memory-mapped I/O.
// read and write are both allowed
RandomAccessFile tempRaf = new RandomAccessFile(tmpFile, "rw");
// obtain the unique FileChannel
FileChannel fileChannel = tempRaf.getChannel();
// calculate offset: chunk size * chunk index
long offset = CHUNK_SIZE * multipartFileDTO.getChunk();
byte[] fileData = multipartFileDTO.getFile().getBytes();
// map the region directly to memory
MappedByteBuffer mappedByteBuffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, offset, fileData.length);
mappedByteBuffer.put(fileData);
// release
FileMD5Util.freedMappedByteBuffer(mappedByteBuffer);
fileChannel.close();
After each chunk upload, the progress is checked to see whether the whole file is complete.
RandomAccessFile accessConfFile = new RandomAccessFile(confFile, "rw");
// ensure the conf file holds one status byte per chunk
accessConfFile.setLength(multipartFileDTO.getChunks());
// mark the current chunk as completed
accessConfFile.seek(multipartFileDTO.getChunk());
accessConfFile.write(Byte.MAX_VALUE);
// verify if all chunks are done
byte[] completeList = FileUtils.readFileToByteArray(confFile);
byte isComplete = Byte.MAX_VALUE;
for (int i = 0; i < completeList.length && isComplete == Byte.MAX_VALUE; i++) {
// bitwise AND; if any byte is not MAX_VALUE, the file is not complete
isComplete = (byte) (isComplete & completeList[i]);
}
accessConfFile.close();
Finally, the upload progress is updated in Redis.
// update Redis status: true means the whole large file has been uploaded
if (isComplete == Byte.MAX_VALUE) {
stringRedisTemplate.opsForHash().put(UploadConstants.FILE_UPLOAD_STATUS, multipartFileDTO.getMd5(), "true");
stringRedisTemplate.opsForValue().set(UploadConstants.FILE_MD5_KEY + multipartFileDTO.getMd5(), uploadDirPath + "/" + fileName);
} else {
if (!stringRedisTemplate.opsForHash().hasKey(UploadConstants.FILE_UPLOAD_STATUS, multipartFileDTO.getMd5())) {
stringRedisTemplate.opsForHash().put(UploadConstants.FILE_UPLOAD_STATUS, multipartFileDTO.getMd5(), "false");
}
if (!stringRedisTemplate.hasKey(UploadConstants.FILE_MD5_KEY + multipartFileDTO.getMd5())) {
stringRedisTemplate.opsForValue().set(UploadConstants.FILE_MD5_KEY + multipartFileDTO.getMd5(), uploadDirPath + "/" + fileName + ".conf");
}
}
Reply with "break" to get the full source code!
Hope you find this practical article helpful.
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn