Why Your Spring Boot File Upload Fails: Async Reading, Temp Files, and Encoding Gotchas
This article explains common pitfalls when handling file uploads in Spring Boot, including temporary file lifecycle causing FileNotFound errors in asynchronous processing, and character encoding mismatches that lead to garbled text, and provides practical solutions such as size limits, main‑thread parsing, and BOM‑based charset detection.
When working on a project I needed to parse a txt file and store its data, so I quickly wrote code using Spring Boot + Tomcat with MultipartFile to receive the uploaded file.
1. Asynchronous Reading
During parsing I encountered a File Not Found error even though the source code read the file line by line correctly. The issue is not the reading method but the lifecycle of the temporary file created by Spring Boot + Tomcat:
After the request ends, Tomcat immediately deletes the temporary file.
If the file is read in an asynchronous thread after the request has finished, the file no longer exists.
Because I moved the parsing to an async thread to speed up the request, the main thread finished, the temporary file was cleaned up, and the async thread could not find it.
The simple fix is to limit the upload size and complete the file parsing in the request thread, before the request ends and the temporary file is removed.
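If the parsing really has to run in the background, one alternative is to copy the upload into memory while the request is still active and hand only that copy to the async thread. The following is a minimal sketch under stated assumptions, not the author's code: UploadSnapshot and parseLater are hypothetical names, the InputStream stands in for file.getInputStream(), the content is assumed to be UTF-8, and readAllBytes requires Java 9 or later.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Spring: copies the upload into memory on the
// calling (request) thread so later work no longer depends on the temporary file.
final class UploadSnapshot {

    // Call this while the request is still active, e.g. with file.getInputStream().
    static CompletableFuture<List<String>> parseLater(InputStream upload) {
        final byte[] content;
        try (InputStream in = upload) {
            content = in.readAllBytes(); // Java 9+; runs on the calling thread
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Only the in-memory copy crosses the thread boundary.
        return CompletableFuture.supplyAsync(() -> parse(content));
    }

    // Splits the copied bytes into non-blank lines (UTF-8 is assumed here).
    private static List<String> parse(byte[] content) {
        BufferedReader reader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(content), StandardCharsets.UTF_8));
        return reader.lines()
                .filter(line -> !line.trim().isEmpty())
                .collect(Collectors.toList());
    }
}
```

Because the whole file is held in memory, this only makes sense together with an upload size limit, which in Spring Boot is usually configured through the spring.servlet.multipart.max-file-size and spring.servlet.multipart.max-request-size properties.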
2. File Encoding Issues
Even after fixing the async issue, the parsed content appeared as garbled text. This happens when a UTF‑16LE encoded file is decoded as UTF‑8: the BOM bytes are not valid UTF‑8 and show up as "�" replacement characters, and the 0x00 byte that accompanies every ASCII character becomes a stray NUL (\0) character.
On macOS and CentOS, text files are typically saved as UTF‑8 without a BOM, while on Windows they are often saved as UTF‑16LE or UTF‑8 with a BOM. Therefore, the file must be read with the same charset it was saved in, specified explicitly rather than left to the platform default.
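The mismatch is easy to reproduce in isolation. The sketch below is illustrative only: CharsetMismatchDemo is a hypothetical class, and the byte array is a hand-built stand-in for a file containing "hi" saved as UTF‑16LE with a BOM.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing what a charset mismatch does to the text.
final class CharsetMismatchDemo {

    // "hi" in UTF-16LE with a BOM: FF FE, then two bytes per character.
    static final byte[] UTF16LE_WITH_BOM = {
            (byte) 0xFF, (byte) 0xFE, 0x68, 0x00, 0x69, 0x00
    };

    // Wrong charset: the BOM bytes are invalid UTF-8 and the zero bytes decode to NUL.
    static String decodeAsUtf8() {
        return new String(UTF16LE_WITH_BOM, StandardCharsets.UTF_8);
    }

    // Matching charset: "UTF-16" uses the BOM to pick the byte order and drops it.
    static String decodeAsUtf16() {
        return new String(UTF16LE_WITH_BOM, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        String wrong = decodeAsUtf8();
        System.out.println(wrong.contains("\uFFFD"));      // replacement characters present
        System.out.println(wrong.indexOf('\u0000') >= 0);  // NUL characters present
        System.out.println(decodeAsUtf16().equals("hi"));  // clean text with the right charset
    }
}
```

Note that StandardCharsets.UTF_16LE, unlike UTF_16, does not consume the BOM, so code that selects UTF_16LE explicitly has to skip the two BOM bytes itself.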
The file’s encoding can be detected from its BOM (Byte Order Mark) at the beginning of the file. Common BOM values are:
UTF‑8: EF BB BF
UTF‑16LE: FF FE
UTF‑16BE: FE FF
Windows may add a BOM to UTF‑8 files, so detecting it is essential.
Solution: read the first few bytes (2‑3) of the file, determine the charset from the BOM, and then read the file with the correct charset.
Below is an example that reads the BOM, selects the appropriate Charset, and then parses the file. It has to run while the request is still active, so that the temporary file is still available.
// Needs java.io.*, java.nio.charset.Charset, java.nio.charset.StandardCharsets,
// java.util.List and java.util.stream.Collectors; StringUtils is Apache Commons Lang 3.
// ... file is the MultipartFile; run this before the request completes
List<String> lines = null;
try (InputStream in = new BufferedInputStream(file.getInputStream())) {
    // Peek at the first three bytes without consuming them
    in.mark(3);
    byte[] bom = new byte[3];
    int read = in.read(bom, 0, bom.length);
    in.reset();
    // Detect the BOM and remember how many bytes it occupies
    Charset charset = StandardCharsets.UTF_8; // default when there is no BOM (could be GBK for legacy files)
    int bomLength = 0;
    if (read >= 3 && bom[0] == (byte) 0xEF && bom[1] == (byte) 0xBB && bom[2] == (byte) 0xBF) {
        bomLength = 3;
    } else if (read >= 2 && bom[0] == (byte) 0xFF && bom[1] == (byte) 0xFE) {
        charset = StandardCharsets.UTF_16LE;
        bomLength = 2;
    } else if (read >= 2 && bom[0] == (byte) 0xFE && bom[1] == (byte) 0xFF) {
        charset = StandardCharsets.UTF_16BE;
        bomLength = 2;
    }
    // Skip the BOM; otherwise it ends up as U+FEFF at the start of the first line
    for (int i = 0; i < bomLength; i++) {
        in.read();
    }
    log.info("file resolve charset:{}", charset);
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
        lines = reader.lines()
                .filter(StringUtils::isNotBlank)
                .collect(Collectors.toList());
    }
} catch (Exception e) {
    log.error("resolve error", e);
}
By reading the file in the request thread and detecting its encoding from the BOM, uploads are parsed reliably, which avoids both the FileNotFound and the garbled‑text problems.
Lin is Dream
Sharing Java developer knowledge, practical articles, and continuous insights into computer engineering.
