Mastering Object Storage: From Simple OSS Uploads to Advanced Multipart Strategies
This guide walks backend engineers through the fundamentals of object storage, showing how to perform quick OSS uploads, implement multipart uploads for large files, prevent duplicate uploads with MD5 checks, and secure files against malicious content using header validation, content scanning, and bucket isolation.
Opening: Why backend engineers can’t avoid object storage
A product manager asked, “User wants to upload a 10 GB design file—store it in the database?” The answer is object storage, the “cloud mansion” for files: more durable than local storage, larger than a database, and comes with CDN acceleration.
Object storage: not a folder cabinet
Key point: OSS/S3 is not a local folder. It stores files as objects (data + metadata) in a distributed cluster, similar to a smart parcel locker.
Advantages: high durability (multiple replicas), petabyte‑scale capacity, pay‑as‑you‑go pricing.
Typical scenarios: image/video storage, log backup, user file uploads (a daily backend need).
Practical 1: Alibaba Cloud OSS simple upload (5‑minute starter)
Skip the official docs; here is the essential Java code.
1. Set up environment (Maven dependency)
<dependency>
<groupId>com.aliyun.oss</groupId>
<artifactId>aliyun-sdk-oss</artifactId>
<version>3.15.1</version>
</dependency>
2. Core upload code (as easy as sending a parcel)
public class OssSimpleUpload {
// Do NOT hard‑code keys! Put them in config.
private static final String ENDPOINT = "oss-cn-beijing.aliyuncs.com";
private static final String ACCESS_KEY = "yourAK";
private static final String SECRET_KEY = "yourSK";
private static final String BUCKET_NAME = "yourBucket";
public static void uploadFile(File file) {
    OSS ossClient = new OSSClientBuilder().build(ENDPOINT, ACCESS_KEY, SECRET_KEY);
    String objectKey = "user-uploads/" + file.getName();
    // try-with-resources closes the stream even if the upload fails
    try (FileInputStream input = new FileInputStream(file)) {
        ossClient.putObject(BUCKET_NAME, objectKey, input);
        // Presigned URL valid for one hour
        Date expiration = new Date(System.currentTimeMillis() + 3600 * 1000);
        String url = ossClient.generatePresignedUrl(BUCKET_NAME, objectKey, expiration).toString();
        System.out.println("File uploaded successfully! Link: " + url);
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        // Always shut down the client to avoid connection leaks
        ossClient.shutdown();
    }
}
}
Pitfalls
Do not set bucket permission to public read; otherwise anyone can download your files.
Use RAM sub‑account keys with the minimal upload permission.
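To keep the RAM sub-account keys out of source code entirely, one option is to read them from environment variables at startup. A minimal sketch, assuming variable names like OSS_ACCESS_KEY_ID that you would align with your own deployment config:

```java
public class OssCredentials {
    // Fail fast with a clear message rather than building a client with null keys
    static String resolve(String envValue, String description) {
        if (envValue == null || envValue.isEmpty()) {
            throw new IllegalStateException("Missing credential: " + description);
        }
        return envValue;
    }

    public static String accessKey() {
        return resolve(System.getenv("OSS_ACCESS_KEY_ID"), "OSS_ACCESS_KEY_ID");
    }

    public static String secretKey() {
        return resolve(System.getenv("OSS_ACCESS_KEY_SECRET"), "OSS_ACCESS_KEY_SECRET");
    }
}
```

The same idea works with a config server or a secrets manager; the point is that rotating a leaked key then requires no code change.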
Large file upload: direct upload is not feasible
Uploading a 2 GB video with the simple code fails when the network blips—uploads break and must restart, causing timeouts. The solution is multipart upload.
Multipart upload: three steps to move big files
Example: a 100 MB file split into ten 10 MB chunks.
Step 1: Chunking (frontend slices, backend rests)
Frontend uses File.slice():
// 10 MB per chunk
const chunkSize = 10 * 1024 * 1024;
const chunks = Math.ceil(file.size / chunkSize);
// First chunk: file.slice(0, chunkSize)
Step 2: Upload each chunk (with identifiers)
Each chunk upload must carry three identifiers:
fileMd5: hash of the whole file (for deduplication).
chunkIndex: sequence number of the chunk (0, 1, 2 …).
totalChunks: total number of chunks.
The backend stores them temporarily, e.g., /tmp/oss-chunks/${fileMd5}/${chunkIndex}.
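A minimal sketch of that temporary staging step using java.nio, matching the /tmp/oss-chunks/${fileMd5}/${chunkIndex} layout above (the HTTP endpoint and validation around it are omitted):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ChunkStore {
    private final Path baseDir;

    public ChunkStore(Path baseDir) {
        this.baseDir = baseDir; // e.g. /tmp/oss-chunks in the article's layout
    }

    // Persist one chunk as <baseDir>/<fileMd5>/<chunkIndex>
    public Path saveChunk(String fileMd5, int chunkIndex, byte[] data) throws IOException {
        Path dir = baseDir.resolve(fileMd5);
        Files.createDirectories(dir);
        return Files.write(dir.resolve(Integer.toString(chunkIndex)), data);
    }

    // Only when every chunk file exists can the merge step start
    public boolean isComplete(String fileMd5, int totalChunks) {
        Path dir = baseDir.resolve(fileMd5);
        for (int i = 0; i < totalChunks; i++) {
            if (!Files.exists(dir.resolve(Integer.toString(i)))) {
                return false;
            }
        }
        return true;
    }
}
```

Checking isComplete after each chunk arrives also gives you resumable uploads for free: the frontend can ask which indices are missing and retransmit only those.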
Step 3: Merge chunks (OSS does the work)
public void mergeChunks(String fileMd5, String fileName, int totalChunks) throws Exception {
    OSS ossClient = new OSSClientBuilder().build(ENDPOINT, ACCESS_KEY, SECRET_KEY);
    try {
        String objectKey = "user-uploads/" + fileName;
        // Step 1: initiate the multipart upload and get an uploadId
        InitiateMultipartUploadResult initResult = ossClient.initiateMultipartUpload(
                new InitiateMultipartUploadRequest(BUCKET_NAME, objectKey));
        String uploadId = initResult.getUploadId();
        List<PartETag> partETags = new ArrayList<>();
        // Step 2: upload each staged chunk as a part (part numbers start at 1)
        for (int i = 0; i < totalChunks; i++) {
            try (FileInputStream input = new FileInputStream("/tmp/oss-chunks/" + fileMd5 + "/" + i)) {
                UploadPartResult uploadResult = ossClient.uploadPart(new UploadPartRequest()
                        .withBucketName(BUCKET_NAME)
                        .withKey(objectKey)
                        .withUploadId(uploadId)
                        .withPartNumber(i + 1)
                        .withInputStream(input));
                partETags.add(uploadResult.getPartETag());
            }
        }
        // Step 3: complete the upload; OSS assembles the parts server-side
        ossClient.completeMultipartUpload(
                new CompleteMultipartUploadRequest(BUCKET_NAME, objectKey, uploadId, partETags));
    } finally {
        ossClient.shutdown();
    }
    // Clean up the staged chunks (commons-io FileUtils)
    FileUtils.deleteDirectory(new File("/tmp/oss-chunks/" + fileMd5));
}
Soul Question 1: How to avoid duplicate uploads?
When a user accidentally uploads the same file multiple times, use the file’s MD5 as an “identity card”.
Process
Frontend computes the whole‑file MD5 (large files can compute per chunk and combine).
Before uploading, call a backend endpoint like /check-file?md5=xxx.
Backend looks up a DB mapping MD5 → stored path.
If found, return the existing link and skip upload.
If not, allow the multipart upload.
Note: for simple uploads OSS’s ETag equals the file’s MD5, but for multipart uploads the ETag is derived from the parts and no longer matches the whole-file MD5, so storing your own MD5 is more reliable.
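Server-side, the whole-file MD5 that keys the lookup table can be computed with the JDK’s MessageDigest. A minimal sketch (the /check-file endpoint and the DB lookup around it are left out):

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class FileFingerprint {
    // Hex-encoded MD5 of the given bytes. For large files, feed the digest
    // incrementally with digest.update(buffer) instead of loading everything
    // into memory.
    public static String md5Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        StringBuilder sb = new StringBuilder();
        for (byte b : digest.digest(data)) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

If the frontend and backend ever disagree on an MD5, trust neither and re-upload; a mismatch usually means a truncated or corrupted transfer.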
Soul Question 2: How to prevent malicious files (trojans/viruses)?
Three‑layer defense:
1. Header validation (magic numbers)
public boolean checkFileHeader(File file) {
    byte[] header = new byte[8];
    try (FileInputStream fis = new FileInputStream(file)) {
        int read = fis.read(header);
        if (read < 4) {
            return false; // too short to carry a recognizable magic number
        }
        String headerHex = bytesToHex(header);
        // JPG: FFD8FF, PNG: 89504E47, GIF: 47494638
        return headerHex.startsWith("FFD8FF") || headerHex.startsWith("89504E47") || headerHex.startsWith("47494638");
    } catch (Exception e) {
        return false;
    }
}

private String bytesToHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
        sb.append(String.format("%02X", b));
    }
    return sb.toString();
}
2. Content scanning
Open‑source tool: ClamAV for virus scanning.
Cloud service: Alibaba Cloud OSS content security (detect porn, violence, malware).
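With ClamAV, the scan verdict comes back as the clamdscan exit code (0 clean, 1 infected, anything else an error). A hedged sketch of the decision logic; the ProcessBuilder call assumes clamdscan is installed on the host and a clamd daemon is running:

```java
import java.io.IOException;

public class VirusScan {
    public enum Verdict { CLEAN, INFECTED, ERROR }

    // Map clamdscan's documented exit codes to a verdict
    public static Verdict fromExitCode(int exitCode) {
        switch (exitCode) {
            case 0:  return Verdict.CLEAN;
            case 1:  return Verdict.INFECTED;
            default: return Verdict.ERROR;
        }
    }

    // Assumption: clamdscan is on the PATH and clamd is running
    public static Verdict scan(String path) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("clamdscan", "--no-summary", path).start();
        return fromExitCode(p.waitFor());
    }
}
```

Treat ERROR as a rejection too: a file that could not be scanned should not reach the production bucket.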
3. Permission isolation
Upload first to a temporary bucket; only after all checks pass, move the object to the production bucket.
Serve production files with Content-Disposition: attachment so browsers download them instead of rendering them, which prevents uploaded HTML or scripts from executing in your domain.
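The forced-download behavior comes down to one response header. A small helper to build it is sketched below; in OSS you would set this value on the object’s metadata (e.g. via ObjectMetadata.setContentDisposition — verify the call against your SDK version):

```java
public class DownloadHeaders {
    // Build a Content-Disposition that forces download instead of inline rendering.
    // Quotes and line breaks in the filename are replaced to keep the header well-formed.
    public static String attachment(String fileName) {
        String safe = fileName.replaceAll("[\"\\r\\n]", "_");
        return "attachment; filename=\"" + safe + "\"";
    }
}
```

Sanitizing the filename matters: an unescaped quote or newline in a user-supplied name could otherwise break out of the header value.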
Summary: Backend file‑upload survival guide
Small files: use simple OSS upload.
Large files: multipart upload (split → upload → merge).
Deduplication: MD5 pre‑check saves space and bandwidth.
Malware protection: header check + content scan + bucket isolation.
What upload pitfalls have you encountered? Share your experience in the comments.
Originally published in the Rare Earth Juejin tech community.