Mastering Object Storage: From Simple OSS Uploads to Advanced Multipart Strategies

This guide walks backend engineers through the fundamentals of object storage, showing how to perform quick OSS uploads, implement multipart uploads for large files, prevent duplicate uploads with MD5 checks, and secure files against malicious content using header validation, content scanning, and bucket isolation.

Rare Earth Juejin Tech Community

Opening: Why backend engineers can’t avoid object storage

A product manager asked, “A user wants to upload a 10 GB design file. Should we store it in the database?” The answer is object storage, the “cloud mansion” for files: more durable than local storage, larger than a database, and it comes with CDN acceleration.

Object storage: not a folder cabinet

Key point: OSS/S3 is not a local folder. It stores files as objects (data + metadata) in a distributed cluster, similar to a smart parcel locker.

Advantages: high durability (multiple replicas), petabyte‑scale capacity, pay‑as‑you‑go pricing.

Typical scenarios: image/video storage, log backup, user file uploads (a daily backend need).

Practical 1: Alibaba Cloud OSS simple upload (5‑minute starter)

Skip the official docs; here is the essential Java code.

1. Set up environment (Maven dependency)

<dependency>
    <groupId>com.aliyun.oss</groupId>
    <artifactId>aliyun-sdk-oss</artifactId>
    <version>3.15.1</version>
</dependency>

2. Core upload code (as easy as sending a parcel)

import com.aliyun.oss.OSS;
import com.aliyun.oss.OSSClientBuilder;

import java.io.File;
import java.io.FileInputStream;
import java.util.Date;

public class OssSimpleUpload {
    // Do NOT hard-code keys in production! Load them from configuration.
    private static final String ENDPOINT = "oss-cn-beijing.aliyuncs.com";
    private static final String ACCESS_KEY = "yourAK";
    private static final String SECRET_KEY = "yourSK";
    private static final String BUCKET_NAME = "yourBucket";

    public static void uploadFile(File file) {
        OSS ossClient = new OSSClientBuilder().build(ENDPOINT, ACCESS_KEY, SECRET_KEY);
        String objectKey = "user-uploads/" + file.getName();
        try (FileInputStream input = new FileInputStream(file)) {
            ossClient.putObject(BUCKET_NAME, objectKey, input);
            // Presigned download URL valid for one hour
            Date expiration = new Date(System.currentTimeMillis() + 3600 * 1000L);
            String url = ossClient.generatePresignedUrl(BUCKET_NAME, objectKey, expiration).toString();
            System.out.println("File uploaded successfully! Link: " + url);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Always shut down the client to avoid connection leaks
            ossClient.shutdown();
        }
    }
}

Pitfalls

Do not set bucket permission to public read; otherwise anyone can download your files.

Use RAM sub‑account keys with the minimal upload permission.
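As a sketch, a minimal RAM policy granting only upload rights to one prefix might look like the following; the bucket name and prefix are placeholders you would replace with your own:

```json
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["oss:PutObject"],
      "Resource": ["acs:oss:*:*:yourBucket/user-uploads/*"]
    }
  ]
}
```

Attach this policy to the RAM sub-account whose keys the upload service uses, so a leaked key can at worst write to that prefix, never read or delete.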

Large file upload: direct upload is not feasible

Uploading a 2 GB video with the simple code above fails whenever the network blips: the upload breaks and must restart from zero, causing timeouts. The solution is multipart upload.

Multipart upload: three steps to move big files

Example: a 100 MB file split into ten 10 MB chunks.

Step 1: Chunking (frontend slices, backend rests)

Frontend uses File.slice():

// 10 MB per chunk
const chunkSize = 10 * 1024 * 1024;
const chunks = Math.ceil(file.size / chunkSize);
// First chunk: file.slice(0, chunkSize)

Step 2: Upload each chunk (with identifiers)

Each chunk must carry three identifiers:

fileMd5: MD5 hash of the whole file (used for deduplication).

chunkIndex: sequence number of the chunk (0, 1, 2 …).

totalChunks: total number of chunks.

The backend stores them temporarily, e.g., /tmp/oss-chunks/${fileMd5}/${chunkIndex}.
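The temporary chunk storage on the backend can be sketched with plain java.nio; the class and method names here are illustrative, not part of the OSS SDK:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ChunkStore {
    private static final Path BASE_DIR = Paths.get("/tmp/oss-chunks");

    // Persist one uploaded chunk under /tmp/oss-chunks/{fileMd5}/{chunkIndex}
    public static Path saveChunk(String fileMd5, int chunkIndex, byte[] data) throws IOException {
        Path dir = BASE_DIR.resolve(fileMd5);
        Files.createDirectories(dir);
        Path chunkFile = dir.resolve(String.valueOf(chunkIndex));
        return Files.write(chunkFile, data);
    }

    // A merge should only start once every chunk from 0..totalChunks-1 is present
    public static boolean allChunksPresent(String fileMd5, int totalChunks) {
        for (int i = 0; i < totalChunks; i++) {
            if (!Files.exists(BASE_DIR.resolve(fileMd5).resolve(String.valueOf(i)))) {
                return false;
            }
        }
        return true;
    }
}
```

Checking allChunksPresent before merging also gives you retry for free: the frontend can re-send only the chunks the backend reports missing.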

Step 3: Merge chunks (OSS does the work)

public void mergeChunks(String fileMd5, String fileName, int totalChunks) throws IOException {
    OSS ossClient = new OSSClientBuilder().build(ENDPOINT, ACCESS_KEY, SECRET_KEY);
    String objectKey = "user-uploads/" + fileName;
    try {
        // Step 1: initiate the multipart upload and obtain an uploadId
        InitiateMultipartUploadResult initResult = ossClient.initiateMultipartUpload(
                new InitiateMultipartUploadRequest(BUCKET_NAME, objectKey));
        String uploadId = initResult.getUploadId();
        List<PartETag> partETags = new ArrayList<>();
        // Step 2: upload each locally stored chunk as a part (part numbers start at 1)
        for (int i = 0; i < totalChunks; i++) {
            File chunk = new File("/tmp/oss-chunks/" + fileMd5 + "/" + i);
            try (FileInputStream input = new FileInputStream(chunk)) {
                UploadPartResult uploadResult = ossClient.uploadPart(new UploadPartRequest()
                        .withBucketName(BUCKET_NAME)
                        .withKey(objectKey)
                        .withUploadId(uploadId)
                        .withPartNumber(i + 1)
                        .withPartSize(chunk.length()) // the SDK needs an explicit part size
                        .withInputStream(input));
                partETags.add(uploadResult.getPartETag());
            }
        }
        // Step 3: ask OSS to assemble the parts into the final object
        ossClient.completeMultipartUpload(
                new CompleteMultipartUploadRequest(BUCKET_NAME, objectKey, uploadId, partETags));
    } finally {
        ossClient.shutdown();
    }
    // Clean up the local chunks (FileUtils is from Apache Commons IO)
    FileUtils.deleteDirectory(new File("/tmp/oss-chunks/" + fileMd5));
}

Soul Question 1: How to avoid duplicate uploads?

When a user accidentally uploads the same file multiple times, use the file’s MD5 as an “identity card”.

Process

Frontend computes the whole-file MD5 (for large files, compute chunk by chunk and combine incrementally).

Before uploading, call a backend endpoint like /check-file?md5=xxx.

Backend looks up a DB mapping MD5 → stored path.

If found, return the existing link and skip upload.

If not, allow the multipart upload.
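The backend side of this pre-check can be sketched as follows; the class name and the in-memory map (which stands in for the real DB table) are illustrative:

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class FileDedupService {
    // Stands in for a DB table mapping file MD5 -> stored OSS path
    private final Map<String, String> md5ToPath = new ConcurrentHashMap<>();

    // Backs the /check-file?md5=xxx endpoint: returns the existing path if the file is known
    public Optional<String> check(String md5) {
        return Optional.ofNullable(md5ToPath.get(md5));
    }

    // Called after a successful upload to record the mapping
    public void record(String md5, String ossPath) {
        md5ToPath.put(md5, ossPath);
    }
}
```

If check returns a path, the frontend skips the upload entirely and reuses the existing link; otherwise it proceeds with the multipart flow and calls record at the end.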

Note: OSS’s ETag can serve as a simple MD5 check for objects uploaded in one piece, but for objects assembled from multipart uploads the ETag is no longer the MD5 of the file content, so storing your own MD5 is more reliable.
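For reference, computing a file’s MD5 on the backend needs only the JDK’s MessageDigest; this helper is a sketch, not part of the OSS SDK:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5Util {
    // Stream the file through MessageDigest so even multi-GB files need constant memory
    public static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                md.update(buffer, 0, read);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```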

Soul Question 2: How to prevent malicious files (trojans/viruses)?

Three‑layer defense:

1. Header validation (magic numbers)

public boolean checkFileHeader(File file) {
    byte[] header = new byte[8];
    try (FileInputStream fis = new FileInputStream(file)) {
        int read = fis.read(header);
        if (read < 4) {
            return false; // too short to contain a valid magic number
        }
        String headerHex = bytesToHex(header);
        // JPG: FFD8FF, PNG: 89504E47, GIF: 47494638
        return headerHex.startsWith("FFD8FF") || headerHex.startsWith("89504E47") || headerHex.startsWith("47494638");
    } catch (Exception e) {
        return false;
    }
}

private static String bytesToHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
        sb.append(String.format("%02X", b));
    }
    return sb.toString();
}

2. Content scanning

Open‑source tool: ClamAV for virus scanning.

Cloud service: Alibaba Cloud OSS content security (detect porn, violence, malware).
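If you run your own ClamAV daemon, scanning can go over clamd’s INSTREAM protocol: each chunk is sent with a 4-byte big-endian length prefix and the stream ends with a zero-length chunk. The host and port below are assumptions for a local clamd with its TCP socket enabled; the frame builder is separated out so it can be checked without a running daemon:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ClamAvScanner {
    // clamd's INSTREAM wire format: 4-byte big-endian length, then the payload
    static byte[] frame(byte[] chunk) {
        return ByteBuffer.allocate(4 + chunk.length).putInt(chunk.length).put(chunk).array();
    }

    // Assumed: a clamd daemon listening on localhost:3310
    public static boolean isClean(byte[] data) throws IOException {
        try (Socket socket = new Socket("localhost", 3310)) {
            OutputStream out = socket.getOutputStream();
            out.write("zINSTREAM\0".getBytes(StandardCharsets.US_ASCII));
            out.write(frame(data));
            out.write(frame(new byte[0])); // zero-length chunk ends the stream
            out.flush();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            InputStream in = socket.getInputStream();
            byte[] buf = new byte[256];
            int n;
            while ((n = in.read(buf)) != -1) {
                reply.write(buf, 0, n);
            }
            // clamd answers "stream: OK" for clean data, "stream: <sig> FOUND" otherwise
            return reply.toString(StandardCharsets.US_ASCII.name()).contains("OK");
        }
    }
}
```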

3. Permission isolation

Upload first to a temporary bucket; after all checks, move to the production bucket.

Set the production bucket to disallow execution, e.g., add Content‑Disposition: attachment so files are forced to download.
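A sketch of how one might build that header value (the helper name is illustrative; in the Aliyun SDK you would attach it through ObjectMetadata.setContentDisposition when putting the object):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class DownloadHeaders {
    // Build a Content-Disposition value that forces download and keeps non-ASCII names intact
    public static String attachment(String fileName) throws Exception {
        String encoded = URLEncoder.encode(fileName, StandardCharsets.UTF_8.name())
                .replace("+", "%20");
        return "attachment; filename*=UTF-8''" + encoded;
    }
}
// Applying it at upload time (sketch):
//   ObjectMetadata meta = new ObjectMetadata();
//   meta.setContentDisposition(DownloadHeaders.attachment(fileName));
//   ossClient.putObject(BUCKET_NAME, objectKey, input, meta);
```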

Summary: Backend file‑upload survival guide

Small files: use simple OSS upload.

Large files: multipart upload (split → upload → merge).

Deduplication: MD5 pre‑check saves space and bandwidth.

Malware protection: header check + content scan + bucket isolation.

What upload pitfalls have you encountered? Share your experience in the comments.
