Designing a Resumable Large‑File Upload API for Private Enterprise

This in‑depth guide walks through the challenges of enterprise‑grade large‑file uploads, covering chunked transfer, resumable uploads, security, audit trails, and a complete set of RESTful endpoints, along with the database schema, state‑machine handling, and both local and cloud storage integration for AI‑driven document processing.


Background

Private AI deployments for government and enterprise customers often need to ingest massive document collections (Word, PDF, PPT, Markdown) that can reach tens of gigabytes. Uploads occur inside LANs or completely offline environments, must not pass through public cloud storage, and require full audit trails (who uploaded what and when). A simple "single upload endpoint + cloud storage" approach fails because it cannot handle resumable uploads, cluster‑wide chunk merging, or strict security and compliance requirements.

Frontend upload techniques

Instant‑upload check (hash‑based deduplication).

Chunked upload – fixed‑size slices (e.g., 5 MB or 10 MB).

Breakpoint resume – the client records which chunks have been uploaded and only sends missing ones after a network interruption.

Concurrent chunk upload – typically 3‑5 chunks in parallel to improve throughput (see the client‑side sketch after this list).

Real‑time progress display.
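A minimal client‑side sketch of these techniques, assuming a browser environment and the spark‑md5 library for incremental MD5 hashing (endpoint paths and field names follow the API described in the next section):

import SparkMD5 from 'spark-md5';

const CHUNK_SIZE = 5 * 1024 * 1024; // 5 MB fixed-size slices

// Compute the whole-file MD5 incrementally so multi-gigabyte files
// never have to be held in memory at once.
async function hashFile(file: File): Promise<string> {
  const spark = new SparkMD5.ArrayBuffer();
  for (let offset = 0; offset < file.size; offset += CHUNK_SIZE) {
    const buf = await file.slice(offset, offset + CHUNK_SIZE).arrayBuffer();
    spark.append(buf);
  }
  return spark.end();
}

// Upload one chunk as multipart/form-data.
async function uploadChunk(uploadId: string, file: File, index: number): Promise<void> {
  const blob = file.slice(index * CHUNK_SIZE, (index + 1) * CHUNK_SIZE);
  const form = new FormData();
  form.append('uploadId', uploadId);
  form.append('chunkIndex', String(index));
  form.append('chunkSize', String(blob.size));
  form.append('file', blob);
  await fetch('/api/upload/chunk', { method: 'POST', body: form });
}

// Breakpoint resume + concurrency: send only the missing indices,
// keeping three requests in flight at a time.
async function uploadMissing(uploadId: string, file: File, missing: number[]): Promise<void> {
  const queue = [...missing];
  const workers = Array.from({ length: 3 }, async () => {
    while (queue.length > 0) {
      await uploadChunk(uploadId, file, queue.shift()!);
    }
  });
  await Promise.all(workers);
}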

Backend API design

The backend provides a set of focused endpoints that match the frontend workflow.

/upload/check – Instant‑upload check

POST /api/upload/check
{
  "fileHash": "md5_abc123def456",
  "fileName": "training-docs.zip",
  "fileSize": 5342245120
}

Response:

{
  "success": true,
  "data": { "exists": false }
}

If exists is true, the file is already stored and the client can skip the upload.
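On the server side, the check is a single lookup against file_info. A minimal sketch, assuming an Express app and a hypothetical findFileByHash repository helper over the schema described later:

import express from 'express';

const app = express();
app.use(express.json());

// Hypothetical helper: returns the file_info row for a hash, or null
// if the file has never been fully uploaded and verified.
declare function findFileByHash(fileHash: string): Promise<{ storageUrl: string } | null>;

app.post('/api/upload/check', async (req, res) => {
  const { fileHash } = req.body;
  const existing = await findFileByHash(fileHash);
  // exists=true lets the client skip the transfer entirely (instant upload).
  res.json({ success: true, data: { exists: existing !== null } });
});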

/upload/init – Initialise upload task

POST /api/upload/init
{
  "fileHash": "md5_abc123def456",
  "fileName": "training-docs.zip",
  "totalChunks": 1019,
  "chunkSize": 5242880
}

Response:

{
  "success": true,
  "data": {
    "uploadId": "b4f8e3a7-1a0c-4a1d-88af-61e98d91a49b",
    "uploadedChunks": []
  }
}

The returned uploadId uniquely identifies the upload session and is used for all subsequent calls.
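A sketch of the init logic as a plain handler (the repository helpers are hypothetical). The key detail is that an unfinished task for the same hash is reused rather than duplicated, which is what makes resume possible:

import { randomUUID } from 'node:crypto';

// Hypothetical repository helpers over upload_task / upload_chunk.
declare function findTaskByHash(fileHash: string): Promise<{ uploadId: string } | null>;
declare function createTask(task: Record<string, unknown>): Promise<void>;
declare function listUploadedChunkIndices(uploadId: string): Promise<number[]>;

interface InitRequest { fileHash: string; fileName: string; totalChunks: number; chunkSize: number; }

async function handleInit(body: InitRequest) {
  // Resume case: hand back the existing uploadId plus the indices of
  // chunks that are already persisted, so the client skips them.
  const existing = await findTaskByHash(body.fileHash);
  if (existing) {
    const uploadedChunks = await listUploadedChunkIndices(existing.uploadId);
    return { success: true, data: { uploadId: existing.uploadId, uploadedChunks } };
  }

  // Fresh task: status 0 = WAITING in the state machine below.
  const uploadId = randomUUID();
  await createTask({ ...body, uploadId, uploadedChunks: 0, status: 0 });
  return { success: true, data: { uploadId, uploadedChunks: [] } };
}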

/upload/chunk – Upload a single chunk

POST /api/upload/chunk
Content-Type: multipart/form-data

formData:
  uploadId: b4f8e3a7-1a0c-4a1d-88af-61e98d91a49b
  chunkIndex: 0
  chunkSize: 5242880
  chunkHash: md5_001
  file: (binary data)
Response:

{
  "success": true,
  "data": {
    "uploadId": "b4f8e3a7-1a0c-4a1d-88af-61e98d91a49b",
    "chunkIndex": 0,
    "chunkSize": 5242880
  }
}

Each successful upload creates a record in upload_chunk and increments uploaded_chunks in the corresponding upload_task.
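A sketch of the chunk endpoint, assuming Express with multer for multipart parsing; chunks land in a per‑uploadId directory, named by index, so the merge step can stream them in order (paths and helper names are assumptions):

import express from 'express';
import multer from 'multer';
import { mkdir, rename } from 'node:fs/promises';
import path from 'node:path';

const app = express();
const upload = multer({ dest: '/data/uploads/tmp' }); // spooled to disk, not RAM

// Hypothetical repository helpers.
declare function recordChunk(uploadId: string, chunkIndex: number, chunkSize: number, localPath: string): Promise<void>;
declare function incrementUploadedChunks(uploadId: string): Promise<void>;

app.post('/api/upload/chunk', upload.single('file'), async (req, res) => {
  const { uploadId, chunkIndex, chunkSize } = req.body;
  const dir = path.join('/data/uploads/chunks', uploadId);
  await mkdir(dir, { recursive: true });

  // Name the chunk file by its index so the merge can read sequentially.
  const localPath = path.join(dir, `${chunkIndex}.part`);
  await rename(req.file!.path, localPath);

  await recordChunk(uploadId, Number(chunkIndex), Number(chunkSize), localPath);
  await incrementUploadedChunks(uploadId); // keeps upload_task.uploaded_chunks current
  res.json({ success: true, data: { uploadId, chunkIndex: Number(chunkIndex), chunkSize: Number(chunkSize) } });
});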

/upload/merge – Merge all chunks

POST /api/upload/merge
{
  "uploadId": "b4f8e3a7-1a0c-4a1d-88af-61e98d91a49b",
  "fileHash": "md5_abc123def456"
}

Response:

{
  "success": true,
  "message": "File merged successfully",
  "data": { "storagePath": "/data/uploads/training-docs.zip" }
}

The server validates that all chunks are present, performs the merge (local concatenation or cloud‑side multipart‑complete), verifies the final MD5 against fileHash, and marks the task as COMPLETED.
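A sketch of the merge step's validation and state handling; the actual concatenation routine (mergeChunksLocally) is shown under "Local merge" below, and all helper names are hypothetical:

// Hypothetical repository helpers.
declare function getTask(uploadId: string): Promise<{ fileName: string; totalChunks: number } | null>;
declare function countChunks(uploadId: string): Promise<number>;
declare function setTaskStatus(uploadId: string, status: number): Promise<void>;
declare function mergeChunksLocally(uploadId: string, fileName: string, totalChunks: number, expectedHash: string): Promise<string>;

async function handleMerge(body: { uploadId: string; fileHash: string }) {
  const task = await getTask(body.uploadId);
  if (!task || (await countChunks(body.uploadId)) !== task.totalChunks) {
    return { success: false, message: 'Upload incomplete: chunks missing' };
  }

  await setTaskStatus(body.uploadId, 2); // MERGING
  try {
    const storagePath = await mergeChunksLocally(body.uploadId, task.fileName, task.totalChunks, body.fileHash);
    await setTaskStatus(body.uploadId, 6); // CHUNK_MERGED: bytes assembled, hash verified
    // Optional post-processing would run here before the final transition.
    await setTaskStatus(body.uploadId, 3); // COMPLETED
    return { success: true, message: 'File merged successfully', data: { storagePath } };
  } catch {
    await setTaskStatus(body.uploadId, 5); // FAILED
    return { success: false, message: 'Merge or hash verification failed' };
  }
}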

/upload/pause – Pause a task

POST /api/upload/pause
{ "uploadId": "b4f8e3a7-1a0c-4a1d-88af-61e98d91a49b" }

Response:

{ "success": true, "message": "Task paused" }

/upload/cancel – Cancel a task

POST /api/upload/cancel
{ "uploadId": "b4f8e3a7-1a0c-4a1d-88af-61e98d91a49b" }

Response:

{ "success": true, "message": "Task canceled" }

/upload/list – List upload tasks (admin view)

GET /api/upload/list

Response:

{
  "success": true,
  "data": [
    {
      "uploadId": "b4f8e3a7-1a0c-4a1d-88af-61e98d91a49b",
      "fileName": "training-docs.zip",
      "status": "COMPLETED",
      "uploadedChunks": 1019,
      "totalChunks": 1019,
      "uploader": "admin",
      "createdAt": "2025-10-20 14:30:12"
    }
  ]
}

Database schema

Three core tables store the complete lifecycle.

upload_task – one row per upload session; key fields include upload_id, file_hash, file_name, file_size, chunk_size, total_chunks, uploaded_chunks, status (0‑7), storage_type, storage_url, local_path, uploader, timestamps.

upload_chunk – one row per chunk; fields: upload_id (FK), chunk_index, chunk_size, optional chunk_hash, status (0‑2), local_path, timestamps.

file_info – final file metadata after successful merge; fields: file_hash, file_name, file_size, storage_type, storage_url, uploader, status, timestamps.

Relationships: upload_task → upload_chunk (one‑to‑many) and upload_task → file_info (one‑to‑one after merge).
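For reference, a sketch of the three tables as TypeScript record types; field names mirror the lists above, while the actual SQL DDL (with indexes on upload_id and file_hash) is left to the deployment:

// upload_task: one row per upload session.
interface UploadTask {
  uploadId: string;       // primary key for the session
  fileHash: string;       // whole-file MD5, used for dedup and verification
  fileName: string;
  fileSize: number;
  chunkSize: number;
  totalChunks: number;
  uploadedChunks: number;
  status: number;         // 0-7, see the state machine below
  storageType: 'local' | 'cloud';
  storageUrl?: string;
  localPath?: string;
  uploader: string;
  createdAt: Date;
  updatedAt: Date;
}

// upload_chunk: one row per received chunk.
interface UploadChunk {
  uploadId: string;       // FK to upload_task
  chunkIndex: number;
  chunkSize: number;
  chunkHash?: string;     // optional per-chunk MD5
  status: number;         // 0-2
  localPath: string;
  createdAt: Date;
}

// file_info: final metadata written after a successful merge.
interface FileInfo {
  fileHash: string;
  fileName: string;
  fileSize: number;
  storageType: 'local' | 'cloud';
  storageUrl: string;
  uploader: string;
  status: number;
  createdAt: Date;
}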

Upload state machine

WAITING (0) – task created, no chunks uploaded.

UPLOADING (1) – chunks are being received; uploaded_chunks is updated.

PAUSED (7) – user‑initiated pause; chunks remain on disk.

CANCELED (4) – user aborts; temporary files may be deleted.

MERGING (2) – all chunks present, server is merging.

CHUNK_MERGED (6) – merge succeeded, optional post‑processing.

COMPLETED (3) – file merged, hash verified, final path stored.

FAILED (5) – any error during upload, merge, or verification.
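A sketch of the states and a transition guard; the allowed‑transition map is an assumption inferred from the descriptions above, not an exhaustive specification:

enum UploadStatus {
  WAITING = 0,
  UPLOADING = 1,
  MERGING = 2,
  COMPLETED = 3,
  CANCELED = 4,
  FAILED = 5,
  CHUNK_MERGED = 6,
  PAUSED = 7,
}

// Allowed transitions, inferred from the state descriptions above.
const TRANSITIONS: Record<UploadStatus, UploadStatus[]> = {
  [UploadStatus.WAITING]: [UploadStatus.UPLOADING, UploadStatus.CANCELED],
  [UploadStatus.UPLOADING]: [UploadStatus.PAUSED, UploadStatus.CANCELED, UploadStatus.MERGING, UploadStatus.FAILED],
  [UploadStatus.PAUSED]: [UploadStatus.UPLOADING, UploadStatus.CANCELED],
  [UploadStatus.MERGING]: [UploadStatus.CHUNK_MERGED, UploadStatus.FAILED],
  [UploadStatus.CHUNK_MERGED]: [UploadStatus.COMPLETED, UploadStatus.FAILED],
  [UploadStatus.COMPLETED]: [],
  [UploadStatus.CANCELED]: [],
  [UploadStatus.FAILED]: [UploadStatus.UPLOADING], // retry from persisted state
};

// Reject illegal jumps (e.g., PAUSED -> MERGING) before touching the DB.
function assertTransition(from: UploadStatus, to: UploadStatus): void {
  if (!TRANSITIONS[from].includes(to)) {
    throw new Error(`Illegal transition ${UploadStatus[from]} -> ${UploadStatus[to]}`);
  }
}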

Recovery workflow:

Client computes file hash and calls /upload/check.

If a file_info record exists, the client performs an instant upload.

If a task exists in upload_task, the client retrieves uploadId and queries upload_chunk for already uploaded indices.

The client resumes only the missing chunks; a sketch of this flow follows.
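A client‑side sketch of the recovery workflow, reusing hashFile, uploadMissing, and CHUNK_SIZE from the frontend sketch earlier:

// Resume flow: determine which chunks still need to be sent, send them,
// then ask the server to merge.
async function resumeUpload(file: File): Promise<void> {
  const fileHash = await hashFile(file);

  // Step 1: instant-upload check.
  const check = await fetch('/api/upload/check', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ fileHash, fileName: file.name, fileSize: file.size }),
  }).then((r) => r.json());
  if (check.data.exists) return; // file already stored, nothing to send

  // Step 2: init returns the existing uploadId plus already-uploaded indices.
  const totalChunks = Math.ceil(file.size / CHUNK_SIZE);
  const init = await fetch('/api/upload/init', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ fileHash, fileName: file.name, totalChunks, chunkSize: CHUNK_SIZE }),
  }).then((r) => r.json());
  const done = new Set<number>(init.data.uploadedChunks);

  // Step 3: send only the missing indices, then trigger the merge.
  const missing = Array.from({ length: totalChunks }, (_, i) => i).filter((i) => !done.has(i));
  await uploadMissing(init.data.uploadId, file, missing);
  await fetch('/api/upload/merge', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ uploadId: init.data.uploadId, fileHash }),
  });
}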

Merge and integrity verification

Local merge

When storage_type=local, the server opens the target file and streams each chunk in chunk_index order. After concatenation it recomputes the MD5 and compares it with the original fileHash. A mismatch marks the task FAILED and logs an error.
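A sketch of the streaming local merge, matching the chunk layout assumed in the chunk‑endpoint sketch; chunks are piped into the target file in index order while an MD5 digest is updated in the same pass, so nothing is buffered whole in memory:

import { createHash } from 'node:crypto';
import { createReadStream, createWriteStream } from 'node:fs';
import path from 'node:path';

async function mergeChunksLocally(uploadId: string, fileName: string, totalChunks: number, expectedHash: string): Promise<string> {
  const dir = path.join('/data/uploads/chunks', uploadId); // layout assumed earlier
  const target = path.join('/data/uploads', fileName);
  const out = createWriteStream(target);
  const md5 = createHash('md5');

  for (let i = 0; i < totalChunks; i++) {
    // Read each chunk in chunk_index order; a missing file throws and fails the task.
    for await (const buf of createReadStream(path.join(dir, `${i}.part`))) {
      md5.update(buf as Buffer);
      if (!out.write(buf)) await new Promise((r) => out.once('drain', r)); // respect backpressure
    }
  }
  out.end();
  await new Promise((r) => out.once('finish', r));

  // Verify the reassembled file against the hash the client computed.
  if (md5.digest('hex') !== expectedHash) throw new Error('MD5 mismatch after merge');
  return target;
}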

Cloud merge

For object storage (OSS, COS, MinIO, etc.) the server calls the provider’s multipart‑complete API (e.g., completeMultipartUpload). The provider guarantees correct ordering and integrity; the server only records the resulting storage_url.
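A sketch against an S3‑compatible store such as MinIO, using the AWS SDK v3; the endpoint, bucket, and credentials are placeholders:

import { S3Client, CompleteMultipartUploadCommand } from '@aws-sdk/client-s3';

// MinIO and most private object stores speak the S3 API; the endpoint
// here stands in for the in-LAN object storage address.
const s3 = new S3Client({
  endpoint: 'http://minio.internal:9000',
  region: 'us-east-1',
  forcePathStyle: true, // required for MinIO-style path addressing
  credentials: { accessKeyId: 'ACCESS_KEY', secretAccessKey: 'SECRET_KEY' },
});

// Each part's ETag was returned when the part was uploaded and must be
// echoed back so the provider can assemble and verify the object.
async function completeCloudMerge(
  key: string,
  s3UploadId: string,
  parts: { PartNumber: number; ETag: string }[],
): Promise<string | undefined> {
  const result = await s3.send(new CompleteMultipartUploadCommand({
    Bucket: 'enterprise-uploads',
    Key: key,
    UploadId: s3UploadId,
    MultipartUpload: { Parts: parts },
  }));
  return result.Location; // recorded as storage_url in upload_task
}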

Cluster deployment strategies

Shared storage – all nodes write chunks to a common NFS/NAS path (e.g., /data/uploads) so any node can perform the merge.

Cloud‑side merge – chunks are stored directly in object storage; merge is performed by the cloud service.

Dedicated merge node – a scheduler assigns a specific node to pull chunks from other nodes via internal RPC and execute the merge.

Private‑cloud environments typically use the shared‑storage approach for performance and security.

Asynchronous processing & performance optimisation

In production the merge, hash verification, and downstream AI processing (document parsing, paging, vectorisation) are off‑loaded to background workers or task queues. The upload endpoints remain lightweight and return immediately after receiving a chunk, preventing front‑end time‑outs and reducing peak I/O pressure. Failed tasks can be retried automatically from the persisted database state.
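One way to off‑load the merge, assuming a Redis‑backed BullMQ queue (any task queue works; queue and helper names are placeholders):

import { Queue, Worker } from 'bullmq';

const connection = { host: 'redis.internal', port: 6379 };
const mergeQueue = new Queue('file-merge', { connection });

// The /upload/merge endpoint only enqueues and returns immediately,
// keeping the HTTP path lightweight.
async function enqueueMerge(uploadId: string, fileHash: string): Promise<void> {
  await mergeQueue.add('merge', { uploadId, fileHash }, {
    attempts: 3,                                    // automatic retry from persisted state
    backoff: { type: 'exponential', delay: 5_000 },
  });
}

// Hypothetical routine wrapping the merge, MD5 verification, and
// downstream AI steps (parsing, paging, vectorisation).
declare function performMergeAndPostProcessing(uploadId: string, fileHash: string): Promise<void>;

// A worker on any node with access to the shared chunk storage does the heavy I/O.
new Worker('file-merge', async (job) => {
  await performMergeAndPostProcessing(job.data.uploadId, job.data.fileHash);
}, { connection });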

Summary

The design transforms a simple single‑endpoint upload into a robust, auditable, resumable large‑file upload service suitable for private, AI‑driven enterprise environments. By delegating slicing, progress, and resume to the frontend and handling storage, verification, merging, and audit trails on the backend, the system remains extensible, cluster‑ready, and compliant with strict security policies.

Tags: File Upload, database design, Enterprise, resumable upload, Large Files, Backend API
Written by Rare Earth Juejin Tech Community (Juejin, a tech community that helps developers grow).