
Evolution of Qiniu Cloud Data Processing Architecture

The article explains how Qiniu's data processing platform has evolved from a simple real-time, URL-based model to a more complex architecture with a separate data cache, per-server agent services, and Discover-based monitoring, and how planned container-based elastic scaling is meant to help it handle massive unstructured data workloads.

Architect

According to statistics, internet data volume doubles every three years, and more than 95% of it is unstructured, which places heavy demands on data processing. Qiniu, whose services cover more than 50% of Chinese internet users, offers a data management platform whose architecture has evolved over time.

Qiniu cloud storage splits data processing into real‑time and asynchronous modes, recommending real‑time handling for small files such as images.

Real‑time processing is performed by appending operation parameters to the file URL (e.g., http://xxx.com/key?fop/param1/value/param2/name/value), with optional style shortcuts or pipeline techniques for complex operations.
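As a rough illustration of the URL convention above, the following Python sketch appends one operation and its parameters to a file URL. The helper name `build_fop_url` and the strict name/value pairing are assumptions made for this example; they are not taken from Qiniu's SDKs, and real operations may order or encode their parameters differently.

```python
from typing import Iterable, Tuple
from urllib.parse import quote


def build_fop_url(base_url: str, fop: str, params: Iterable[Tuple[str, str]]) -> str:
    """Return base_url with '?fop/name/value/...' appended.

    Hypothetical helper: assumes every parameter is a name/value pair and
    that each segment is percent-encoded so it cannot contain a raw '/'.
    """
    segments = [fop]
    for name, value in params:
        segments.append(quote(str(name), safe=""))
        segments.append(quote(str(value), safe=""))
    return base_url + "?" + "/".join(segments)


# Placeholder host, key, and operation name, mirroring the example URL above.
url = build_fop_url("http://xxx.com/key", "fop", [("param1", "value")])
print(url)
```

Encoding each segment separately keeps a value that happens to contain a slash from being read as an extra parameter.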

Asynchronous requests, which suit large media files, are issued via the portal, command-line tools, or SDKs. The service returns metadata (bucket, key, etc.), and results can be delivered to a callback server or retrieved by polling the processing status.

In the basic real-time architecture, a request enters through Qiniu's I/O entry, is validated, and is forwarded to a scheduling service, which distributes it to a compute cluster. Each worker checks the parameters, downloads the original data if needed, applies the relevant algorithms or tools (e.g., transcoding, thumbnailing), and returns the result. For asynchronous requests, the result is also persisted in object storage.

This design, while clear, creates a heavy scheduling service that must also manage many workers, cache traffic, and shared configuration for object‑storage endpoints, leading to maintenance challenges as the number of workers grows.

To address these issues, the architecture separates the Data Cache, allowing SSD for small hot files and HDD for large cold files.
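The tiering idea can be sketched as a small routing function. The size limit and hit threshold below are invented for illustration, and the article does not say where small cold files or large hot files go, so this sketch assumes anything that is not both small and hot lands on HDD.

```python
# Hypothetical thresholds; the article gives no concrete numbers.
SMALL_FILE_LIMIT_BYTES = 4 * 1024 * 1024
HOT_HIT_THRESHOLD = 10


def choose_cache_tier(size_bytes: int, recent_hits: int) -> str:
    """Pick a cache tier for one object.

    Small, frequently read objects go to SSD. Everything else goes to HDD,
    which is an assumption covering the cases the article leaves open.
    """
    is_small = size_bytes <= SMALL_FILE_LIMIT_BYTES
    is_hot = recent_hits >= HOT_HIT_THRESHOLD
    return "ssd" if is_small and is_hot else "hdd"


print(choose_cache_tier(200 * 1024, recent_hits=50))         # small, hot image
print(choose_cache_tier(800 * 1024 * 1024, recent_hits=1))   # large, cold video
```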

An agent service is added per server to track which workers reside on which machines, handle data downloads, and write to the cache, relieving the scheduler of these responsibilities.

Failed requests are also cached so that repeated bad requests get a quick response without consuming compute. This adds some pressure to the cache, and the error-handling flow had to be refined to avoid unnecessary downloads.
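One way to picture caching failed requests is the minimal in-memory sketch below. The class and function names are hypothetical, and the expiry time on failure entries is an assumption added here to limit the cache pressure mentioned above; the article does not describe how Qiniu expires such entries.

```python
import time


class ResultCache:
    """In-memory stand-in for the data cache, storing successes and failures."""

    def __init__(self, error_ttl=30.0, clock=time.monotonic):
        self._entries = {}           # key -> (ok, payload, expires_at or None)
        self._error_ttl = error_ttl  # assumed lifetime of a cached failure, in seconds
        self._clock = clock

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        ok, payload, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._entries[key]   # expired failure: allow a fresh attempt
            return None
        return ok, payload

    def put_success(self, key, data):
        self._entries[key] = (True, data, None)

    def put_failure(self, key, error):
        self._entries[key] = (False, error, self._clock() + self._error_ttl)


def handle(cache, key, process):
    """Serve from cache when possible; otherwise run process(key) and cache the outcome."""
    cached = cache.get(key)
    if cached is not None:
        return cached                # no download and no compute for a known result
    try:
        data = process(key)
    except ValueError as exc:        # e.g. parameters the worker rejects
        cache.put_failure(key, str(exc))
        return False, str(exc)
    cache.put_success(key, data)
    return True, data
```

With this shape, a second identical bad request is answered from the cache and never reaches `process`, which is the saving the paragraph above describes.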

A Discover service collects heartbeat information from agents and workers; the load balancer reads this data to make weighted round‑robin decisions, and administrators can manually adjust service status via Discover.
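The interaction between heartbeats and weighted round-robin can be sketched as follows. The article does not say which weighted round-robin variant Qiniu uses or how weights are derived; this example uses the "smooth" variant popularized by nginx and assumes each heartbeat carries a weight. The class name `Discover` echoes the service name, but its methods are hypothetical.

```python
import time


class Discover:
    """Toy registry: records heartbeats and lets an operator toggle nodes."""

    def __init__(self, heartbeat_timeout=10.0, clock=time.monotonic):
        self._nodes = {}
        self._timeout = heartbeat_timeout  # assumed liveness window, in seconds
        self._clock = clock

    def heartbeat(self, name, weight):
        node = self._nodes.setdefault(name, {"current": 0, "enabled": True})
        node["weight"] = weight
        node["last_seen"] = self._clock()

    def set_enabled(self, name, enabled):
        # Manual override, as an administrator might apply through Discover.
        self._nodes[name]["enabled"] = enabled

    def alive(self):
        now = self._clock()
        return {
            name: node
            for name, node in self._nodes.items()
            if node["enabled"] and now - node["last_seen"] <= self._timeout
        }


def pick(discover):
    """Smooth weighted round-robin over nodes that are enabled and recently seen."""
    nodes = discover.alive()
    if not nodes:
        return None
    total = sum(node["weight"] for node in nodes.values())
    for node in nodes.values():
        node["current"] += node["weight"]
    best = max(nodes, key=lambda name: nodes[name]["current"])
    nodes[best]["current"] -= total
    return best
```

A node that stops sending heartbeats, or that an administrator disables, simply drops out of `alive()` and receives no further traffic until it reappears.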

The asynchronous architecture adds a queue service and request‑status service in front of the real‑time pipeline, persisting each worker’s output and returning file metadata.
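A minimal single-process sketch of that flow is shown below, with a queue in front of the worker, a status table that clients can poll, and an optional callback. All names are hypothetical, the dictionary standing in for object storage is an assumption, and the way result keys are derived is invented for the example.

```python
import queue
import uuid


class AsyncProcessor:
    """Queue plus status table placed in front of a processing function."""

    def __init__(self, worker, callback=None):
        self._queue = queue.Queue()
        self._status = {}          # job_id -> status record (request-status service)
        self._storage = {}         # (bucket, key) -> bytes, stand-in for object storage
        self._worker = worker      # callable(bucket, key, fops) -> bytes
        self._callback = callback  # optional callable(job_id, status)

    def submit(self, bucket, key, fops):
        job_id = uuid.uuid4().hex
        self._status[job_id] = {"state": "queued", "bucket": bucket, "key": key}
        self._queue.put((job_id, bucket, key, fops))
        return job_id

    def status(self, job_id):
        return dict(self._status[job_id])  # copy, so pollers cannot mutate state

    def run_once(self):
        """Process one queued job; return False when the queue is empty."""
        try:
            job_id, bucket, key, fops = self._queue.get_nowait()
        except queue.Empty:
            return False
        record = self._status[job_id]
        record["state"] = "processing"
        try:
            output = self._worker(bucket, key, fops)
        except Exception as exc:
            record.update(state="failed", error=str(exc))
        else:
            result_key = key + "." + job_id  # invented naming scheme
            self._storage[(bucket, result_key)] = output
            record.update(state="done", result={"bucket": bucket, "key": result_key})
        if self._callback is not None:
            self._callback(job_id, dict(record))
        return True
```

A client would call `submit`, keep the returned job ID, and then either poll `status` or wait for the callback, matching the two delivery options described for asynchronous requests.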

Despite these improvements, the current system still assigns a fixed number of workers to each machine and lacks cross-machine elastic scheduling, which can leave resources under-utilized during off-peak periods.

Future plans involve containerizing agents and workers so each runs in its own container, allowing dynamic placement of containers on any server based on resource availability, thereby turning the compute cluster into a flexible resource pool and eliminating machine‑bound worker constraints.
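To make the resource-pool idea concrete, here is a small placement sketch. The article does not describe the scheduler Qiniu intends to use; the greedy "most free CPU first" policy, the `Server` fields, and the resource units are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Server:
    name: str
    free_cpu: float
    free_mem_mb: int
    containers: List[str] = field(default_factory=list)


def place(servers: List[Server], container: str, cpu: float, mem_mb: int) -> Optional[str]:
    """Place a container on the fitting server with the most free resources.

    Returns the chosen server's name, or None if no server has room.
    """
    candidates = [s for s in servers if s.free_cpu >= cpu and s.free_mem_mb >= mem_mb]
    if not candidates:
        return None
    target = max(candidates, key=lambda s: (s.free_cpu, s.free_mem_mb))
    target.free_cpu -= cpu
    target.free_mem_mb -= mem_mb
    target.containers.append(container)
    return target.name


pool = [Server("node-a", 4.0, 8192), Server("node-b", 2.0, 4096)]
for worker in ["worker-1", "worker-2", "worker-3", "worker-4"]:
    print(worker, "->", place(pool, worker, cpu=2.0, mem_mb=2048))
```

Because placement depends only on what each server has free at that moment, no worker is tied to a particular machine, which is the constraint the containerization plan aims to remove.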

Source: segmentfault (original article: http://segmentfault.com/a/1190000004092239).

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Real-time Processing, data-processing, load balancing, containerization, cloud architecture