How iQIYI’s QBFS Enables Seamless Hybrid‑Cloud Storage and Cuts Big‑Data Costs by Over 30%
iQIYI's big-data team built QBFS, an in-house virtual file system that unifies its private cloud and multiple public clouds behind a single namespace. QBFS provides transparent routing, automatic migration, intelligent caching, and fine-grained governance, which together cut storage and compute costs by more than 30% while supporting scalable analytics.
Background
As iQIYI’s data volume grew, the private‑cloud infrastructure faced limitations such as fixed rack capacity, slow scaling, and hardware lock‑in. Starting in 2023, the team introduced a hybrid‑cloud model to leverage elastic public‑cloud resources, newer instance types, and cold storage, while maintaining a unified data access experience.
Overall Architecture
The core component is QBFS (iQIYI Big‑Data File System), a virtual file system that offers a single namespace across heterogeneous storage back‑ends. It abstracts underlying file systems and clusters, enabling seamless data routing between on‑premise HDFS, private‑cloud object stores, and multiple public‑cloud object stores.
QBFS Components
QBFS Client / Router : Clients interact with storage services via the QBFS client, which resolves routing and configuration to select the appropriate storage cluster.
Data Cache Layer : Built on Alluxio, it transparently caches data from persistent storage.
Persistent Storage : Includes HDFS plus private-cloud and public-cloud object stores across multiple regions and AZs (collectively referred to as UFS).
Data Scheduling Service : Implements hot‑cold tiering, transparent migration, and backup based on QBFS routing.
Metadata Service : Consolidates metadata from all UFS, supporting directory statistics, growth analysis, and accelerated lookups.
Unified Client Access
The QBFS client bundles SDKs for all underlying storage services. When an application calls the HCFS (Hadoop Compatible File System) interface, the client performs routing resolution, protocol translation, and caching, and finally invokes the appropriate UFS client.
The client is packaged as a provided dependency for Oozie, Kyuubi, and other services, so it can be maintained and upgraded centrally.
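The routing-resolution step can be illustrated with a longest-prefix match over a mount table. This is a minimal sketch, not QBFS code: the `Mount` class, the function names, and the example URIs are all hypothetical, and the real client also handles protocol translation and caching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mount:
    prefix: str     # path in the unified namespace (hypothetical layout)
    targets: tuple  # UFS root URIs backing this mount point, in order


def resolve(mounts, path):
    """Return (mount, relative_path) for the longest mount prefix covering path."""
    best = None
    for m in mounts:
        p = m.prefix.rstrip("/")
        if path == p or path.startswith(p + "/"):
            if best is None or len(p) > len(best.prefix.rstrip("/")):
                best = m
    if best is None:
        raise LookupError(f"no mount point covers {path}")
    return best, path[len(best.prefix.rstrip("/")):]


def to_ufs_uris(mounts, path):
    """Translate a namespace path into candidate UFS URIs."""
    mount, rel = resolve(mounts, path)
    return [t.rstrip("/") + rel for t in mount.targets]
```

With a mount table where `/warehouse/logs` maps to two targets, `to_ufs_uris` returns one candidate URI per target; choosing among them is the job of the write-selection and invocation strategies described later.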
Hybrid‑Cloud Authentication & Authorization
Public-cloud object stores authenticate with AK/SK (access key / secret key) pairs, whereas applications authenticate with Kerberos. QBFS bridges the two by automatically converting Kerberos tickets into temporary AK/SK tokens: for non-YARN tasks via CredentialProvider and SPNEGO, and for YARN tasks via S3A delegation tokens wrapped as Spark tokens.
Permissions are managed at the QBFS namespace level; the system maps these to the underlying UFS permission engines, handling path changes and view updates automatically.
Hybrid‑Cloud Storage Layers
Based on observed data hotness (recent partitions accessed heavily, older partitions rarely accessed), QBFS classifies data into three tiers: standard, low‑frequency, and archival. Standard data stays on private‑cloud HDFS, low‑frequency data moves to public‑cloud object storage, and archival data is stored in cloud‑provided archive storage.
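As a rough illustration of the three-tier split, the sketch below classifies a partition by its age. Using age as a stand-in for observed hotness is an assumption made here for simplicity, and the threshold values and function name are hypothetical rather than taken from QBFS.

```python
from datetime import date

# Tier -> storage location, as described in the text above.
TIER_STORAGE = {
    "standard": "private-cloud HDFS",
    "low-frequency": "public-cloud object storage",
    "archival": "cloud-provided archive storage",
}


def classify_tier(partition_date, today,
                  low_freq_after_days=90, archive_after_days=365):
    """Classify a time partition by age (thresholds are illustrative assumptions)."""
    age_days = (today - partition_date).days
    if age_days >= archive_after_days:
        return "archival"
    if age_days >= low_freq_after_days:
        return "low-frequency"
    return "standard"
```

A production policy would more likely combine age with measured access counts per partition; the age-only rule just makes the tier boundaries concrete.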
One‑to‑Many Routing
QBFS implements a one-to-many routing strategy similar to HDFS Router-Based Federation (RBF): a mount point can map to multiple UFS. Write-cluster selection follows a “FIRST” policy: new top-level directories go to the first UFS, while deeper directories inherit the UFS of their parent. Time-partition directories are the exception and are always created on private-cloud HDFS.
Write‑Cluster Selection Rules
If creating a file or directory directly under a mount point (level = 1), it is placed in the first UFS.
If creating deeper paths, the nearest existing parent directory’s UFS is used.
Time‑partition directories are always created on private‑cloud HDFS.
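The three rules above can be sketched as a single selection function. Several details are assumptions made for illustration: the `dt=`/`date=`/`hour=` naming pattern for time partitions, the time-partition rule taking precedence over the other two, the fallback to the first UFS when no parent exists, and all names used here.

```python
import re

# Assumed naming convention for time-partition directories (hypothetical).
TIME_PARTITION = re.compile(r"^(dt|date|hour)=")


def select_write_ufs(rel_path, mount_targets, existing_dirs, hdfs_ufs):
    """Pick the UFS for a new path under a one-to-many mount point.

    rel_path:      path relative to the mount point, e.g. "table_a/data"
    mount_targets: ordered list of UFS names behind the mount point
    existing_dirs: dict of existing relative directory -> UFS name
    hdfs_ufs:      name of the private-cloud HDFS UFS
    """
    parts = [p for p in rel_path.split("/") if p]
    if not parts:
        raise ValueError("rel_path must name a file or directory")
    if TIME_PARTITION.match(parts[-1]):
        return hdfs_ufs                      # time partitions: always HDFS
    if len(parts) == 1:
        return mount_targets[0]              # level 1: first UFS
    for depth in range(len(parts) - 1, 0, -1):
        parent = "/".join(parts[:depth])
        if parent in existing_dirs:
            return existing_dirs[parent]     # nearest existing parent wins
    return mount_targets[0]                  # assumed fallback: first UFS
```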
Multi‑Cluster Invocation Strategies
invokeSingle : Call a single mount path.
invokeSequential : Try each mount path in order until one succeeds.
invokeConcurrent : Call all mount paths concurrently and merge results.
These strategies preserve correct file‑access semantics across multiple back‑ends.
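The three strategies can be sketched with plain callables, where `op` is any function that performs one file-system operation against a single UFS. The snake_case function names, the use of `FileNotFoundError` as the "try the next one" signal, and the set-union merge are assumptions made for this example, not the QBFS implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def invoke_single(op, target):
    """Call one mount path only."""
    return op(target)


def invoke_sequential(op, targets):
    """Try each mount path in order and return the first success."""
    last_error = FileNotFoundError("no targets configured")
    for target in targets:
        try:
            return op(target)
        except FileNotFoundError as err:
            last_error = err
    raise last_error


def invoke_concurrent(op, targets):
    """Call all mount paths in parallel and merge the returned entries."""
    if not targets:
        return []
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        per_target = list(pool.map(op, targets))
    merged = set()
    for entries in per_target:
        merged.update(entries)
    return sorted(merged)
```

In this sketch a read of one file would fit `invoke_sequential`, since the file lives on exactly one back-end, while a directory listing would fit `invoke_concurrent`, since partitions of one table may be spread across several UFS.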
Low‑Frequency Data Transfer
QBFS periodically moves cold partitions from private‑cloud HDFS to public‑cloud low‑frequency storage. The transfer follows a multi‑step process: copy to a temporary directory on the target UFS, disable writes on the source, verify data integrity, move the target to its final location, move the source to a temporary location, re‑enable writes, and finally clean up the source.
The system handles idempotent retries, concurrent accesses, data consistency, and traffic throttling.
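The transfer steps can be walked through against an in-memory stand-in for two UFS. Everything here is a simplified sketch under stated assumptions: `FakeUfs`, the `.tmp` and `.trash` suffixes, and the SHA-256 comparison are hypothetical, and a real implementation must also deal with concurrent readers and throttling.

```python
import hashlib


class FakeUfs:
    """In-memory stand-in for a storage cluster (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.dirs = {}          # directory path -> {file name: bytes}
        self.read_only = set()  # directories with writes disabled

    def checksum(self, path):
        digest = hashlib.sha256()
        for fname in sorted(self.dirs[path]):
            digest.update(fname.encode("utf-8"))
            digest.update(self.dirs[path][fname])
        return digest.hexdigest()

    def rename(self, src, dst):
        self.dirs[dst] = self.dirs.pop(src)


def transfer_partition(src, dst, path):
    tmp, trash = path + ".tmp", path + ".trash"
    if path not in src.dirs and path in dst.dirs:
        src.dirs.pop(trash, None)            # idempotent retry: finish cleanup
        return "already-done"
    dst.dirs[tmp] = dict(src.dirs[path])     # 1. copy to temp dir on target
    src.read_only.add(path)                  # 2. disable writes on source
    try:
        if src.checksum(path) != dst.checksum(tmp):   # 3. verify integrity
            del dst.dirs[tmp]
            raise OSError(f"checksum mismatch for {path}")
        dst.rename(tmp, path)                # 4. move target into place
        src.rename(path, trash)              # 5. move source aside
    finally:
        src.read_only.discard(path)          # 6. re-enable writes
    del src.dirs[trash]                      # 7. clean up the source copy
    return "transferred"
```

Running the function twice on the same partition returns "transferred" and then "already-done", which is the kind of idempotent retry behaviour the text refers to.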
In production, the scheduler generates about 100,000 partition transfer tasks per day.
Archive Storage Management
Data that exceeds a configured archive TTL is moved to public‑cloud archive storage. The process copies the partition to a temporary location, validates integrity, moves it to the archive path, and removes it from the active table. For data already in the cloud, the copy step is skipped.
QBFS also supports restoring archived data either in‑place (original path) or to an independent table, handling both “cold‑to‑hot” transitions and temporary unfreeze of archive objects.
Transparent Table‑Level Migration
Migration is performed per time‑partition. Before migration, one‑to‑many routing is configured, permissions are copied, and non‑essential tasks are paused. The migration proceeds by moving each partition to the target UFS using the same copy‑verify‑move workflow as low‑frequency transfer. After all partitions are migrated, the table’s remaining metadata (e.g., Iceberg manifests) is moved, and the routing is switched to a one‑to‑one configuration.
For non‑time‑partitioned tables (e.g., dimension tables), a traditional stop‑the‑world migration is used.
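For time-partitioned tables, the ordering of the migration steps can be sketched as a small driver that takes the actual data movers as callbacks. The function and parameter names are hypothetical, the order of the two UFS in the temporary one-to-many route is an assumption, and the permission-copy and task-pausing steps are omitted.

```python
def migrate_table(table, partitions, source, target, routes,
                  move_partition, move_metadata):
    """Migrate a time-partitioned table partition by partition.

    routes:         dict of table -> list of UFS names (the routing table)
    move_partition: callback(table, partition, source, target), e.g. a
                    copy-verify-move transfer
    move_metadata:  callback(table, source, target) for remaining metadata
    """
    # One-to-many routing keeps the table readable while its partitions
    # are split across both clusters (ordering here is an assumption).
    routes[table] = [source, target]
    for partition in sorted(partitions):
        move_partition(table, partition, source, target)
    # Remaining table-level metadata (e.g. Iceberg manifests) moves last.
    move_metadata(table, source, target)
    # Everything now lives on the target: switch to one-to-one routing.
    routes[table] = [target]
```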
Hybrid‑Cloud Caching
To alleviate bandwidth bottlenecks, Alluxio workers are deployed on public‑cloud compute nodes, forming a cross‑cloud cache pool. QBFS detects the client’s cloud AZ and prefers local workers for read/write caching.
Mixed Resources : Public‑cloud workers provide cache capacity in each AZ.
Proximity Access : Routing logic selects the nearest worker.
Object‑Store Integration : Alluxio plugins adapt to QBFS routing, enabling seamless caching of object‑store data.
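Proximity access can be illustrated by filtering workers to the client's AZ and then hashing the path so that the same file keeps landing on the same worker. The worker record layout, the fallback to all workers when the AZ has none, and the hash-based choice are assumptions for this sketch; neither Alluxio's nor QBFS's actual placement policy is shown.

```python
import hashlib


def candidate_workers(client_az, workers):
    """Prefer workers in the client's AZ; fall back to all workers."""
    local = [w for w in workers if w["az"] == client_az]
    return local or list(workers)


def pick_worker(client_az, workers, path):
    """Deterministically map a path to one nearby cache worker."""
    candidates = sorted(candidate_workers(client_az, workers),
                        key=lambda w: w["host"])
    if not candidates:
        raise LookupError("no cache workers available")
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(candidates)
    return candidates[index]
```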
Cache‑Aware Task Scheduling
For public-cloud compute tasks that read private-cloud data, QBFS groups tasks by user or queue, measures each group's input traffic per core, and schedules the low-traffic groups to the cloud. Hot data for these groups is pre-loaded into Alluxio workers using write-through or explicit cache rules. eBPF monitoring confirms that this approach reduces traffic peaks on the link between the private and public clouds by about 70%.
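The group-selection step can be sketched as follows: aggregate input bytes and cores per group, compute bytes per core, and keep the groups under a threshold. The task record fields, the threshold parameter, and the function names are hypothetical, and the source does not state the actual cut-off used.

```python
from collections import defaultdict


def per_core_traffic(tasks):
    """Return input bytes per core for each group (user or queue)."""
    totals = defaultdict(lambda: [0, 0])  # group -> [input_bytes, cores]
    for task in tasks:
        totals[task["group"]][0] += task["input_bytes"]
        totals[task["group"]][1] += task["cores"]
    return {g: b / c for g, (b, c) in totals.items() if c > 0}


def groups_for_cloud(tasks, max_bytes_per_core):
    """Groups light enough on input to run in the public cloud, lightest first."""
    traffic = per_core_traffic(tasks)
    eligible = [g for g, v in traffic.items() if v <= max_bytes_per_core]
    return sorted(eligible, key=lambda g: traffic[g])
```

Groups with low input per core consume compute without pulling much data across clouds, which is why they are the natural candidates to move first.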
Results and Outlook
The hybrid-cloud storage solution is fully deployed at iQIYI, delivering a cost reduction of over 30% for big-data workloads. The architecture supports rapid onboarding of new public-cloud object stores, fine-grained governance, and dynamic tiering. Future work includes expanding multi-cloud data placement, tighter compute-storage co-location, and leveraging cloud-native AI data pipelines.
