
iQIYI Basic Data Platform: Architecture, High Availability, and Service Practices

The iQIYI Basic Data Platform unifies internal data exchange standards, integrates massive multi‑business data, and implements high‑availability solutions for ID services, messaging, HBase storage, and read‑write scaling, showcasing practical engineering approaches to big‑data reliability and performance.

High Availability Architecture

iQIYI's Basic Data Platform was built to unify internal data exchange standards, solving problems of inconsistent IDs, data definitions, and delayed updates across teams.

As the business has expanded, the platform has come to integrate data from UGC videos, movies, live streams, games, literature, e‑commerce, and other lines of business, supporting massive storage, distribution, online queries, and offline analysis.

It currently manages nearly a hundred tables holding billions of records, grows by millions of rows and tens of millions of messages per day, and serves dozens of business teams.

Service Capabilities

The platform provides unified HTTP/RPC access, a management console covering table definitions, data volume, message volume, change logs, and real‑time queries, and a field‑definition management system.

Overall Architecture

Access layer: HTTP/RPC protocols, unified message listening and offline scan SDK.

Management platform: development tools for table definitions, real‑time queries, and field management.

Service governance: fine‑grained permission and traffic control.

Service Process

Data tables are defined on the management platform, which publishes a Protobuf data‑definition package. Producing services write data through the ID service and the write service; the write service stores the data in HBase and emits an update notification. Downstream services consume these messages, retrieve the latest data, and handle versioning via SessionID.
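The consuming side of this flow can be sketched as follows. The source does not say how SessionID is structured, so this sketch assumes it is a number that increases with every write to a record; the `Consumer` class, `fetch_latest` callback, and message fields are hypothetical names used only for illustration.

```python
class Consumer:
    """Hypothetical downstream service that reacts to update notifications."""

    def __init__(self, fetch_latest):
        # fetch_latest is an assumed callback: record_id -> (session_id, data).
        self._fetch_latest = fetch_latest
        self._applied = {}  # record_id -> SessionID of the data already applied
        self.view = {}      # record_id -> latest data seen by this consumer

    def on_message(self, record_id, session_id):
        """Handle one notification; return True if the local view was refreshed."""
        # Assumption: a SessionID not newer than the applied one is stale or a duplicate.
        if session_id <= self._applied.get(record_id, -1):
            return False
        # Always read the latest stored data rather than trusting the message payload.
        latest_session, data = self._fetch_latest(record_id)
        self.view[record_id] = data
        self._applied[record_id] = latest_session
        return True


store = {7: (3, {"title": "third edit"})}
consumer = Consumer(lambda record_id: store[record_id])
consumer.on_message(7, 2)  # refreshes: the fetch returns SessionID 3
consumer.on_message(7, 3)  # skipped: SessionID 3 is already applied
```

Because the consumer records the SessionID of the data it actually fetched, a notification that arrives late or out of order does not trigger a redundant read.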

Solutions to Key Issues

ID Service High Availability: Two MySQL clusters generate odd and even IDs respectively, ensuring service continuity if one cluster fails.
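A minimal sketch of the odd/even idea, using in-memory counters in place of the two MySQL clusters. The class names and the round-robin failover policy are assumptions, not the platform's actual code. In MySQL, this kind of split is commonly achieved with the `auto_increment_increment` and `auto_increment_offset` settings, though the source does not say whether iQIYI does it that way.

```python
from itertools import count


class ClusterDown(Exception):
    """Raised when an allocator's backing cluster is unavailable."""


class ParityAllocator:
    """Stands in for one MySQL cluster that issues only odd or only even IDs."""

    def __init__(self, start):
        # start=1 yields 1, 3, 5, ...; start=2 yields 2, 4, 6, ...
        self._ids = count(start, 2)
        self.healthy = True

    def next_id(self):
        if not self.healthy:
            raise ClusterDown()
        return next(self._ids)


class IdService:
    """Alternates between allocators and skips any that are down."""

    def __init__(self, allocators):
        self._allocators = list(allocators)
        self._turn = 0

    def next_id(self):
        n = len(self._allocators)
        # Try each allocator once, starting with the one whose turn it is.
        for offset in range(n):
            idx = (self._turn + offset) % n
            try:
                new_id = self._allocators[idx].next_id()
            except ClusterDown:
                continue
            self._turn = (idx + 1) % n
            return new_id
        raise ClusterDown("no ID cluster available")


odd, even = ParityAllocator(1), ParityAllocator(2)
service = IdService([odd, even])
first_ids = [service.next_id() for _ in range(4)]   # [1, 2, 3, 4]
even.healthy = False                                # simulate losing the even cluster
later_ids = [service.next_id() for _ in range(2)]   # [5, 7]
```

The two ID ranges never overlap, so the clusters need no coordination, and losing one only means that IDs of one parity stop being issued for a while.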

Message Distribution: A custom ActiveMQ plugin intercepts messages in the broker and routes them to queues according to subscription rules, much as an AOP interceptor would.
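The routing idea can be sketched independently of ActiveMQ. The sketch below is not the plugin itself: the rule shape (a table name plus an optional set of fields of interest) and the message fields `table` and `changed_fields` are assumptions made for illustration.

```python
from collections import defaultdict


class Router:
    """Copies each published message to every queue whose rule matches it."""

    def __init__(self):
        self._rules = []                 # (queue_name, table, wanted_fields or None)
        self.queues = defaultdict(list)  # queue_name -> delivered messages

    def subscribe(self, queue_name, table, fields=None):
        """Deliver updates for `table`; if `fields` is given, only when one of them changed."""
        wanted = frozenset(fields) if fields else None
        self._rules.append((queue_name, table, wanted))

    def publish(self, message):
        """Route one message and return the names of the queues that received it."""
        delivered = []
        for queue_name, table, wanted in self._rules:
            if message["table"] != table:
                continue
            if wanted is not None and wanted.isdisjoint(message["changed_fields"]):
                continue
            self.queues[queue_name].append(message)
            delivered.append(queue_name)
        return delivered


router = Router()
router.subscribe("search.video", "video")
router.subscribe("recommend.video_title", "video", fields=["title"])
router.publish({"table": "video", "id": 1, "changed_fields": ["duration"]})
router.publish({"table": "video", "id": 1, "changed_fields": ["title"]})
```

The producer publishes once and stays unaware of subscribers; adding a consumer means adding a rule, which is the sense in which the approach resembles AOP.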

HBase Read Performance: A MongoDB cache (WAL) with a TTL was introduced, and the SessionID column family was set to IN_MEMORY, to reduce read pressure on HBase.
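A read-through cache with a TTL might look like the sketch below. A dictionary stands in for MongoDB, which can expire documents on its own through a TTL index, and `load_from_store` is a hypothetical callback standing in for an HBase read. All names here are illustrative assumptions.

```python
import time


class TtlCache:
    """In-memory stand-in for a cache whose entries expire after a fixed TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock   # injectable so that expiry can be tested without sleeping
        self._entries = {}    # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop the entry and report a miss
            return None
        return value

    def put(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)


def read_through(key, cache, load_from_store):
    """Serve from the cache when possible; otherwise read the store and cache the result."""
    value = cache.get(key)
    if value is None:
        value = load_from_store(key)
        if value is not None:
            cache.put(key, value)
    return value


cache = TtlCache(ttl_seconds=60)
record = read_through("video:1", cache, lambda key: {"key": key})  # miss, loads from the store
again = read_through("video:1", cache, lambda key: {"key": key})   # hit, store not touched
```

The TTL bounds how long a stale entry can be served, so hot records are read from the cache while cold ones fall back to HBase.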

HBase Availability: A same‑city active‑standby HBase setup was designed, combining data synchronization, a MongoDB write‑ahead log, and a Synchronizer service that keeps the primary and standby clusters consistent.
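The source does not describe how the write-ahead log and the Synchronizer interact, so the following is only one plausible arrangement, with in-memory stand-ins for HBase and MongoDB. It assumes each write is logged before it reaches the primary and that the Synchronizer replays unsynced log entries to the standby in order. Every name in the sketch is hypothetical.

```python
class Cluster:
    """In-memory stand-in for one HBase cluster."""

    def __init__(self):
        self.rows = {}
        self.available = True

    def put(self, key, value):
        if not self.available:
            raise ConnectionError("cluster unavailable")
        self.rows[key] = value


class WriteAheadLog:
    """In-memory stand-in for the MongoDB write-ahead log."""

    def __init__(self):
        self.entries = []

    def append(self, key, value):
        entry = {"seq": len(self.entries), "key": key, "value": value, "synced": False}
        self.entries.append(entry)
        return entry

    def pending(self):
        return [e for e in self.entries if not e["synced"]]


def write(key, value, wal, primary):
    """Log the write first, then apply it to the primary cluster."""
    wal.append(key, value)
    primary.put(key, value)


def synchronize(wal, standby):
    """Replay unsynced entries to the standby in order; return how many were applied."""
    applied = 0
    for entry in wal.pending():
        try:
            standby.put(entry["key"], entry["value"])
        except ConnectionError:
            break  # stop at the first failure so that ordering is preserved on retry
        entry["synced"] = True
        applied += 1
    return applied


primary, standby, wal = Cluster(), Cluster(), WriteAheadLog()
standby.available = False         # standby is temporarily unreachable
write("video:1", {"title": "a"}, wal, primary)
synchronize(wal, standby)         # applies nothing yet
standby.available = True
synchronize(wal, standby)         # standby catches up from the log
```

Keeping the log outside both clusters means that an outage on the standby delays synchronization instead of losing writes.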

ActiveMQ Limitations: Replaced ActiveMQ with RocketMQ for better scalability, reliability, and server‑side filtering, deploying a three‑data‑center cluster.

Read Capacity Expansion: Added a business‑level slave read service that synchronizes data from the primary store, allowing horizontal scaling without traditional read‑write splitting.

Conclusion

The iQIYI Basic Data Platform continuously improves its technology and service solutions to address real‑world challenges, accumulating practical experience with HBase, RocketMQ, and high‑availability designs, and will keep exploring ways to enhance service capability, stability, and performance.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

High Availability Architecture
