How iQIYI Built a Scalable OLTP Data Center to Eliminate Data Silos
This article details iQIYI's design and implementation of a unified OLTP data center that consolidates data across business lines, solves data‑island issues, ensures strong consistency between MongoDB and Elasticsearch, and provides high‑availability, massive‑scale storage for billions of records.
Background
Micro‑service architectures create data islands because each service maintains its own datastore. iQIYI built a unified OLTP data center to centralize all writes, broadcast field‑change events, and provide a single source for analytics and business logic.
Design Goals
Eliminate data islands and provide a holistic view of content operations.
Avoid duplicated infrastructure across business lines.
Standardize interface specifications to simplify integration.
Reduce service coupling and development complexity.
Allow developers to concentrate on business features.
Architecture
All services write to the OLTP cluster. The cluster stores data in MongoDB and Elasticsearch, then emits field‑change events via RabbitMQ. Any service can act as a writer, a reader, or both.
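The write path described above can be sketched in a few lines. This is a minimal in-memory illustration under stated assumptions, not iQIYI's implementation: the dictionaries stand in for MongoDB and Elasticsearch, the list stands in for a RabbitMQ exchange, and the names `write_record` and `changed_fields` are hypothetical.

```python
from typing import Any, Dict, List

# In-memory stand-ins (assumptions for illustration only).
mongo_store: Dict[str, Dict[str, Any]] = {}   # plays the role of MongoDB
es_store: Dict[str, Dict[str, Any]] = {}      # plays the role of Elasticsearch
event_queue: List[Dict[str, Any]] = []        # plays the role of RabbitMQ


def changed_fields(old: Dict[str, Any], new: Dict[str, Any]) -> List[str]:
    """Return the sorted names of fields that were added, removed, or modified."""
    keys = set(old) | set(new)
    return sorted(
        k for k in keys
        if (k in old) != (k in new) or old.get(k) != new.get(k)
    )


def write_record(record_id: str, payload: Dict[str, Any]) -> List[str]:
    """Store the payload, mirror it to the search store, and emit a change event."""
    old = mongo_store.get(record_id, {})
    mongo_store[record_id] = dict(payload)
    es_store[record_id] = dict(payload)
    fields = changed_fields(old, payload)
    if fields:
        # Only publish when something actually changed.
        event_queue.append({"id": record_id, "changed": fields})
    return fields
```

A consumer that only cares about, say, a `status` field could then inspect the `changed` list of each event and ignore the rest.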
Consistency Model
Eventual consistency is achieved through field‑level ownership. The original payload is first saved to MongoDB. It is then transformed according to a predefined MongoDB‑to‑ES field mapping, and the result fully overwrites the corresponding Elasticsearch document, which prevents partial updates.
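The map-then-overwrite step can be illustrated as follows. This is a sketch only: the mapping contents, the field names, and the functions `to_es_document` and `overwrite_es` are hypothetical, and a plain dictionary stands in for the Elasticsearch index.

```python
from typing import Any, Dict

# Hypothetical MongoDB-to-ES mapping: source field name -> ES field name.
FIELD_MAPPING: Dict[str, str] = {
    "title": "title_text",
    "status": "status_keyword",
}


def to_es_document(mongo_doc: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Build the complete ES document from the MongoDB document.

    Only mapped fields are carried over; unmapped fields are dropped.
    """
    return {
        es_field: mongo_doc[src]
        for src, es_field in mapping.items()
        if src in mongo_doc
    }


def overwrite_es(es_index: Dict[str, Dict[str, Any]], doc_id: str,
                 mongo_doc: Dict[str, Any]) -> None:
    """Replace the whole ES document instead of merging into the old one."""
    es_index[doc_id] = to_es_document(mongo_doc, FIELD_MAPPING)
```

Because the ES document is always rebuilt from the MongoDB document, a stale field left over from an earlier write cannot survive the next write.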
High Availability
MongoDB is sharded across multiple data‑center locations to achieve active‑active resilience. A configurable dual‑write mechanism writes simultaneously to a standby cluster; a database‑switch flag enables automatic failover. An offline data warehouse reads from the backup cluster at configurable snapshot times (e.g., 09:00, 12:00, 18:00) for large‑scale reporting.
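The dual-write switch and the database-switch flag might look roughly like the sketch below. The class `DualWriteStore` and its attribute names are hypothetical, dictionaries stand in for the two clusters, and partial-failure handling (for example, one write succeeding and the other failing) is deliberately left out.

```python
from typing import Any, Dict, Optional


class DualWriteStore:
    """Writes to a primary store and, optionally, mirrors to a standby store."""

    def __init__(self, dual_write: bool = True) -> None:
        self.primary: Dict[str, Any] = {}
        self.standby: Dict[str, Any] = {}
        self.dual_write = dual_write   # configurable dual-write switch
        self.use_standby = False       # database-switch flag

    def write(self, key: str, value: Any) -> None:
        if self.use_standby:
            # After failover the standby is the only cluster we write to.
            self.standby[key] = value
            return
        self.primary[key] = value
        if self.dual_write:
            self.standby[key] = value

    def read(self, key: str) -> Optional[Any]:
        active = self.standby if self.use_standby else self.primary
        return active.get(key)

    def fail_over(self) -> None:
        """Flip the switch flag so reads and writes go to the standby."""
        self.use_standby = True
```

Since the standby already holds every mirrored write, flipping the flag is enough to keep serving reads in this simplified model.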
Massive Data Storage
DB Layer
Sharded MongoDB clusters provide horizontal scalability; each shard holds a collection limited only by storage capacity, supporting billions of rows.
ES Layer
Indexes are created with a fixed number of primary shards. To avoid oversized shards, an alias points to multiple real indexes. Writes are routed to the appropriate real index while reads use the alias, enabling seamless scaling without service disruption.
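The alias arrangement can be sketched as below. `alias_actions` builds the request body for Elasticsearch's `POST /_aliases` endpoint, which accepts a list of `add` actions. The alias and index names are hypothetical, and the routing rule in `write_index_for` (CRC32 of the document ID modulo the index count) is an assumption, since the article does not say how writes are routed.

```python
import zlib
from typing import Any, Dict, List

ALIAS = "content"                                        # hypothetical alias name
REAL_INDEXES = ["content_0", "content_1", "content_2"]   # hypothetical real indexes


def alias_actions(alias: str, indexes: List[str]) -> Dict[str, Any]:
    """Request body for POST /_aliases that attaches every real index to one alias."""
    return {"actions": [{"add": {"index": name, "alias": alias}} for name in indexes]}


def write_index_for(doc_id: str, indexes: List[str]) -> str:
    """Pick the real index for a write (assumed rule: CRC32 of the ID modulo index count)."""
    return indexes[zlib.crc32(doc_id.encode("utf-8")) % len(indexes)]
```

Readers query `ALIAS` and never need to know how many real indexes sit behind it. Note that a modulo rule remaps existing IDs when an index is added, so a production system would likely need a rule that stays stable as indexes grow, such as routing by ID range or creation time.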
Best Practices
Standardized Message Notification
The OLTP SDK wraps RabbitMQ, filters and deduplicates messages, and supports second‑level field filtering so that clients receive only relevant notifications.
Encapsulates RabbitMQ to reduce client complexity.
Moves filtering and deduplication to the SDK, easing server load.
Provides precise field‑level subscription.
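Client-side filtering and deduplication, as listed above, could be sketched like this. The class `FieldSubscriber` and the message shape (`message_id`, `id`, `changed`) are hypothetical and are not the actual SDK interface. A real client would also bound the set of seen IDs, for example with a time-to-live.

```python
from typing import Any, Callable, Dict, Iterable, List, Set


class FieldSubscriber:
    """Delivers a change message only once, and only for watched fields."""

    def __init__(self, watched_fields: Iterable[str],
                 handler: Callable[[str, List[str]], None]) -> None:
        self.watched: Set[str] = set(watched_fields)
        self.handler = handler
        self.seen_ids: Set[str] = set()   # unbounded here; a sketch only

    def on_message(self, message: Dict[str, Any]) -> bool:
        """Return True if the message was passed to the handler."""
        msg_id = message["message_id"]
        if msg_id in self.seen_ids:
            return False                  # duplicate delivery, drop it
        self.seen_ids.add(msg_id)
        relevant = [f for f in message["changed"] if f in self.watched]
        if not relevant:
            return False                  # no watched field changed
        self.handler(message["id"], relevant)
        return True
```

Keeping this logic in the client means the server can publish every change once and let each subscriber decide what is relevant.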
Technical Challenges
Optimistic lock granularity: Locks are scoped per business unit, preventing cross‑unit contention while preserving consistency.
Elasticsearch bulk write: Uses the Bulk API (https://www.elastic.co/guide/en/elasticsearch/reference/7.6/docs-bulk.html#docs-bulk) to raise write throughput by a factor of tens.
Deep pagination: Employs Search After for real‑time deep paging and the Scroll API for high‑performance batch retrieval.
Business‑level rate limiting: Implements distributed throttling per business unit to isolate traffic spikes.
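To illustrate the first item, per-business-unit optimistic locking can be modelled by keeping one version counter per (record, business unit) pair. The sketch below is an in-memory, single-threaded illustration with hypothetical names (`OptimisticStore`, `VersionConflict`). In a real deployment the compare-and-set would have to be performed atomically by the database.

```python
from typing import Any, Dict, Tuple


class VersionConflict(Exception):
    """Raised when the caller's expected version is stale."""


class OptimisticStore:
    def __init__(self) -> None:
        # (record_id, business_unit) -> (version, fields)
        self._data: Dict[Tuple[str, str], Tuple[int, Dict[str, Any]]] = {}

    def read(self, record_id: str, unit: str) -> Tuple[int, Dict[str, Any]]:
        """Return (version, fields); version 0 means nothing written yet."""
        return self._data.get((record_id, unit), (0, {}))

    def write(self, record_id: str, unit: str, expected_version: int,
              fields: Dict[str, Any]) -> int:
        """Write only if the stored version matches; return the new version."""
        current_version, _ = self.read(record_id, unit)
        if current_version != expected_version:
            raise VersionConflict(
                f"{record_id}/{unit}: expected {expected_version}, found {current_version}"
            )
        new_version = current_version + 1
        self._data[(record_id, unit)] = (new_version, dict(fields))
        return new_version
```

Because the version is tracked per unit, two business units updating the same record do not invalidate each other's writes, while two writers inside the same unit still detect a conflict.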
Results
Data islands eliminated; all content‑operation data is visible.
Business lines only push data to the OLTP center, removing duplicate development.
Consumers retrieve any required data via a unified API, simplifying integration.
Service coupling dramatically reduced as teams no longer need direct data exchange.
Production currently serves 26 business lines across four isolated clusters, handling read QPS >2000 and write QPS >500, with the largest table containing ~250 million rows. Ongoing work integrates the OLTP platform with an OLAP system to provide real‑time and offline analytical capabilities.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
