Inside Maoyan’s Near‑Real‑Time Transaction Data Center
This article details Maoyan's transaction data center: the background and motivation for a unified real-time order model, the benefits of reduced coupling and improved data accuracy, and the system's architecture, components, data collection and processing, task scheduling, monitoring, and future plans.
1. Background and Current Status of Maoyan Transaction Data
1.1 What is the Transaction Data Center?
The transaction data center collects all order‑related data generated during transactions, including order details, fund details, settlement details, payment details, product details, merchant and cinema information. It extends the order center by aggregating all related data into a wide order table for use by various business departments.
1.2 Why Build a Transaction Data Center?
1) Complex, non-unified order models and poor data timeliness. Existing big-data pipelines lack real-time capability, and each data source carries its own data model, making unified modeling difficult. The transaction data center obtains transaction data through RPC interfaces, standardizes the model, and improves timeliness.
2) Reduce coupling between business services and finance/reconciliation/settlement services. These downstream services currently depend on heterogeneous data models from multiple business systems, leading to high complexity and maintenance cost. The data center unifies data acquisition, performs strong validation, creates a comprehensive wide order table, and decouples downstream services.
1.3 Benefits of the Transaction Data Center
Unified order data model and near‑real‑time processing, eliminating delays and multi‑model inconsistencies.
Reduced coupling between finance, reconciliation, settlement and business services.
Data validation and anomaly monitoring ensure accuracy and completeness.
Provides convenient data support for various business and operations teams.
2. Problems and Goals
2.1 Problem Summary
Financial reporting requires precise data; historical data from big data pipelines may differ from actual transaction data, causing inaccurate reports.
The existing settlement system is fragmented across multiple business lines, lacking a unified rule set due to disparate data sources.
From the e‑commerce middle‑platform perspective, there is no complete order data collection and processing system; data is scattered across multiple business systems.
2.2 System Goals
Build a complete, accurate, near‑real‑time transaction data collection system that gathers order, fund, merchant, inventory, settlement, and payment information, stores it reliably, and provides multi‑dimensional monitoring with automatic compensation for missing data.
2.3 Challenges
The system faces five key challenges:
High performance: collect data quickly under massive transaction concurrency without affecting user-facing transactions.
High extensibility: accommodate new business data with minimal cost.
Near‑real‑time: process collected data quickly.
Integrity: guarantee correctness and prevent data loss.
Data volume: ensure storage can handle large scales.
3. Technical Design and Implementation
3.1 System Architecture
The overall architecture comprises the following major components:
Business systems: the upstream business parties from which the data center collects data, chiefly the middle-platform trading, fund flow, split-settlement, and payment platforms.
Data processing: split into the Receiver (data collection) and the Processor (data transformation and storage).
Data storage uses MySQL for raw and processed metadata and Elasticsearch for search.
3.2 Functional Architecture
Key functional modules include Receiver, CheckerEngine, Processor, GeneratorEngine, Argos (monitoring/alert), Crane (task scheduling), and MySQL & Eagle (storage).
3.2.1 Functional Component Overview
Receiver: collects, parses, validates, and stores business data, ensuring idempotency.
Processor: parses, validates, assembles, and stores task data, monitors failures, and triggers alerts.
Checker Engine: validates data fields during ingestion and ensures task readiness.
Generator Engine: assembles metadata into complete order records during task execution.
Argos: monitors anomalies such as data loss, processing failures, and triggers alerts.
Fuxi system: provides workflow orchestration for multi‑step order processes with retry mechanisms.
Crane: high‑availability, sharded distributed task scheduler supporting multiple job types.
3.3 Data Collection
3.3.1 Data Validation
During collection, each field is strictly checked before being stored in the metadata table.
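As a minimal sketch of this collection-time checking (the field names and rules below are illustrative assumptions, not Maoyan's actual schema), a validator might look like:

```java
import java.util.Map;

// Collection-time validation sketch; field names and rules are illustrative
// assumptions, not Maoyan's actual schema.
public final class MetadataValidator {

    /** Returns an error message, or null when every checked field is valid. */
    public static String validate(Map<String, Object> metadata) {
        Object orderId = metadata.get("orderId");
        if (!(orderId instanceof Long) || (Long) orderId <= 0) {
            return "orderId must be a positive long";
        }
        Object operationType = metadata.get("operationType");
        if (!(operationType instanceof String) || ((String) operationType).isEmpty()) {
            return "operationType must be a non-empty string";
        }
        Object amount = metadata.get("amount");
        if (!(amount instanceof Long) || (Long) amount < 0) {
            return "amount (in cents) must be a non-negative long";
        }
        return null; // all checks passed; safe to write to the metadata table
    }
}
```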
3.3.2 Idempotency
Idempotent checks prevent duplicate ingestion of the same transaction metadata.
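One common way to implement this, sketched below under the assumption of a MySQL unique key over the business identifiers, is to let the database reject duplicates and treat the violation as a no-op:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;

public class MetadataDao {
    // Assumes a UNIQUE KEY (order_id, operation_type, unique_id) on the
    // metadata table; table and column names are illustrative.
    public boolean insertIfAbsent(Connection conn, long orderId,
                                  String operationType, String uniqueId,
                                  String payload) throws SQLException {
        String sql = "INSERT INTO order_metadata "
                   + "(order_id, operation_type, unique_id, payload) VALUES (?, ?, ?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, orderId);
            ps.setString(2, operationType);
            ps.setString(3, uniqueId);
            ps.setString(4, payload);
            ps.executeUpdate();
            return true;                       // first delivery: stored
        } catch (SQLIntegrityConstraintViolationException dup) {
            return false;                      // duplicate delivery: ignored
        }
    }
}
```

Application-level pre-checks can still short-circuit obvious duplicates, but the unique key is what keeps ingestion safe under concurrent delivery.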
3.3.3 Task Generation Rules
Based on collected metadata, tasks are generated for various order states (e.g., pending payment, paid, refund). Tasks are linked to metadata via orderId, operationType, and a unique identifier.
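A minimal sketch of this linkage, with an assumed Task shape (the article does not spell out the real schema):

```java
// Illustrative task-generation sketch; the Task fields mirror the linkage
// described above (orderId, operationType, unique identifier), but the
// exact schema is an assumption.
public record Task(long orderId, String operationType, String uniqueId, String status) {

    /** Derive a pending task from a freshly stored piece of metadata. */
    static Task fromMetadata(long orderId, String operationType, String uniqueId) {
        // e.g. operationType = "SUBMIT", "PAY", "REFUND" for the order states above
        return new Task(orderId, operationType, uniqueId, "PENDING");
    }
}
```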
3.3.4 Preventing Data Loss and Write Failures
Data loss: compare with external systems to ensure all orders are captured; missing data triggers alerts.
Write failures: capture failure reasons, monitor, and alert to avoid internal loss.
3.3.5 Data Correction or Re‑write
If business data is erroneous, a new task is created to overwrite the old data with corrected information.
3.4 Data Processing
Processing includes parsing, assembling, and storing data into real‑time and snapshot tables, then indexing into Elasticsearch.
Processing steps (see the interface sketch after this list):
1) Receive business receipts, validate, generate task and metadata.
2) Crane schedules tasks, submits to Fuxi.
3) Fuxi parses and assembles metadata, updates real‑time tables.
4) Snapshots are created for historical reference.
5) Data is indexed into Elasticsearch and task status set to completed.
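To make the five steps above concrete, here is a sketch of the interface each step might expose; all names are illustrative assumptions rather than Maoyan's actual classes:

```java
// Pipeline sketch mirroring the five steps above; each interface stands in
// for one stage, and all names are illustrative assumptions.
interface Receiver  { long acceptAndStore(String receiptJson); }      // step 1: validate, create task + metadata
interface Scheduler { void submit(long taskId); }                     // step 2: Crane schedules and submits
interface Workflow  { void assembleAndUpdateRealtime(long taskId); }  // step 3: Fuxi parses, assembles, updates
interface Snapshots { void snapshot(long taskId); }                   // step 4: persist a historical snapshot
interface Indexer   { void indexAndComplete(long taskId); }           // step 5: index into ES, mark task done
```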
3.4.1 Real‑time Table vs Snapshot Table
Real‑time table holds the latest state per order; snapshot table stores historical states per operation type for up to three months.
3.4.2 High‑Concurrency Solution
Optimistic locking (version field) and a state‑machine model ensure ordered, consistent updates under high concurrency.
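A typical optimistic-lock update of this kind, sketched with JDBC and assumed table/column names, only succeeds when the row still carries the version the writer read:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class RealtimeOrderDao {
    // Optimistic-lock update sketch: the UPDATE only succeeds when the row
    // still carries the version we read, so concurrent writers cannot
    // silently overwrite each other. Table and column names are assumptions.
    public boolean updateWithVersion(Connection conn, long orderId,
                                     String newState, int expectedVersion)
            throws SQLException {
        String sql = "UPDATE realtime_order "
                   + "SET state = ?, version = version + 1 "
                   + "WHERE order_id = ? AND version = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, newState);
            ps.setLong(2, orderId);
            ps.setInt(3, expectedVersion);
            return ps.executeUpdate() == 1; // 0 rows => lost the race
        }
    }
}
```

When zero rows are updated, the writer lost the race; it re-reads the current version and replays its change through the state-machine checks described below.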
3.4.3 Complex Multi‑Condition Queries
Because order data has >100 fields, Elasticsearch is used for efficient multi‑dimensional queries, reducing DB load.
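For illustration, such a multi-condition query could be assembled with the Elasticsearch high-level REST client as below; the index and field names are assumptions:

```java
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class OrderSearch {
    // Multi-condition query sketch using the Elasticsearch high-level REST
    // client; index and field names are illustrative assumptions.
    public SearchRequest buildRequest(long merchantId, String state,
                                      long fromMillis, long toMillis) {
        BoolQueryBuilder query = QueryBuilders.boolQuery()
                .filter(QueryBuilders.termQuery("merchantId", merchantId))
                .filter(QueryBuilders.termQuery("state", state))
                .filter(QueryBuilders.rangeQuery("createTime").gte(fromMillis).lte(toMillis));
        SearchSourceBuilder source = new SearchSourceBuilder()
                .query(query)
                .from(0)
                .size(50);
        return new SearchRequest("order_index").source(source);
    }
}
```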
3.4.4 Task Failure Handling
Failed tasks are retried automatically; if maximum retries are reached, failures are logged, aggregated, and alerted.
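A minimal retry loop capturing this behavior (the retry cap and the exhaustion callback are assumptions):

```java
// Retry sketch: execute a task, retry on failure up to a cap, then hand the
// final failure to a callback for logging, aggregation, and alerting.
// MAX_RETRIES and the callback wiring are assumptions.
public class TaskRetryRunner {
    private static final int MAX_RETRIES = 3;

    public void run(Runnable task, Runnable onExhausted) {
        for (int attempt = 1; attempt <= MAX_RETRIES; attempt++) {
            try {
                task.run();
                return;                // success: stop retrying
            } catch (RuntimeException e) {
                if (attempt == MAX_RETRIES) {
                    onExhausted.run(); // log + aggregate + alert
                }
            }
        }
    }
}
```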
3.4.5 Task Recovery Mechanism
When business data is later supplied for a previously failed task, the system resets and re‑executes the task.
3.5 State‑Machine Model
A state‑machine ensures tasks execute in the correct order, preventing data inconsistency and loss.
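A compact way to encode such a model is an enum whose transition table rejects out-of-order updates; the state names below are illustrative, not Maoyan's actual states:

```java
import java.util.Map;
import java.util.Set;

// State-machine sketch: each state lists the states it may legally move to,
// so an out-of-order task is rejected instead of corrupting order data.
// The state names are illustrative assumptions.
public enum OrderState {
    PENDING_PAYMENT, PAID, REFUNDING, REFUNDED, CANCELLED;

    private static final Map<OrderState, Set<OrderState>> TRANSITIONS = Map.of(
            PENDING_PAYMENT, Set.of(PAID, CANCELLED),
            PAID,            Set.of(REFUNDING),
            REFUNDING,       Set.of(REFUNDED),
            REFUNDED,        Set.of(),
            CANCELLED,       Set.of());

    public boolean canTransitionTo(OrderState next) {
        return TRANSITIONS.get(this).contains(next);
    }
}
```

A task whose transition is not yet legal can be deferred until its predecessor completes, which is what prevents inconsistency when tasks arrive out of order.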
3.6 Business Data Overview
Each order operation type follows three processes: metadata validation, task submission validation, and task execution (storage in MySQL and ES).
3.7 Database Table Design
Tables include Task Table, Metadata Table, Real‑time Table, and Snapshot Table, each serving specific stages of data lifecycle.
3.8 Crane Task Scheduling Design
3.8.1 Design Background
Near‑real‑time data collection requires fast task fetching and submission; Crane provides sharded, parallel scanning across many tables to accelerate processing.
3.8.2 Design Principle
Crane distributes tasks across 40 machines with configurable shards; each machine fetches tasks from its assigned shards, improving throughput while balancing DB and Fuxi load.
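One plausible assignment scheme is a simple modulo mapping, sketched below; the modulo strategy itself is an assumption, since the article only states that shards are distributed across machines:

```java
import java.util.ArrayList;
import java.util.List;

// Shard-assignment sketch: with N machines and S shards, machine i scans the
// shards whose id is congruent to i modulo N. The scheme is an assumption.
public class ShardAssigner {
    public static List<Integer> shardsFor(int machineIndex, int machineCount, int shardCount) {
        List<Integer> shards = new ArrayList<>();
        for (int shard = 0; shard < shardCount; shard++) {
            if (shard % machineCount == machineIndex) {
                shards.add(shard);
            }
        }
        return shards;
    }
}
```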
3.8.3 Task Scheduling Process
Tasks are fetched from the database, submitted to the Fuxi workflow engine, which orchestrates execution.
3.8.4 Idempotent Task Submission
Distributed locks based on operationType and orderId prevent duplicate submissions; locks expire after 60 minutes if not released.
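Sketched with the Jedis client, such a lock can be taken atomically with Redis SET NX EX; the key format is an assumption, while the 60-minute expiry follows the article:

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

// Distributed-lock sketch using Redis SET NX EX via the Jedis client. The
// key combines operationType and orderId as described above; the key format
// is an assumption, and the 3600-second expiry matches the article.
public class SubmitLock {
    private final Jedis jedis;

    public SubmitLock(Jedis jedis) {
        this.jedis = jedis;
    }

    /** Returns true if this node won the lock and may submit the task. */
    public boolean tryLock(String operationType, long orderId) {
        String key = "task:lock:" + operationType + ":" + orderId;
        String result = jedis.set(key, "1", SetParams.setParams().nx().ex(3600));
        return "OK".equals(result);
    }

    public void unlock(String operationType, long orderId) {
        jedis.del("task:lock:" + operationType + ":" + orderId);
    }
}
```

The holder releases the lock after submitting; the expiry guards against a holder that crashes without releasing, at the cost of a delayed retry.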
3.8.5 Shard Count Considerations
Increasing shards improves parallelism but raises DB and Fuxi load; optimal shard count balances performance and resource constraints.
3.9 Engine Implementation
3.9.1 Checker Engine
Uses strategy pattern to apply different validation rules (e.g., SubmitChecker, CancelChecker) for each order stage.
3.9.2 Generator Engine
During task execution, retrieves metadata, assembles it into a complete order record, also using strategy pattern (e.g., SubmitGenerator, PayGenerator).
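Both engines can be sketched with the same registry-plus-strategy shape; the class names follow the article, while the interfaces and method signatures are assumptions:

```java
import java.util.Map;

// Strategy-pattern sketch for the Checker Engine: one strategy per
// operation type, looked up from a registry. Class names follow the
// article; interfaces and signatures are assumptions.
interface Checker {
    void check(Map<String, Object> metadata); // throws on invalid data
}

class SubmitChecker implements Checker {
    public void check(Map<String, Object> metadata) { /* submit-stage rules */ }
}

class CancelChecker implements Checker {
    public void check(Map<String, Object> metadata) { /* cancel-stage rules */ }
}

class CheckerEngine {
    private final Map<String, Checker> registry = Map.of(
            "SUBMIT", new SubmitChecker(),
            "CANCEL", new CancelChecker());

    void check(String operationType, Map<String, Object> metadata) {
        Checker checker = registry.get(operationType);
        if (checker == null) {
            throw new IllegalArgumentException("no checker for " + operationType);
        }
        checker.check(metadata);
    }
}
```

The Generator Engine mirrors this structure, with strategies such as SubmitGenerator and PayGenerator assembling metadata into the full order record instead of validating it.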
3.10 Monitoring and Alert Mechanism
Multiple monitoring rules detect data anomalies, task failures, illegal fields, and inconsistencies, triggering alerts and daily email summaries for rapid resolution.
4. Future Plans
4.1 Technical Upgrades
Minimize business dependencies, achieve core data self‑collection.
Address higher performance demands for greater QPS and larger data volumes.
Enable zero‑code business onboarding via configuration.
Enhance automated data repair to reduce manual intervention.
4.2 Business Expansion
Continue integrating remaining business lines, aiming to collect all Maoyan transaction data in a single platform.
4.3 Data Application Plans
Support unified reconciliation, settlement, and EBS services.
Provide visual dashboards and order query capabilities.
Supply data to big‑data, risk control, and other teams.
Power real‑time data dashboards for the transaction middle‑platform.