How to Build a Scalable Payment Reconciliation System for Tens of Millions of Daily Transactions

This article explains the design of a high‑throughput payment reconciliation system that handles daily transaction volumes in the tens of millions, covering architecture, data collection, big‑data processing with Hive and Spark, handling of suspicious records, and integration with a data platform.

ITFLY8 Architecture Home

Payment Reconciliation Overview

Payment reconciliation is the process of matching payment records generated by a merchant with those provided by a third‑party payment channel to ensure consistency and detect anomalies.
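In essence, matching joins the two record sets on the order number and the amount. A minimal pure-Python sketch of that matching pass (field names such as biz_order_no and trade_amount follow the tables described later; the production system performs this at scale in Hive):

```python
# Illustrative sketch only: match local and channel records by order number
# and amount; anything that fails to pair up becomes suspicious data.
def reconcile(local_records, channel_records):
    channel_by_order = {r["biz_order_no"]: r for r in channel_records}
    matched, suspicious = [], []
    for rec in local_records:
        peer = channel_by_order.pop(rec["biz_order_no"], None)
        if peer is not None and peer["trade_amount"] == rec["trade_amount"]:
            matched.append((rec, peer))
        else:
            suspicious.append(rec)  # missing counterpart or amount mismatch
    suspicious.extend(channel_by_order.values())  # channel-only records
    return matched, suspicious
```

The dictionary lookup makes the pass linear in the number of records, which is the same idea the Hive joins express declaratively later in the pipeline.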

Why Reconciliation Is Needed

It serves as the final safety net for payment systems, allowing early detection of mismatched or missing transactions caused by network issues, system bugs, or day‑cut discrepancies.

System Architecture

The system consists of two main modules: the Reconciliation Module (responsible for pulling files, parsing data, matching records, and aggregating results) and the Error Module (handling unmatched records and generating error orders for manual review).

Challenges at Ten‑Million‑Level Data

Query efficiency degrades as daily data grows, leading to long reconciliation times.

Memory consumption can cause OOM errors when loading both local and remote data into memory.

Single‑machine processing limits scalability and wastes resources.

Solution Using Big Data Technologies

Traditional OLTP databases like MySQL are unsuitable for massive, batch‑oriented reconciliation tasks. Instead, the system leverages OLAP platforms such as Hive and Spark SQL to process large datasets efficiently.

Data Platform (DP) Integration

The DP provides offline task scheduling, data synchronization between MySQL and Hive, and supports SQL‑based data cleaning, transformation, and aggregation.

System Workflow Overview

The workflow is driven by a state machine, with each stage triggered by scheduled tasks. Key stages include initialization, data collection, suspicious data handling, DP import, DP verification, second‑level suspicious processing, data summarization, and error pushing.
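The stages above can be sketched as a simple linear state machine. The stage names below are hypothetical (the source does not give the actual state values); each scheduled task advances its reconciliation record to the next stage on success:

```python
from enum import Enum

# Hypothetical stage names mirroring the workflow described in the article;
# the real system's state values are not given in the source.
class Stage(Enum):
    INIT = 1
    COLLECT = 2
    SUSPICIOUS = 3
    DP_IMPORT = 4
    DP_VERIFY = 5
    SECOND_SUSPICIOUS = 6
    SUMMARIZE = 7
    PUSH_ERRORS = 8

# Each stage transitions only to its successor.
TRANSITIONS = {s: Stage(s.value + 1) for s in Stage if s.value < len(Stage)}

def advance(task_stage: Stage) -> Stage:
    """Move a task to its next stage; scheduled jobs call this on success."""
    return TRANSITIONS[task_stage]
```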

1. Initialization

Each night, a reconciliation task record is created for each active rule (keyed by channel_code, biz_type, and status). The batch number is generated as bill_date + channel_code + "001". Each task then progresses through phases such as "Initialization", "Data Collection", "Suspicious Handling", and "Data Verification".
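The batch-number rule described above can be sketched as follows (the date format and zero-padding are assumptions; the source only states the concatenation order):

```python
from datetime import date

def make_batch_no(bill_date: date, channel_code: str, seq: int = 1) -> str:
    # Batch number = bill_date + channel_code + zero-padded sequence ("001").
    # YYYYMMDD formatting and three-digit padding are assumed here.
    return f"{bill_date:%Y%m%d}{channel_code}{seq:03d}"
```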

2. Data Collection

Data is gathered from two sources:

Local data: payment records from the merchant's own databases, captured via binlog listeners and stored in a table with fields such as channel_code, biz_order_no, bill_date, status, is_check, trade_amount, channel_order_no, merchant_no, and sub_merchant_no.

Remote data: third‑party channel files downloaded via FTP/SFTP/HTTP, parsed, and stored in a similar table with an additional channel_fee field.
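Parsing a downloaded channel file might look like the sketch below. The CSV layout, field order, and cent-denominated amounts are assumptions; real channel bills vary by provider in delimiter, encoding, and schema:

```python
import csv
import io

# Assumed column order for a downloaded channel bill file; real channel
# files differ per provider and must be mapped per channel.
FIELDS = ["channel_order_no", "biz_order_no", "bill_date",
          "trade_amount", "channel_fee", "status"]

def parse_channel_bill(text: str):
    """Parse a channel bill file into rows matching the remote-data table."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=FIELDS)
    rows = []
    for row in reader:
        row["trade_amount"] = int(row["trade_amount"])  # assume cents
        row["channel_fee"] = int(row["channel_fee"])
        rows.append(row)
    return rows
```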

3. Suspicious Data Handling

Records that cannot be matched (e.g., day‑cut mismatches, amount differences, missing counterpart) are stored in a "suspicious" table with fields such as batch_no, biz_id, biz_amount, status, biz_date, biz_type, channel_code, delayed_times, buffer_type (0 for local, 1 for channel).
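Shaping an unmatched record into a row for that suspicious table could look like this sketch (the status encoding is an assumption; the buffer_type values come from the table description above):

```python
LOCAL, CHANNEL = 0, 1  # buffer_type values from the table definition

def to_suspicious(record: dict, batch_no: str, buffer_type: int) -> dict:
    """Shape an unmatched record into a row for the suspicious table."""
    return {
        "batch_no": batch_no,
        "biz_id": record["biz_order_no"],
        "biz_amount": record["trade_amount"],
        "status": 0,  # 0 = pending (assumed encoding, not from the source)
        "biz_date": record["bill_date"],
        "biz_type": record.get("biz_type"),
        "channel_code": record["channel_code"],
        "delayed_times": 0,  # incremented on each re-check
        "buffer_type": buffer_type,
    }
```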

4. DP Import

After local and suspicious data are ready, a scheduled DP job imports the MySQL tables into Hive for batch verification.

5. DP Verification

Two Hive jobs run:

Successful match: an inner join on biz_order_no, trade_amount, and channel_code creates dp.pay_check_success.

CREATE TABLE IF NOT EXISTS dp.pay_check_success (...);

INSERT OVERWRITE TABLE dp.pay_check_success
SELECT ...
FROM dp.pay_check_record tb1
INNER JOIN dp.pay_check_channel_record tb2 ON ...
WHERE tb1.is_check = 0 AND tb2.is_check = 0 ...;

Suspicious match: left joins find records present on only one side and insert them into dp.check_dp_buffer_record with an order_type flag (0 for local, 1 for channel).

CREATE TABLE IF NOT EXISTS dp.check_dp_buffer_record (...);

INSERT OVERWRITE TABLE dp.check_dp_buffer_record
SELECT ...
FROM ods.pay_check_record tb1
LEFT JOIN ods.pay_check_channel_record tb2 ON ...
WHERE tb2.biz_order_no IS NULL
UNION
SELECT ...
FROM ods.pay_check_channel_record tb1
LEFT JOIN ods.pay_check_record tb2 ON ...
WHERE tb2.biz_order_no IS NULL;

6. Second‑Level Suspicious Processing

The buffer table is scanned; for each entry the system looks for a counterpart of opposite order_type. If a matching record with a different amount is found, it is moved to the error table; otherwise it remains as genuine suspicious data.
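That counterpart search over the buffer table can be sketched in pure Python (field names follow the suspicious-table description; the equal-amount case is assumed to have been consumed by the first-pass match):

```python
def resolve_buffer(buffer_rows):
    """Pair each buffer row with a counterpart of the opposite order_type.

    Same order number with a different amount -> error record;
    no counterpart at all -> remains genuine suspicious data.
    """
    by_key = {}
    errors, still_suspicious = [], []
    for row in buffer_rows:
        opposite = (row["biz_id"], 1 - row["order_type"])
        peer = by_key.pop(opposite, None)
        if peer is None:
            by_key[(row["biz_id"], row["order_type"])] = row
        elif peer["biz_amount"] != row["biz_amount"]:
            errors.append((peer, row))
        # Equal amounts would already have matched in the first pass,
        # so a pair that reaches here with equal amounts is treated as resolved.
    still_suspicious.extend(by_key.values())
    return errors, still_suspicious
```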

7. Data Summarization

Spark jobs aggregate counts of successful, suspicious, and error records for daily dashboards.
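The aggregation itself is a straightforward group-and-count. The sketch below is a pure-Python stand-in for the Spark job (the `result` field name is an assumption for illustration):

```python
from collections import Counter

def summarize_by_channel(rows):
    """Count records per (channel_code, result) for the daily dashboard.

    A pure-Python stand-in for the Spark aggregation the system runs;
    the 'result' field (success / suspicious / error) is assumed here.
    """
    return dict(Counter((r["channel_code"], r["result"]) for r in rows))
```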

8. Error Data Push

Unresolved error records are sent via NSQ messages to a dedicated error‑handling system.
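A sketch of that push, assuming publication through nsqd's HTTP endpoint (`POST /pub?topic=...`); the message payload fields are illustrative, since the real message schema is not given in the source:

```python
import json
import urllib.request

def build_error_message(row: dict) -> bytes:
    """Serialize an error record for the downstream error-handling system.

    The payload fields are illustrative; the real schema is not given.
    """
    payload = {
        "batch_no": row["batch_no"],
        "biz_id": row["biz_id"],
        "biz_amount": row["biz_amount"],
        "channel_code": row["channel_code"],
    }
    return json.dumps(payload, sort_keys=True).encode("utf-8")

def push_error(nsqd_http_addr: str, topic: str, row: dict) -> None:
    # nsqd exposes an HTTP publish endpoint: POST /pub?topic=<topic>.
    url = f"http://{nsqd_http_addr}/pub?topic={topic}"
    req = urllib.request.Request(url, data=build_error_message(row),
                                 method="POST")
    urllib.request.urlopen(req)
```

In production a native NSQ client with retries would be preferable to raw HTTP calls, but the publish contract is the same.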

Conclusion

Building a ten‑million‑level payment reconciliation system requires a robust big‑data platform, careful separation of batch and real‑time processing, and systematic handling of mismatches. Integrating the reconciliation service with a data platform like DP enables scalable, reliable, and maintainable operations.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

ITFLY8 Architecture Home

ITFLY8 Architecture Home - focused on architecture knowledge sharing and exchange, covering project management and product design. Includes large-scale distributed website architecture (high performance, high availability, caching, message queues...), design patterns, architecture patterns, big data, project management (SCRUM, PMP, Prince2), product design, and more.
