
How a Hybrid Data Warehouse Transformed Banking Data Services

This article details the 2015 hybrid data‑warehouse design implemented at Guangdong Huaxing Bank, explaining its real‑time, historical, and archival layers, the data‑bus concept, and how mixing in‑memory, relational, and Hadoop technologies addressed modern banking data‑volume, latency, and unstructured‑data challenges.

dbaplus Community

Jun 18, 2020


1. Data Application Development Trends

Traditional data‑warehouse projects in banks use systems such as Teradata or Greenplum to consolidate transaction data for reporting and analysis. The rise of internet finance, however, has flooded banks with massive volumes of structured and unstructured data, and it demands faster, more flexible data services.

Huaxing Bank therefore proposed a hybrid data‑warehouse architecture that expands the warehouse into a unified data‑service center. It integrates in‑memory databases, relational databases, and Hadoop to meet requirements for low cost, security, agility, and automation.

2. Data Warehouse Capability Design

The hybrid warehouse must support four core capabilities:

Real‑time data sharing: expose up‑to‑date asset‑liability views to channels such as online and mobile banking via APIs.

Batch data acquisition: provide daily bulk data files for reporting systems.

Historical data query: allow access to static data and archived records since the bank’s inception.

Unstructured data handling: store and query files, images, video, and audio.

Additional design requirements include:

Time range: real‑time data for the current day (T day) plus historical data going back up to three years.

Performance: millisecond‑level response for real‑time queries and minute‑level response for historical analysis.

Data types: both structured and unstructured data.

Access modes: both real‑time message queries and batch file queries.

Data standards: strict adherence, to keep data consistent.

3. Overall Architecture of the Hybrid Data Warehouse

The architecture consists of four modules: Real‑time Data Warehouse, Historical Data Warehouse, Archive Data Warehouse, and a Data Bus that provides a unified query interface for all modules.

[Figure: Hybrid Data Warehouse Overall Architecture]

4. Detailed Design and Implementation

4.1 Real‑time Data Warehouse

The real‑time warehouse uses Redis (a key‑value in‑memory database) to hold transaction details and lightweight aggregates, with MySQL used for end‑of‑day reconciliation. Redis runs in a master‑slave cluster with AOF (append‑only file) persistence for crash recovery. The application layer provides data‑service pools, transaction handling, batch processing, SQL adapters, and an admin console.

Key‑value model: each relational row becomes a Redis key of the form Detail:Table:PK1||PK2, whose value is a hash of column‑value pairs. Indexes are stored as separate keys: string fields use keys of the form Index:Table:PK:Field:Value, and numeric fields use sorted sets.
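The key‑value model above can be sketched in Python. This is only an illustration under stated assumptions, not the bank's implementation: the article gives the key patterns but not their exact semantics, so the sketch reads "PK" in the index key as a literal segment indicating that the index members are primary keys, and the sorted‑set key Index:Table:PK:Field for numeric fields is hypothetical. TinyStore is an in‑memory stand‑in, so the example runs without a Redis server; with a real client the three structures would correspond to the Redis commands HSET, SADD, and ZADD.

```python
from bisect import insort


def detail_key(table, pk_values):
    """Build the row key, e.g. Detail:Account:001||CNY."""
    return f"Detail:{table}:" + "||".join(str(v) for v in pk_values)


def string_index_key(table, field, value):
    """Index key for a string field, read here as Index:Table:PK:Field:Value."""
    return f"Index:{table}:PK:{field}:{value}"


def numeric_index_key(table, field):
    """Hypothetical key for the sorted set that indexes a numeric field."""
    return f"Index:{table}:PK:{field}"


class TinyStore:
    """In-memory stand-in for the Redis structures (assumption, not Redis)."""

    def __init__(self):
        self.hashes = {}  # key -> dict of column/value pairs (like HSET)
        self.sets = {}    # key -> set of primary keys (like SADD)
        self.zsets = {}   # key -> sorted list of (score, pk) (like ZADD)

    def put_row(self, table, pk_fields, row, numeric_fields=()):
        """Store one row and index its non-key columns.

        Inserts only; updating an existing row is not handled here.
        """
        pk_values = [row[f] for f in pk_fields]
        pk = "||".join(str(v) for v in pk_values)
        self.hashes[detail_key(table, pk_values)] = dict(row)
        for field, value in row.items():
            if field in pk_fields:
                continue
            if field in numeric_fields:
                zkey = numeric_index_key(table, field)
                insort(self.zsets.setdefault(zkey, []), (value, pk))
            else:
                skey = string_index_key(table, field, value)
                self.sets.setdefault(skey, set()).add(pk)

    def range_query(self, table, field, low, high):
        """Primary keys whose numeric field lies in [low, high]."""
        entries = self.zsets.get(numeric_index_key(table, field), [])
        return [pk for score, pk in entries if low <= score <= high]


store = TinyStore()
store.put_row(
    "Account",
    ["acct_no", "ccy"],
    {"acct_no": "001", "ccy": "CNY", "branch": "GZ", "balance": 1200.5},
    numeric_fields=("balance",),
)
print(sorted(store.hashes))
print(store.range_query("Account", "balance", 1000, 2000))
```

Keeping indexes as separate keys lets a lookup by a non‑key column resolve to primary keys first and then fetch the row hashes, which is roughly what a SQL adapter over a key‑value store has to do.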

4.2 Historical Data Warehouse

The historical warehouse is built on traditional warehouse technology (e.g., Oracle RAC). It follows a four‑layer model; because data standards are already applied in the source systems, the warehouse receives data in standardized form. ETL processes are scheduled centrally, and the warehouse stores the standardized data for multi‑year analysis.

4.3 Archive Data Warehouse

The archive warehouse stores data older than three years and all unstructured assets (documents, audio, video). It is implemented on a Hadoop cluster, which provides storage, file handling, and distributed query capabilities.
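The split between the three warehouses can be expressed as a small classification rule. The sketch below is an assumption‑laden illustration: the function names, the tier labels, and the exact boundary handling (current‑day data is real‑time, anything within the last three years is historical, everything older or unstructured goes to the archive) are hypothetical and only follow the ranges stated in this article.

```python
from datetime import date


def years_ago(day, years):
    """Return the same calendar day `years` earlier, clamping Feb 29 to Feb 28."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year.
        return day.replace(year=day.year - years, day=28)


def storage_tier(record_date, today, structured=True):
    """Pick the warehouse module for a record (hypothetical rule)."""
    if not structured:
        return "archive"      # documents, audio, video
    if record_date >= today:
        return "realtime"     # T-day data
    if record_date >= years_ago(today, 3):
        return "historical"   # up to three years back
    return "archive"          # older than three years


today = date(2016, 6, 18)
for d in (date(2016, 6, 18), date(2015, 1, 1), date(2012, 1, 1)):
    print(d, storage_tier(d, today))
```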

4.4 Data Bus

The Data Bus acts like an Enterprise Service Bus, exposing unified APIs for real‑time, historical, and archive queries. It comprises modules for the external service interface, security, service control, messaging, data access, traffic control, and common configuration.
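A minimal sketch of the unified‑interface idea follows. It assumes nothing about the real Data Bus beyond what is stated here: the DataBus class, its register and query methods, and the backend names are all hypothetical, and the security, messaging, and traffic‑control modules are left out.

```python
class DataBus:
    """Toy facade that routes a request to one registered backend."""

    def __init__(self):
        self._backends = {}  # backend name -> callable(request) -> result

    def register(self, name, handler):
        """Attach a backend (real-time, historical, or archive) under a name."""
        self._backends[name] = handler

    def query(self, target, request):
        """Send the request to the named backend and wrap the result."""
        if target not in self._backends:
            raise KeyError(f"no backend registered for {target!r}")
        return {"source": target, "data": self._backends[target](request)}


bus = DataBus()
# The handlers below are placeholders standing in for the three warehouses.
bus.register("realtime", lambda req: {"customer": req["customer"], "balance": 100})
bus.register("historical", lambda req: [])
bus.register("archive", lambda req: [])

print(bus.query("realtime", {"customer": "C001"}))
```

Callers see one entry point and one response shape regardless of which warehouse answers, which is the property the article attributes to the Data Bus.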

5. Applications of the Hybrid Data Warehouse

Implemented in 2015‑2016, the hybrid architecture delivered a real‑time data bus, a real‑time warehouse, and a historical warehouse. Converting data to the standard format in the source systems simplified downstream development, reduced the impact of source‑system schema changes, and lowered project complexity.

In mobile banking, the real‑time warehouse powers an instant full‑asset view for customers, enabling rapid balance updates without overloading core systems.

The hybrid approach expands data‑warehouse services beyond post‑transaction reporting to support pre‑ and in‑transaction data needs, thereby increasing the overall value of banking data.
