How SelectDB Powers a Next‑Gen Real‑Time Data Warehouse for Banks
The article analyses the structural mismatch between banks' demand for instant data‑driven decisions and their legacy, component‑heavy architectures, then details how SelectDB’s high‑performance, unified MPP engine, schema‑change agility, and financial‑grade security enable a real‑time data‑warehouse platform that supports diverse banking scenarios from dashboards to fraud detection.
Introduction
Banking is at a pivotal stage of moving from basic information systems to deep digitalization. Precise marketing, real‑time risk control, and agile decision‑making require a data foundation that can deliver sub‑second response times. The speaker, solution architect Wang Yujia, presents a construction path for a next‑generation real‑time data warehouse built on SelectDB.
Product Capabilities – Core Features of the New Real‑Time Data Warehouse
SelectDB is a commercial product derived from the open‑source Apache Doris project, combining open‑source innovation with enterprise support. Its product forms include a cloud‑native fully managed service (SelectDB Cloud), an Alibaba Cloud marketplace offering, and an on‑premise enterprise edition for financial and state‑owned enterprises. The core competitive advantages are:
Extreme performance: end‑to‑end low latency from data ingestion to query response, with bulk ingestion completing within seconds and new data visible to queries in real time.
Simplified data architecture: a lightweight schema‑change mechanism adds or drops columns in milliseconds without affecting online services.
Financial‑grade security: defense in depth, with LDAP/Kerberos authentication, Ranger‑based unified authorization, row‑level and column‑level access control, support for national cryptographic algorithms and AES encryption, and transparent file‑level encryption.
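To make the row‑level and column‑level access control mentioned above concrete, here is a minimal conceptual sketch in plain Python. It does not use SelectDB's or Ranger's API; the Policy class, the apply_policy function, and the sample account fields are all hypothetical names chosen for illustration. The idea is simply that a role sees only the rows that pass its filter and only the columns it is allowed to read.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict[str, object]


@dataclass
class Policy:
    """Hypothetical per-role policy: visible columns plus a row predicate."""
    visible_columns: set[str]
    row_filter: Callable[[Row], bool]


def apply_policy(rows: list[Row], policy: Policy) -> list[Row]:
    """Drop rows the role may not see, then project away hidden columns."""
    return [
        {col: val for col, val in row.items() if col in policy.visible_columns}
        for row in rows
        if policy.row_filter(row)
    ]


# Illustrative data only; field names are assumptions, not a real schema.
accounts = [
    {"account_id": 1, "branch": "north", "balance": 1200, "id_number": "A-001"},
    {"account_id": 2, "branch": "south", "balance": 560, "id_number": "B-002"},
]

# An analyst in the "north" branch: sees only north rows, never id_number.
north_analyst = Policy(
    visible_columns={"account_id", "branch", "balance"},
    row_filter=lambda row: row["branch"] == "north",
)

visible = apply_policy(accounts, north_analyst)
print(visible)
```

In a real deployment the database enforces these rules at query time, so applications never receive the hidden rows or columns in the first place.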
Benchmark results on ClickBench, TPC‑H and TPC‑DS show that SelectDB outperforms traditional architectures in single‑table massive data retrieval, multi‑table complex joins, and high‑concurrency ad‑hoc queries.
Industry Analysis – Structural Contradiction Between Real‑Time Business Demands and Legacy Architecture
Banks share a common demand for "decision‑in‑process" rather than after‑the‑fact analysis. Rapid data growth, regulatory compliance, and the need for responses within seconds across asset‑management, wealth‑management, and retail divisions expose the limitations of batch‑oriented T+1 pipelines. Existing heterogeneous stacks (HBase, Elasticsearch, Greenplum, ClickHouse, Kylin, Redis, etc.) accumulate technical debt:
Latency gaps: offline batch layers cannot deliver end‑to‑end real‑time data.
Flexibility bottlenecks: HBase lacks flexible querying and Elasticsearch cannot update data efficiently, which leads to data redundancy and slow audience segmentation.
Solution – Unified Real‑Time Data Platform Based on SelectDB
The proposed architecture consolidates scattered components into a single unified data service layer that supports both offline and real‑time workloads ("offline + real‑time"). Key design points:
Multi‑source ingestion and unified service: Core transaction systems use CDC to stream incremental changes to Kafka; Flink performs lightweight processing and writes directly to SelectDB, keeping latency low for fraud detection and real‑time dashboards. Historical and archival data are loaded offline into a data lake or warehouse, where SelectDB can act as a compute engine or an acceleration layer.
Ecosystem integration: SelectDB is compatible with mainstream BI tools and integrates deeply with Flink, Spark, and other big‑data components. Multi‑replica storage and compute‑storage separation provide fine‑grained resource isolation and cross‑region disaster recovery.
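The CDC path described above ultimately comes down to replaying an ordered stream of change events against a table keyed by primary key. The sketch below shows that idea in plain Python under simplifying assumptions: the event shape (op, key, row) and the apply_change function are hypothetical, and no Kafka, Flink, or SelectDB API is involved. It only illustrates why the serving table always reflects the latest state of each record.

```python
from typing import Any


def apply_change(table: dict[int, dict[str, Any]], event: dict[str, Any]) -> None:
    """Apply one change event to an in-memory table keyed by primary key."""
    op = event["op"]
    key = event["key"]
    if op in ("insert", "update"):
        # Upsert: the latest version of the row replaces any earlier one.
        table[key] = event["row"]
    elif op == "delete":
        # Deleting a key that is already absent is treated as a no-op.
        table.pop(key, None)
    else:
        raise ValueError(f"unknown op: {op}")


# Hypothetical change events in commit order, as a CDC tool might emit them.
events = [
    {"op": "insert", "key": 1, "row": {"account_id": 1, "balance": 100}},
    {"op": "insert", "key": 2, "row": {"account_id": 2, "balance": 50}},
    {"op": "update", "key": 1, "row": {"account_id": 1, "balance": 80}},
    {"op": "delete", "key": 2},
]

accounts: dict[int, dict[str, Any]] = {}
for event in events:
    apply_change(accounts, event)

print(accounts)
```

Because each event is applied by key, replaying the same ordered stream twice yields the same final table, which is the property that makes retries after a failure safe in this simplified model.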
Business Acceleration – Typical Scenarios Co‑Supported by DataMind and SelectDB
DataMind supplies over 50 pre‑built business‑scenario templates, while SelectDB delivers the high‑performance analytical engine. Representative use cases include:
Operating dashboards and management reports: Standard MySQL protocol and SQL compatibility let BI tools switch from pre‑aggregated results to lightweight real‑time calculations, supporting instant drill‑down from summary metrics to transaction‑level details.
Metric analysis platform: The built‑in cost‑based optimizer (CBO) rewrites the complex nested SQL generated by the low‑code Data Agent, ensuring fast responses for self‑service analytics and AI‑driven attribution analysis.
User tagging and precise marketing: SelectDB’s columnar storage and Bitmap index achieve millisecond‑level tag point lookups on billions of rows, supporting high‑concurrency marketing and real‑time risk interception.
Credit risk control and anti‑fraud: Millisecond‑level writes and multi‑table joins enable instant device‑fingerprint matching and anomaly detection, shifting fraud prevention from post‑event audit to real‑time interception.
Self‑service analytics for the whole bank: Federated queries allow analysts to combine historical lake data with real‑time account information without data movement, while Workload Groups allocate isolated compute resources per department.
Intelligent operations and observability: High‑throughput ingestion (tens of GB/s) and high compression ratios make it practical to store logs, metrics, and traces together; integration with Logstash and Kafka enables correlation analysis across all three signal types, and token‑level tracking of LLM calls supports fine‑grained AI cost governance.
Regulatory reporting: MPP‑based join acceleration reduces multi‑billion‑row association time from hours to minutes, providing a fast, accurate data pipeline for strict regulatory submissions (EAST, 1104).
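The user‑tagging scenario above relies on bitmaps, so a small sketch may help show why audience selection reduces to cheap set arithmetic. The example below uses plain Python integers as uncompressed bitmaps; it is not SelectDB's implementation, and the tag names, build_tag_bitmaps, and members are hypothetical. Production engines typically use compressed bitmap structures, but the logic of combining tags with AND, OR, and NOT is the same.

```python
def build_tag_bitmaps(user_tags: dict[int, set[str]]) -> dict[str, int]:
    """Return one bitmap per tag; bit i is set when user id i carries the tag."""
    bitmaps: dict[str, int] = {}
    for user_id, tags in user_tags.items():
        for tag in tags:
            bitmaps[tag] = bitmaps.get(tag, 0) | (1 << user_id)
    return bitmaps


def members(bitmap: int) -> list[int]:
    """Decode a non-negative bitmap back into a sorted list of user ids."""
    ids = []
    user_id = 0
    while bitmap:
        if bitmap & 1:
            ids.append(user_id)
        bitmap >>= 1
        user_id += 1
    return ids


# Illustrative users and tags only.
user_tags = {
    0: {"high_net_worth", "mobile_active"},
    1: {"mobile_active"},
    2: {"high_net_worth", "mobile_active", "loan_overdue"},
    3: {"high_net_worth"},
}
bitmaps = build_tag_bitmaps(user_tags)

# Audience: high-net-worth AND mobile-active, excluding overdue borrowers.
audience = (
    bitmaps["high_net_worth"]
    & bitmaps["mobile_active"]
    & ~bitmaps["loan_overdue"]
)
print(members(audience))
```

Each tag condition becomes a single bitwise operation over the whole user population, so adding or removing a condition does not require rescanning the underlying rows.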
Classic Practice – Bank‑Level Deployment Cases
Several leading banks have validated SelectDB’s value:
A large state‑owned bank replaced a complex heterogeneous OLAP stack with SelectDB, consolidating unified reporting, BI analysis, and fraud detection onto a single data foundation, which dramatically improved multi‑table join efficiency and reduced operational costs.
A rural commercial bank used SelectDB as the acceleration engine for its operating analysis platform, achieving a shift from offline batch processing to a unified stream‑batch model, and reducing response time for flexible tag selection and metric drill‑down from minutes to seconds.
Overall, the simplified architecture and extreme performance of SelectDB help financial institutions overcome the gap between real‑time business needs and cumbersome legacy stacks, enabling a data‑driven transformation.
Round‑Table Dialogue
The discussion highlighted three pain points: misalignment between business and technology, coordination challenges in large R&D teams, and the tension between open data access and strict security/compliance. Participants emphasized that the core requirement is a unified, high‑performance, secure data platform that can support instant decision‑making while meeting regulatory constraints.
Key takeaways include the importance of choosing the right architecture, integrating AI for data governance, and avoiding over‑complexity in the stack.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
