
Evolution of OLAP with Apache Doris at Xingyun Retail Credit

Facing rapid data growth, Xingyun Retail Credit transitioned from traditional OLTP systems to an Apache Doris‑based OLAP solution. This summary covers how the analytical demand arose, the challenges of OLAP engine selection, the multi‑stage implementation, the performance gains achieved, the data‑warehouse construction, and the future roadmap for scalable analytics.


As business scale and data volume grew, traditional data warehouses could no longer meet Xingyun Retail Credit's analytical needs, prompting the team to explore an OLAP solution based on Apache Doris for more efficient and accurate data processing.

Main sections covered:

Data demand generation

OLAP engine selection challenges

Apache Doris practice

Future planning

Q&A

1. Data demand generation – The product architecture evolved from a single transaction center to distributed microservice applications, requiring a shift from isolated OLTP systems (MySQL, Oracle, PostgreSQL) to a unified analytical platform.

2. OLAP selection challenges – Existing OLTP systems created data islands; traditional tools such as Elasticsearch, Redis, and Hadoop‑based stacks could not meet the scale or cost requirements, leading to a careful evaluation of alternatives.

3. Apache Doris practice

Three implementation stages were described:

Stage 1 – Offline ETL: Used Kettle for data extraction and reporting, but it suffered from high latency and could not support real‑time joins.

Stage 2 – Trino investigation : Leveraged Trino for heterogeneous source federation, yet faced memory‑intensive point‑query overheads.

Stage 3 – Doris adoption: Chosen for its integrated storage‑compute architecture, standard SQL compatibility (MySQL protocol), and ability to serve both point queries and high‑throughput aggregations, resolving the problems of the previous two stages.

Key Doris practices included:

Accelerating concurrent queries by selecting the appropriate data models (Unique Key for upserts, Duplicate Key for detail data), designing partitioning and bucketing, using colocation joins, and adding Bloom filter indexes on high‑cardinality columns.
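As a sketch of what such a table definition might look like (the talk does not show the actual DDL, so the table, columns, and group names below are hypothetical), a Doris Unique Key table combining range partitioning, a colocation group, and a Bloom filter index could be declared as:

```sql
-- Hypothetical Doris DDL illustrating the practices above.
CREATE TABLE orders (
    order_id   BIGINT      NOT NULL,
    dt         DATE        NOT NULL,
    user_id    BIGINT,
    order_no   VARCHAR(64),          -- high-cardinality column, indexed below
    amount     DECIMAL(18, 2),
    updated_at DATETIME
) ENGINE = OLAP
UNIQUE KEY(order_id, dt)             -- upsert semantics on the key columns
PARTITION BY RANGE(dt) (
    PARTITION p20230101 VALUES LESS THAN ("2023-01-02")
)
DISTRIBUTED BY HASH(order_id) BUCKETS 16
PROPERTIES (
    "replication_num"      = "3",            -- triple-replica storage, per the Q&A
    "colocate_with"        = "order_group",  -- colocation join group
    "bloom_filter_columns" = "order_no"      -- Bloom filter on high-cardinality column
);
```

Tables in the same colocation group must share the distribution columns and bucket count so joins on `order_id` can run locally without a shuffle.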

Building a data‑warehouse foundation with Apache DolphinScheduler, DataX, the JDBC catalog, and Flink CDC for both batch (T+1) and near‑real‑time ingestion.
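The near‑real‑time path can be sketched as a Flink SQL job that reads MySQL binlogs via Flink CDC and writes into Doris through the Flink Doris Connector. Hostnames, credentials, and table names below are placeholders, not values from the talk:

```sql
-- Hypothetical Flink SQL job: MySQL binlog -> Doris via Flink CDC.
CREATE TABLE src_orders (
    order_id BIGINT,
    amount   DECIMAL(18, 2),
    PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
    'connector'     = 'mysql-cdc',
    'hostname'      = 'mysql-host',
    'port'          = '3306',
    'username'      = 'flink',
    'password'      = '******',
    'database-name' = 'trade',
    'table-name'    = 'orders'
);

CREATE TABLE ods_orders (
    order_id BIGINT,
    amount   DECIMAL(18, 2)
) WITH (
    'connector'         = 'doris',
    'fenodes'           = 'doris-fe:8030',
    'table.identifier'  = 'ods.orders',
    'username'          = 'root',
    'password'          = '',
    'sink.label-prefix' = 'ods_orders'    -- stream-load label prefix for exactly-once
);

INSERT INTO ods_orders SELECT order_id, amount FROM src_orders;
```

Writing into a Doris Unique Key table lets the CDC stream's updates and deletes be merged on the key, which is what makes minute‑level freshness feasible without full reloads.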

Implementing monitoring via Grafana, Prometheus, and Loki, and achieving up to 4× performance improvement after upgrading from Doris 0.14 to 1.2.4.

Exploring JSONB storage for log and user‑behavior data, reducing storage size by ~70%.
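A minimal sketch of how such a log table might be declared and queried, assuming the JSONB type and extraction functions available since Doris 1.2 (the table and field names are hypothetical):

```sql
-- Hypothetical user-behavior log table using the Doris JSONB type.
CREATE TABLE user_events (
    event_time DATETIME NOT NULL,
    user_id    BIGINT,
    payload    JSONB                 -- semi-structured event attributes
) ENGINE = OLAP
DUPLICATE KEY(event_time, user_id)   -- detail model: keep every raw event
DISTRIBUTED BY HASH(user_id) BUCKETS 8
PROPERTIES ("replication_num" = "3");

-- Extract a field from the binary-encoded payload at query time.
SELECT user_id, jsonb_extract_string(payload, '$.page')
FROM user_events
WHERE event_time >= '2023-01-01 00:00:00';
```

Storing the payload as JSONB rather than a plain string avoids re‑parsing on every query and compresses well, which is consistent with the ~70% storage reduction reported in the talk.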

Images illustrating architecture and performance were included throughout the presentation.

4. Future planning – Plans involve developing an intelligent data gateway for heterogeneous sources, unified archival of historical data using Doris, and further reducing operational overhead through Doris's built‑in high‑availability mechanisms.

5. Q&A

Questions covered log fuzzy search performance (millisecond‑level responses with Doris 1.2.4), refresh intervals for risk‑control dashboards (minute‑level), and high‑availability strategies (Doris internal HA and triple‑replica storage).

Overall, the case study demonstrates how Apache Doris enabled scalable, low‑latency analytics for fintech workloads while simplifying operations and reducing costs.

Tags: performance optimization · Big Data · Data Warehouse · OLAP · FinTech · Apache Doris
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
