Real-Time OLAP Evolution and Production Optimization at BTC.com

This article details BTC.com’s journey from a legacy batch‑oriented analytics stack to a modern real‑time OLAP architecture using Flink, ClickHouse, Kafka, and Kubernetes, highlighting the business drivers, technical choices, architectural evolution, optimizations, and future directions.

Big Data Technology Architecture

The BTC.com team presents the technical evolution and production optimization of their real‑time OLAP platform, outlining the motivations, challenges, and solutions implemented.

BTC.com is a blockchain technology provider whose services span AI, blockchain, cloud, and data. These services depend on robust OLAP capabilities for security monitoring, transaction analysis, and law‑enforcement support.

In the earlier architecture (circa 2018), blockchain nodes fed parsers that wrote to MySQL, with Hive/Presto and Spark handling batch jobs. This design suffered from a lack of real‑time processing, single points of failure, and low query efficiency.

To address these issues, the team selected PyFlink for low‑latency streaming with flexible windows, ClickHouse for high‑performance queries, and deployed components on Kubernetes for scalability and high availability.
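
To make the idea of windowed streaming aggregation concrete, here is a minimal sketch in plain Python. It is not the PyFlink API and not the team's code. It only shows what a tumbling (fixed-size, non-overlapping) window computes over timestamped events. The event shape `(timestamp_ms, key, amount)` and the function name are assumptions chosen for illustration. A real Flink job would also handle watermarks and late data, which this sketch ignores.

```python
from collections import defaultdict


def tumbling_window_sums(events, window_ms):
    """Sum amounts per (window_start, key) for fixed-size, non-overlapping windows.

    events: iterable of (timestamp_ms, key, amount) tuples (hypothetical shape).
    window_ms: window length in milliseconds.
    """
    sums = defaultdict(int)
    for timestamp_ms, key, amount in events:
        # Align the event to the start of the window that contains it.
        window_start = timestamp_ms - (timestamp_ms % window_ms)
        sums[(window_start, key)] += amount
    return dict(sums)


events = [(1000, "a", 5), (1500, "a", 3), (2100, "a", 2), (1999, "b", 1)]
result = tumbling_window_sums(events, window_ms=1000)
```

With a 1000 ms window, the two events for key `"a"` at 1000 and 1500 fall into the same window, while the event at 2100 opens the next one.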

The evolved architecture streams data from blockchain nodes through a parser into Kafka, then processes it with Flink and Spark, writing results to MySQL and ClickHouse, supporting dashboards, reports, data synchronization, and OLAP analytics, while applying a layered data governance model.
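
The data flow described above can be sketched roughly as follows, using in-memory stand-ins so the example runs without any external service. Everything here is hypothetical: the record fields, the `Topic` class that stands in for a Kafka topic, and the assumption that detail rows go to one store while per-block summaries go to another. The source does not say which results land in MySQL and which in ClickHouse.

```python
import json
from collections import deque


def parse_block(raw_block):
    """Hypothetical parser: flatten one raw block into transaction records."""
    for tx in raw_block["txs"]:
        yield {"height": raw_block["height"], "txid": tx["txid"], "value": tx["value"]}


class Topic:
    """In-memory stand-in for a Kafka topic: append-only, consumed in order."""

    def __init__(self):
        self._log = deque()

    def produce(self, record):
        self._log.append(json.dumps(record))

    def consume(self):
        while self._log:
            yield json.loads(self._log.popleft())


def run_pipeline(raw_blocks):
    topic = Topic()
    # Stage 1: parser output is published to the topic.
    for block in raw_blocks:
        for record in parse_block(block):
            topic.produce(record)

    detail_rows = []  # stand-in for a detail table in the analytical store
    summary = {}      # stand-in for a per-block summary table
    # Stage 2: the stream processor consumes records and feeds both sinks.
    for record in topic.consume():
        detail_rows.append(record)
        stats = summary.setdefault(record["height"], {"tx_count": 0, "total_value": 0})
        stats["tx_count"] += 1
        stats["total_value"] += record["value"]
    return detail_rows, summary


blocks = [
    {"height": 1, "txs": [{"txid": "a", "value": 5}, {"txid": "b", "value": 7}]},
    {"height": 2, "txs": [{"txid": "c", "value": 1}]},
]
detail_rows, summary = run_pipeline(blocks)
```

Decoupling the parser from the processor through a topic is what lets each side scale or restart independently, which is the main point of placing Kafka between them.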

Optimizations include custom sinks, batch import strategies for cold data, upsert handling via temporary tables, Kubernetes‑based high‑availability storage, horizontal scaling, service discovery customizations, and monitoring using Prometheus.
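
One common reason to write a custom sink for ClickHouse is that it handles a few large inserts much better than many small ones. The sketch below shows that batching idea only; it is not the team's sink and not a Flink interface. The class name, the `write_batch` callback, and the thresholds are assumptions for illustration. Note that the time threshold is checked only when a new row arrives, so a production sink would also need a timer or a flush on checkpoint.

```python
import time


class BatchingSink:
    """Buffer rows and hand them to write_batch in groups (hypothetical sketch)."""

    def __init__(self, write_batch, max_rows=1000, max_delay_s=1.0, clock=time.monotonic):
        self._write_batch = write_batch  # callable that receives a list of rows
        self._max_rows = max_rows
        self._max_delay_s = max_delay_s
        self._clock = clock
        self._buffer = []
        self._first_row_at = None

    def add(self, row):
        if not self._buffer:
            # Remember when the current batch started to fill.
            self._first_row_at = self._clock()
        self._buffer.append(row)
        too_many = len(self._buffer) >= self._max_rows
        too_old = self._clock() - self._first_row_at >= self._max_delay_s
        if too_many or too_old:
            self.flush()

    def flush(self):
        """Write whatever is buffered; safe to call when the buffer is empty."""
        if self._buffer:
            self._write_batch(self._buffer)
            self._buffer = []  # start a fresh list so the written batch is not mutated
            self._first_row_at = None


batches = []
sink = BatchingSink(batches.append, max_rows=3, max_delay_s=3600)
for i in range(7):
    sink.add(i)
sink.flush()  # drain the final partial batch
```

In this usage, `batches.append` stands in for a real bulk insert call. Seven rows with `max_rows=3` produce two full batches and one partial batch from the final `flush()`.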

Future plans involve expanding business scope, integrating more AI/ML workloads, contributing to open‑source Flink and ClickHouse ecosystems, pursuing real‑time online training, and further enhancing the platform’s reliability and performance.

Data Engineering, Flink, Kafka, ClickHouse, Real-time OLAP, blockchain
Written by

Big Data Technology Architecture

Exploring Open Source Big Data and AI Technologies
