Tagged articles
23 articles
Alibaba Cloud Big Data AI Platform
Aug 12, 2022 · Big Data

How Hologres Transformed a Real‑Time Data Warehouse: Cutting Costs & Boosting Performance

This case study details how an online education platform migrated its real‑time data warehouse from Kudu to Alibaba Cloud Hologres, overcoming technical bottlenecks, reducing operational costs by nearly a million dollars annually, and achieving higher throughput, lower latency, and easier maintenance across multiple business scenarios.

Cost reduction · Hologres · Kudu
16 min read
21CTO
Oct 6, 2021 · Big Data

Building a Real-Time TB-Scale Bill Query System with Kafka, Kudu, and Presto

This article details the design and implementation of a real‑time, TB‑scale bill‑detail query platform that leverages Kafka for streaming, Debezium and Confluent Platform for change capture, Kudu for low‑latency storage, and Presto/Kylin for fast OLAP queries, while outlining deployment, integration, and future enhancements.

Kafka · Kudu · Presto
19 min read
Architect
Oct 6, 2021 · Big Data

Design and Implementation of a Real-time and Offline Integrated Query System

This article details the requirements, architecture, and implementation of a real-time and offline integrated query system, covering data ingestion via Debezium and Confluent Platform, storage in Kudu and HDFS, query engines Presto and Kylin, and strategies for data synchronization, partitioning, and scaling.

Big Data · Debezium · Kafka
19 min read
Big Data Technology & Architecture
Aug 10, 2021 · Databases

Kudu Overview: Architecture, Features, and Use Cases

Apache Kudu is an open‑source columnar storage engine, originally developed at Cloudera, that combines high‑throughput batch processing with low‑latency random reads. It offers C++/Java APIs, Raft‑based replication, flexible consistency, partitioning, and integration with Hadoop, Spark, Impala, and other ecosystem components.

Columnar Storage · Hadoop · Kudu
64 min read
Big Data Technology & Architecture
Dec 8, 2020 · Big Data

Horizontal Comparison of HBase, Kudu, and ClickHouse (V2.0)

This article provides a comprehensive technical comparison of HBase, Kudu, and ClickHouse—covering installation dependencies, architecture, basic read/write and query operations, real‑world use cases at Didi, a Kudu‑based real‑time data warehouse, and ClickHouse log‑analysis practices—highlighting each system’s strengths and trade‑offs for big‑data workloads.

HBase · Kudu · ClickHouse
17 min read
DataFunTalk
Sep 1, 2020 · Big Data

NetEase Real-Time Computing Platform (Sloth): Architecture, Practices, and Future Outlook

This article introduces NetEase's real-time computing platform Sloth, detailing its architecture, component layers, integrated IDE, operational tooling, unified metadata management, and challenges such as Kudu write amplification. It also proposes a tiered real‑time data‑warehouse model, with a vision for storage‑compute separation and unified batch‑stream APIs.

Big Data · Flink · Kafka
13 min read
Big Data Technology Architecture
Jun 16, 2020 · Big Data

Real-time Multi-dimensional Analytics and SlimBase State Backend at Kuaishou: Flink Applications and Optimizations

This article describes how Kuaishou leverages Apache Flink for large‑scale real‑time multi‑dimensional analytics, details the architecture of its analytics platform using Kudu storage and KwaiBI, and introduces SlimBase—a lightweight, embedded shared state backend that replaces RocksDB to reduce I/O, latency, and CPU overhead.

Flink · Kuaishou · Kudu
17 min read
Big Data Technology Architecture
Feb 3, 2020 · Big Data

NetEase Data Foundation Platform Construction – Technical Sharing

This article, originally shared by NetEase’s data expert Jiang Hongxiang on DataFun, outlines the construction of NetEase’s data foundation platform, covering database kernel insights and the implementation of the ad‑hoc query engine Impala with the distributed storage system Kudu, offering valuable big‑data engineering practices.

Data Platform · Impala · Kudu
4 min read
DataFunTalk
Jan 16, 2019 · Big Data

NetEase Data Infrastructure: Database Technologies and Big Data Platform Overview

This article presents NetEase Hangzhou Research Institute's experience in building a data infrastructure, covering database innovations such as InnoSQL, NTSDB, and InnoRocks, as well as the integration of big‑data components like HDFS, Spark, Impala, and Kudu to enable efficient storage, processing, and real‑time analytics.

Data Platform · Impala · InnoSQL
12 min read
Meituan Technology Team
Aug 5, 2016 · Big Data

Design and Implementation of a Large-Scale User Behavior Analytics Platform

The article outlines Meituan‑Dianping’s “Sensors Analytics” platform, a privately‑deployed, open‑PaaS solution that collects full‑stack user events from iOS, Android, Web and WeChat, maps IDs in near real‑time, stores detailed records in Kudu (real‑time) and Parquet (offline), and serves low‑latency queries via Impala, addressing the architectural and operational challenges of high‑throughput ingestion and data‑security requirements.

Impala · Kafka · Kudu
8 min read