JD Tech Talk
Dec 26, 2024 · Databases

Using ClickHouse for Efficient Tag Bitmap Storage and Group Computation in a CDP

This article explains how ClickHouse’s columnar storage, bitmap functions, and distributed architecture can be leveraged to store billions of tag bitmaps, combine them efficiently, and support fast group calculations for customer data platforms, while addressing data‑warehouse integration, storage format, and performance challenges.

Bitmap · Columnar Storage · OLAP
10 min read
Big Data Technology & Architecture
Nov 22, 2022 · Big Data

Comprehensive Guide to Metadata Management, Data Quality, and Optimization in Big Data Systems

This article provides an in-depth overview of metadata concepts, their technical and business classifications, value in data management, applications such as data profiling and lineage, optimization techniques for compute and storage, lifecycle management, and comprehensive data quality assurance practices within large‑scale big data environments.

Optimization · big-data · data-quality
38 min read
Alibaba Cloud Big Data AI Platform
Sep 5, 2022 · Big Data

Scaling Alibaba TCC to Millions of RPS with a High‑Availability Real‑Time Data Warehouse

This article details how Alibaba's TCC platform evolved its architecture over multiple phases—from a legacy database to a high‑availability real‑time data warehouse built on Flink and Hologres—highlighting the challenges, solutions, and cost‑saving measures that enabled millions of RPS, terabytes of storage, and sub‑second query latency.

Flink · Hologres · big-data
21 min read
Alibaba Cloud Big Data AI Platform
Aug 4, 2022 · Big Data

Boost Real‑Time Data Warehouses with Integrated Analytics & Service

Alibaba Cloud’s Hologres unifies analytical and service workloads in a real‑time data warehouse, simplifying data exchange, reducing development and operational costs, and delivering high‑performance, low‑latency online services through innovations like row‑column hybrid storage, hot upgrades, and elastic cloud‑native scaling, as demonstrated in a logistics case study.

Flink · Hologres · cloud-native
13 min read
StarRocks
May 19, 2022 · Big Data

How StarRocks Boosted MaFengWo’s OLAP Performance by 4×

MaFengWo’s data platform replaced Kylin, Presto, and Druid with StarRocks, redesigning its four‑layer architecture, unifying metadata, and optimizing single‑table, multi‑table, and precise‑deduplication queries. The migration delivered a fourfold improvement in query performance, reduced storage by 87%, and lowered operational complexity.

Kylin · Performance · bigdata
15 min read
Big Data Technology & Architecture
Apr 11, 2022 · Big Data

Real-Time Data Warehouse Construction: Background, Objectives, Architecture, and Case Studies

This article explains the growing demand for real‑time data warehouses, outlines their objectives and layered architecture, and presents detailed case studies from Didi, Kuaishou, Tencent, Youzan and others, illustrating design choices, implementation challenges, and best practices for building scalable streaming data platforms.

ClickHouse · Flink · Kafka
48 min read