Tagged articles
19 articles
StarRocks
Aug 19, 2025 · Big Data

How Joydata Scaled to 150 Billion Daily Events with StarRocks: A Data Architecture Journey

As daily data volume grew from millions of records to 150 billion, Joydata‑U moved its analytics platform through three architectural stages (Hadoop, Hadoop + Trino, and finally StarRocks). Along the way it introduced resource isolation, Flat JSON acceleration, and Bitmap indexing, reducing query latency by up to a factor of seven and achieving sub‑2‑minute data freshness across BI, ad‑tech, game analytics, and CRM workloads.

Bitmap Index · Data Architecture · Flat JSON
12 min read
Aikesheng Open Source Community
Apr 28, 2024 · Databases

Database Indexing Algorithms: B‑Tree vs Hash Indexing

This article explains the purpose and inner workings of various database indexing algorithms—including B‑Tree, Hash, Bitmap, and Full‑Text indexes—illustrates their strengths and weaknesses with SQL examples, and provides guidance on when to choose each type for optimal query performance.

B+Tree · Bitmap Index · Full‑Text Search
12 min read
DaTaobao Tech
Sep 6, 2023 · Big Data

Accelerating User Profile Analysis with Hologres RoaringBitmap

The article explains how Hologres RoaringBitmap compresses user ID sets into efficient bitmap indexes, splits 64‑bit IDs into buckets, and syncs them from MaxCompute, enabling sub‑second user profile queries that previously took minutes.
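
The bucket split mentioned in this summary can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: the high/low 32‑bit split, the `BucketedIdSet` class, and the use of plain Python sets as stand‑ins for 32‑bit RoaringBitmaps are all assumptions made for the example.

```python
from collections import defaultdict

LOW_BITS = 32                     # assumed split point: high 32 bits pick the bucket
LOW_MASK = (1 << LOW_BITS) - 1


def split_id(user_id: int) -> tuple[int, int]:
    """Split a 64-bit ID into (bucket, offset) so each offset fits in 32 bits."""
    return user_id >> LOW_BITS, user_id & LOW_MASK


class BucketedIdSet:
    """Hypothetical container: one set of 32-bit offsets per bucket.

    Each inner set stands in for a 32-bit RoaringBitmap.
    """

    def __init__(self) -> None:
        self.buckets = defaultdict(set)

    def add(self, user_id: int) -> None:
        bucket, offset = split_id(user_id)
        self.buckets[bucket].add(offset)

    def intersect(self, other: "BucketedIdSet") -> "BucketedIdSet":
        """Intersect bucket by bucket; buckets missing on either side drop out."""
        result = BucketedIdSet()
        for bucket in self.buckets.keys() & other.buckets.keys():
            common = self.buckets[bucket] & other.buckets[bucket]
            if common:
                result.buckets[bucket] = common
        return result

    def cardinality(self) -> int:
        return sum(len(offsets) for offsets in self.buckets.values())


# Two hypothetical tag audiences that share two users.
tag_a, tag_b = BucketedIdSet(), BucketedIdSet()
for uid in (5, (1 << 32) + 7, (3 << 32) + 9):
    tag_a.add(uid)
for uid in ((1 << 32) + 7, (3 << 32) + 9, (3 << 32) + 10):
    tag_b.add(uid)
both = tag_a.intersect(tag_b)
print(both.cardinality())
```

Because set operations only ever combine offsets from the same bucket, buckets can be processed independently, which is one plausible reason for splitting the IDs this way.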

Bitmap Index · Hologres · RoaringBitmap
18 min read
StarRocks
Jun 29, 2023 · Big Data

How StarRocks Boosted Mango TV’s Data Platform Performance by Over 10×

Mango TV replaced its fragmented EMR‑Hive‑Kudu‑Presto stack with a unified StarRocks lakehouse, simplifying architecture, cutting operational costs, and achieving more than a ten‑fold increase in query speed while supporting real‑time analytics, materialized views, bitmap indexing, and storage‑compute separation.

Big Data · Bitmap Index · Materialized Views
14 min read
Huolala Tech
Oct 13, 2022 · Big Data

How Druid Uses Bitmap Indexes for Fast Queries and Precise Deduplication

This article explains how Apache Druid builds and queries bitmap indexes for efficient OLAP analysis, and describes a dictionary‑encoding plus bitmap solution—adapted from Kuaishou—to achieve exact deduplication even on high‑cardinality dimensions.
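
As a rough illustration of the dictionary‑encoding plus bitmap idea, the sketch below maps each distinct value to a dense integer and sets that bit in a bitmap, so merging bitmaps never double‑counts a value. It is not Druid's or Kuaishou's implementation: the `Dictionary` class and helper names are hypothetical, and a Python integer stands in for a compressed bitmap.

```python
class Dictionary:
    """Hypothetical global dictionary: assigns each value a dense integer ID."""

    def __init__(self) -> None:
        self._ids = {}

    def encode(self, value: str) -> int:
        # Reuse the existing ID, or assign the next free one.
        return self._ids.setdefault(value, len(self._ids))


def to_bitmap(values, dictionary: Dictionary) -> int:
    """Build a bitmap (a Python int used as a bitset) from raw values."""
    bitmap = 0
    for value in values:
        bitmap |= 1 << dictionary.encode(value)
    return bitmap


def exact_distinct(*bitmaps: int) -> int:
    """OR the bitmaps together and count set bits: an exact distinct count."""
    merged = 0
    for bitmap in bitmaps:
        merged |= bitmap
    return bin(merged).count("1")


# Two hypothetical segments with one overlapping user.
dictionary = Dictionary()
segment_1 = to_bitmap(["u1", "u2", "u3"], dictionary)
segment_2 = to_bitmap(["u2", "u4"], dictionary)
print(exact_distinct(segment_1, segment_2))  # 4
```

The dictionary step is what makes the approach usable on high‑cardinality string dimensions: the bitmap only ever sees compact integer IDs.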

Bitmap Index · Dictionary Encoding · Druid
14 min read
DataFunTalk
Aug 1, 2022 · Big Data

Bilibili Lakehouse Integration: Iceberg and Alluxio Optimization Practices

This article details Bilibili's lakehouse implementation using Apache Iceberg and Alluxio, covering background challenges, architectural components, data organization techniques like Z‑order and bitmap indexes, performance benchmarks, and future optimization plans for large‑scale analytics.

Alluxio · Bitmap Index · Iceberg
21 min read
Bilibili Tech
Jul 15, 2022 · Big Data

Lakehouse Architecture Practice at Bilibili: Query Acceleration and Index Enhancement

Bilibili’s lakehouse architecture merges Iceberg‑based data lake flexibility with data‑warehouse efficiency, using Kafka‑Flink real‑time ingestion, Spark offline loads, Trino queries, Alluxio caching, Z‑Order/Hilbert sorting, and enhanced BloomFilter and bitmap indexes to boost query speed up to tenfold while drastically cutting file reads.

Big Data Architecture · Bitmap Index · Data Lake
17 min read
Shopee Tech Team
Jan 13, 2022 · Big Data

Engineering Practices and Performance Optimizations of Apache Druid for Real‑Time OLAP at Shopee

Shopee’s engineering team scaled a 100‑node Apache Druid cluster for real‑time OLAP by redesigning the Coordinator load‑balancing algorithm, adding incremental metadata pulls, introducing a segment‑merged result cache, and building exact‑count and flexible sliding‑window operators, while planning cloud‑native deployment.

Apache Druid · Big Data · Bitmap Index
17 min read
Baidu Geek Talk
Nov 15, 2021 · Backend Development

Baidu Short Video Push System: Architecture Design and Billion-Level Data Optimization Practice

Baidu’s Short Video Push System is a distributed platform that serves hundreds of millions of users across multiple apps and delivers personalized, real‑time notifications. Its modular architecture comprises material and user centers plus recall, preprocessing, and delivery services, while optimizations such as activity‑based scheduling, bitmap‑based user segmentation, consistent‑hash frequency control, and protobuf compression boost click‑through rates, scalability, and resource efficiency.

Baidu · Bitmap Index · Protobuf
15 min read
NetEase Smart Enterprise Tech+
Sep 26, 2021 · Databases

How ClickHouse Powers a Billion‑User Profiling Platform at Sub‑5‑Second Latency

This article shares NetEase’s experience building a user‑profile platform with ClickHouse, detailing the business background, challenges of massive data and complex queries, core table designs, data ingestion, bitmap techniques, performance gains, and future plans for scaling and optimization.

Bitmap Index · ClickHouse · Real-time analytics
13 min read
Volcano Engine Developer Services
Jul 14, 2021 · Databases

How ByteDance Scales Ad Targeting with ClickHouse: Architecture & Optimizations

This article explains how ByteDance leverages ClickHouse for ad audience estimation, profiling, and analytics, detailing the challenges of massive user‑level set operations, the evolution from a simple tag‑uid table to Bitmap64 with RoaringBitmap, and the extensive engineering optimizations that cut query latency, storage, and CPU usage dramatically.

Ad Targeting · Bitmap Index · ClickHouse
22 min read
DataFunTalk
Aug 11, 2020 · Databases

Applying ClickHouse for Real‑Time Advertising Audience Estimation at ByteDance

This article details how ByteDance leverages ClickHouse to power large‑scale advertising audience estimation, profiling, and statistical analysis, describing the challenges of massive data, strict latency requirements, and the evolution from a simple tag‑uid table to a bitmap‑based architecture with extensive parallel and cache optimizations.

Audience Estimation · Bitmap Index · ClickHouse
21 min read
iQIYI Technical Product Team
Sep 12, 2019 · Big Data

iQIYI's Big Data Architecture Evolution and Adoption of Druid

iQIYI upgraded its big‑data stack by adopting Druid as the core engine for free‑time queries and ElasticSearch for pre‑computed fixed‑time queries. It overcame early API, security, and scaling challenges through monthly segment granularity, parallel sub‑queries, Redis caching, and failover, cutting typical query latency from over two seconds to about 150 ms and reaching a 99.9% service success rate.

Bitmap Index · Data Architecture · Elasticsearch
12 min read
21CTO
Jun 11, 2017 · Databases

Can Pilosa Handle Dense Relational Data? A Deep Dive with NYC Taxi Dataset

Pilosa, originally built for sparse high‑cardinality user attributes, is evaluated on a dense, low‑cardinality NYC taxi dataset to see if it can serve as a general‑purpose index, with performance comparisons against Spark, PostgreSQL, Elasticsearch, and kdb+ across multiple query scenarios.

Bitmap Index · NYC Taxi Data · Pilosa
8 min read