Tag

bitmap index

1 view collected around this technical thread.

Aikesheng Open Source Community
Apr 28, 2024 · Databases

Database Indexing Algorithms: B‑Tree vs Hash Indexing

This article explains the purpose and inner workings of various database indexing algorithms—including B‑Tree, Hash, Bitmap, and Full‑Text indexes—illustrates their strengths and weaknesses with SQL examples, and provides guidance on when to choose each type for optimal query performance.

B-Tree · Full-Text Search · Hash Index
12 min read
DaTaobao Tech
Sep 6, 2023 · Big Data

Accelerating User Profile Analysis with Hologres RoaringBitmap

The article explains how Hologres RoaringBitmap compresses user ID sets into efficient bitmap indexes, splits 64‑bit IDs into buckets, syncs them from MaxCompute, and enables sub‑second user‑profile queries that previously took minutes, improving both performance and scalability.

Hologres · Performance · RoaringBitmap
18 min read
DataFunTalk
Aug 1, 2022 · Big Data

Bilibili Lakehouse Integration: Iceberg and Alluxio Optimization Practices

This article details Bilibili's lakehouse implementation using Apache Iceberg and Alluxio, covering background challenges, architectural components, data organization techniques like Z‑order and bitmap indexes, performance benchmarks, and future optimization plans for large‑scale analytics.

Alluxio · Data Optimization · Z-Order
21 min read
Bilibili Tech
Jul 15, 2022 · Big Data

Lakehouse Architecture Practice at Bilibili: Query Acceleration and Index Enhancement

Bilibili’s lakehouse architecture merges Iceberg‑based data lake flexibility with data‑warehouse efficiency, using Kafka‑Flink real‑time ingestion, Spark offline loads, Trino queries, Alluxio caching, Z‑Order/Hilbert sorting, and enhanced BloomFilter and bitmap indexes to boost query speed up to tenfold while drastically cutting file reads.

Query Optimization · Z-Order sorting · big data architecture
17 min read
Shopee Tech Team
Jan 13, 2022 · Big Data

Engineering Practices and Performance Optimizations of Apache Druid for Real‑Time OLAP at Shopee

Shopee’s engineering team scaled a 100‑node Apache Druid cluster for real‑time OLAP by redesigning the Coordinator load‑balancing algorithm, adding incremental metadata pulls, introducing a segment‑merged result cache, and building exact‑count and flexible sliding‑window operators, while planning cloud‑native deployment.

Apache Druid · Big Data · Cache
17 min read
Baidu Geek Talk
Nov 15, 2021 · Backend Development

Baidu Short Video Push System: Architecture Design and Billion-Level Data Optimization Practice

Baidu’s Short Video Push System is a distributed platform serving hundreds of millions of users across multiple apps, delivering personalized, real‑time notifications via a modular architecture that includes material and user centers, recall, preprocessing, and delivery services, while optimizations such as activity‑based scheduling, bitmap‑based user segmentation, consistent‑hash frequency control, and protobuf compression boost click‑through rates, scalability, and resource efficiency.

Baidu · Protobuf · Push Notification
15 min read
Big Data Technology Architecture
Aug 13, 2020 · Databases

Deep Dive into Apache Druid V1 Storage Format: Index Structures and Disk Layout

This article provides a detailed analysis of Apache Druid V1's column‑oriented storage format, covering dimension dictionaries, variable‑length encoded values, bitmap inverted indexes, array handling, and the physical metadata layout that enables sub‑second OLAP queries on massive datasets.

Apache Druid · OLAP · Storage Format
8 min read
DataFunTalk
Aug 11, 2020 · Databases

Applying ClickHouse for Real‑Time Advertising Audience Estimation at ByteDance

This article details how ByteDance leverages ClickHouse to power large‑scale advertising audience estimation, profiling, and statistical analysis, describing the challenges of massive data, strict latency requirements, and the evolution from a simple tag‑uid table to a bitmap‑based architecture with extensive parallel and cache optimizations.

Audience Estimation · ClickHouse · Database Optimization
21 min read
iQIYI Technical Product Team
Sep 12, 2019 · Big Data

iQIYI's Big Data Architecture Evolution and Adoption of Druid

iQIYI upgraded its big‑data stack by adopting Druid as the core engine for free‑time queries and Elasticsearch for pre‑computed fixed‑time queries, overcoming early API, security and scaling challenges through monthly segment granularity, parallel sub‑queries, Redis caching and failover, cutting typical query latency from over two seconds to about 150 ms and reaching 99.9% service success.

Big Data · Druid · Elasticsearch
12 min read
DataFunTalk
Jul 17, 2019 · Big Data

BitBase: An HBase‑Based Solution for Billion‑Scale User Feature Analysis at Kuaishou

This article describes how Kuaishou built BitBase on HBase to store and compute billions of user feature logs with millisecond‑level latency, covering business requirements, technical selection, bitmap data modeling, system architecture, device‑ID handling, performance results, and future roadmap.

BitBase · HBase · Scalable storage
11 min read
High Availability Architecture
Jun 7, 2017 · Databases

Evaluating Pilosa on Dense, Low‑Cardinality Data Using the NYC Taxi Dataset

This article examines whether Pilosa, a bitmap index originally built for sparse high‑cardinality data, can efficiently handle dense relational datasets by benchmarking it on a billion‑row NYC taxi trip dataset and comparing query performance with other database systems.

NYC taxi dataset · Pilosa · bitmap index
6 min read