BitBase: An HBase‑Based Solution for Billion‑Scale User Feature Analysis at Kuaishou
This article describes how Kuaishou built BitBase on HBase to store and analyze user feature logs at the hundred‑billion scale with second‑level query latency, covering business requirements, technical selection, bitmap data modeling, system architecture, device‑ID handling, performance results, and the future roadmap.
Kuaishou has been using HBase for about two years in various scenarios such as short‑video storage, IM, and live‑stream comment feeds. This talk focuses on one specific use case: applying HBase to analyze and serve user feature data at the hundred‑billion level.
Business Requirements and Challenges
The goal is to compute retention metrics (7‑90 days) across any combination of dimensions (city, gender, interests, etc.) on logs that reach the hundred‑billion scale, with a response time of 1‑2 seconds for analysts.
Massive log volume (hundreds of billions)
Arbitrary multi‑dimensional queries
Second‑level latency requirements (1‑2 s responses)
Technical Selection
Three alternatives were evaluated:
Hive – easy SQL but minute‑level latency
Elasticsearch – good for inverted indexes but slower for exact deduplication
ClickHouse – fast for analytics but still >10 s on small clusters
Because none satisfied the latency and flexibility needs, a custom solution named BitBase was designed on top of HBase.
BitBase Solution
Data Model
Raw data values are abstracted into bitmaps: each dimension value (e.g., city="bj") maps to a bitmap in which bit i is set if user i has that value, so 10100 means users 2 and 4 are in Beijing. Multi‑dimensional queries are reduced to bitmap logical operations (AND, OR, XOR): the count of set bits in the result gives the number of matching users, and the positions of those bits identify which users match.
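The model above can be sketched in a few lines. This is an illustrative toy, not BitBase's actual API: Python big integers stand in for bitmaps, and the helper names (`bitmap_and`, `popcount`, `matching_users`) are made up for this example.

```python
# Toy bitmap model: bit i of a dimension's bitmap is set iff user i
# has that dimension value. Python ints serve as arbitrary-length bitmaps.

def bitmap_and(a: int, b: int) -> int:
    """Intersect two bitmaps (users matching BOTH criteria)."""
    return a & b

def popcount(bm: int) -> int:
    """Number of set bits = number of matching users."""
    return bin(bm).count("1")

def matching_users(bm: int) -> list:
    """Recover the user indexes whose bits are set."""
    return [i for i in range(bm.bit_length()) if (bm >> i) & 1]

# city="bj" covers users 2 and 4; gender="f" covers users 0, 2, and 4.
city_bj = 0b10100
gender_f = 0b10101

both = bitmap_and(city_bj, gender_f)   # 0b10100
print(popcount(both))                  # 2 users match both criteria
print(matching_users(both))            # [2, 4]
```

A multi‑dimensional query thus never scans raw logs: it fetches one precomputed bitmap per dimension value and combines them with cheap bitwise operations.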
Architecture
The system consists of five components:
Data storage – bitmap indexes and dictionary archives
Data conversion – batch (MRJob) or online ingestion
Computation – scheduling and execution, returning results to the client
Client – business‑level APIs
Zookeeper – distributed coordination
Storage Module
Bitmaps are split into meta information (identifying db, table, event, entity, version) and data blocks (the actual bit arrays). Three HBase tables store BitmapMeta, BlockData, and BlockMeta.
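The meta/block split can be sketched as follows. This is a hypothetical illustration of the idea, not BitBase's storage code: the block size, dictionary shape, and function name are assumptions. The key property is that all‑zero blocks are never materialized, so the meta row lets a query skip them entirely.

```python
# Hypothetical sketch: a long bitmap is chopped into fixed-size blocks,
# and a meta record lists which blocks actually contain set bits.
BLOCK_BITS = 64  # real block sizes would be much larger

def split_bitmap(bm: int, block_bits: int = BLOCK_BITS):
    """Return ({block_index: block_value}, meta) keeping only non-empty blocks."""
    blocks = {}
    idx = 0
    while bm:
        chunk = bm & ((1 << block_bits) - 1)
        if chunk:
            blocks[idx] = chunk  # only non-zero blocks are stored
        bm >>= block_bits
        idx += 1
    meta = {"num_blocks": idx, "present": sorted(blocks)}
    return blocks, meta

# Bits 0 and 130 set: blocks 0 and 2 are stored; block 1 (all zeros) is skipped.
blocks, meta = split_bitmap((1 << 130) | 1)
print(meta["present"])  # [0, 2]
```

In BitBase, the meta record maps naturally onto the BitmapMeta/BlockMeta tables and the block values onto BlockData rows in HBase.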
Computation Module
The workflow runs BitBase Client → BitBase Server → HBase RegionServer. The server parses the bitmap meta, splits the query expression into per‑block sub‑expressions, routes each sub‑expression either to a local coprocessor or to remote BitBase servers, aggregates the partial results, and returns them. Local computation is 3‑5× faster than non‑local.
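The block‑wise split is what makes both routing and the skip optimization possible. The sketch below is illustrative, not BitBase's real protocol: an AND expression is evaluated block by block, so each sub‑expression can run wherever its blocks are stored, blocks missing on either side contribute nothing, and the final answer is a sum of per‑block popcounts.

```python
# Illustrative block-wise evaluation of "A AND B" over block-split bitmaps
# (dicts of block_index -> block_value, non-empty blocks only).

def eval_and_blockwise(a_blocks: dict, b_blocks: dict) -> int:
    """Count users matching both A and B, block by block."""
    total = 0
    # For an AND, only blocks present on BOTH sides can contribute;
    # every other block is skipped without being fetched.
    for idx in a_blocks.keys() & b_blocks.keys():
        total += bin(a_blocks[idx] & b_blocks[idx]).count("1")
    return total

a = {0: 0b1011, 2: 0b0110}   # users in blocks 0 and 2
b = {0: 0b0011, 1: 0b1111}   # users in blocks 0 and 1
print(eval_and_blockwise(a, b))  # 2 (only block 0 overlaps)
```

Because each per‑block term is independent, the server can push it to the RegionServer holding that block (the local‑coprocessor path) instead of pulling block data over the network, which is where the 3‑5× speedup comes from.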
DeviceId Problem and Solution
Unlike numeric user IDs, DeviceId strings cannot index a bit position directly. To support them, a three‑table mapping (meta, index→DeviceId, DeviceId→index) is maintained, with writes performed as a two‑phase commit in HBase to guarantee that indexes stay continuous, the two directions stay consistent, every assignment is reversible, and conversion is fast. Archiving and MRJob‑based joins accelerate bulk conversion.
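The mapping invariants can be sketched in miniature. This is a hedged illustration of the idea only: the class name and in‑memory dicts are stand‑ins for the three HBase tables, and the real system's two‑phase commit and failure repair are elided.

```python
# Sketch of the forward/reverse DeviceId mapping. In BitBase the three
# structures below are HBase tables (meta, index->DeviceId, DeviceId->index)
# and the two writes are a two-phase commit; here plain dicts show the invariants.

class IdMapper:
    def __init__(self):
        self.next_index = 0   # stands in for the meta table's counter
        self.to_device = {}   # index -> DeviceId table
        self.to_index = {}    # DeviceId -> index table

    def get_or_assign(self, device_id: str) -> int:
        """Return the existing index, or atomically assign the next one."""
        if device_id in self.to_index:
            return self.to_index[device_id]     # idempotent lookup
        idx = self.next_index
        self.next_index += 1                    # indexes stay continuous
        # Phase 1: reverse mapping; Phase 2: forward mapping.
        self.to_index[device_id] = idx
        self.to_device[idx] = device_id         # reversible by construction
        return idx

m = IdMapper()
print(m.get_or_assign("dev-a"))  # 0
print(m.get_or_assign("dev-b"))  # 1
print(m.get_or_assign("dev-a"))  # 0 (same device, same index)
```

Continuous indexes matter because they become bit positions: gaps would waste bitmap space, and inconsistency between the two tables would attribute one device's features to another.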
Business Effect
Benchmarks show that latency does not increase with the number of dimensions because irrelevant bitmap blocks are skipped. BitBase delivers sub‑second response for multi‑dimensional retention analysis across billions of rows.
Future Plans
Upcoming work includes real‑time aggregation (<5 min latency), SQL‑style query support, and open‑sourcing the project to foster community contributions.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.