
Meitu Distributed Bitmap System (Naix): Architecture, Implementation, and Performance Evaluation

Meitu’s Naix distributed bitmap system accelerates massive user‑data analytics by using a three‑layer architecture, sharded RoaringBitmap storage, and PalDB, delivering over 600× faster queries than Hive, supporting fast generation plugins, fault‑tolerant replication, and millisecond‑level RPC query responses while reducing storage by 67%.

Meitu Technology

Meitu's Internet technology salon introduced a distributed Bitmap solution called Naix, which addresses the need for fast, space‑efficient computation on massive user data. Bitmap, a bit‑array based data structure, provides high‑speed set operations (AND, OR, ANDNOT) and compact storage, making it suitable for large‑scale analytics such as user activity, retention, and cross‑dimensional queries.
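These set operations can be sketched with Java's standard `BitSet` (Naix itself uses sharded RoaringBitmap, but the semantics are the same). The day-level names below are illustrative, not Naix APIs:

```java
import java.util.BitSet;

// User IDs are bit positions; membership in a set is a set bit.
public class BitmapOps {
    public static void main(String[] args) {
        BitSet activeMonday = new BitSet();
        BitSet activeTuesday = new BitSet();
        // Users 1, 3, 5 were active on Monday; users 3, 5, 7 on Tuesday.
        for (int id : new int[] {1, 3, 5}) activeMonday.set(id);
        for (int id : new int[] {3, 5, 7}) activeTuesday.set(id);

        // Retention: active on both days (AND).
        BitSet retained = (BitSet) activeMonday.clone();
        retained.and(activeTuesday);                // {3, 5}

        // Reach: active on either day (OR).
        BitSet reach = (BitSet) activeMonday.clone();
        reach.or(activeTuesday);                    // {1, 3, 5, 7}

        // Churn candidates: Monday but not Tuesday (ANDNOT).
        BitSet churned = (BitSet) activeMonday.clone();
        churned.andNot(activeTuesday);              // {1}

        System.out.println(retained.cardinality()); // 2
        System.out.println(reach.cardinality());    // 4
        System.out.println(churned.cardinality());  // 1
    }
}
```

Each operation is a word-at-a-time bitwise pass over the arrays, which is why these queries run in milliseconds regardless of how the same counts would be computed in SQL.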

Compared with traditional Hive‑based statistics, Naix's Bitmap operations reduced a 139‑second Hive job to 226 ms, a speed‑up of more than 600×, while using a single‑node process instead of a 4‑node Hadoop cluster.

The Naix system is organized into three logical layers (see Figure 4):

External Call Layer: includes a generator that converts raw data (e.g., MySQL records, HDFS files) into Bitmap format, and a TCP client for application interaction.

Core Node Layer: consists of a Master node for cluster management, Transport nodes for query routing, and Data Nodes (using PalDB) that store the actual Bitmap data.

External Storage Layer: relies on MySQL for metadata and Redis for caching.

Data is organized into index groups (analogous to databases) and indexes (analogous to tables). Each index stores Bitmap files for specific dimensions (e.g., version, channel, region) and time slices.
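One plausible way to address a stored Bitmap under this organization is a composite key over index group, index, dimension, and time slice. The scheme below is an assumption for illustration only, not Naix's actual on-disk layout:

```java
// Hypothetical key scheme: each (group, index, dimension, value, time slice)
// tuple identifies one stored Bitmap. The separator and field order are
// assumptions, not documented Naix behavior.
public class IndexKey {
    static String key(String group, String index, String dim, String value, String day) {
        return String.join("|", group, index, dim, value, day);
    }

    public static void main(String[] args) {
        // e.g. the "active" bitmap for app version 8.2 on a given day
        System.out.println(key("meitu", "active", "version", "8.2", "2018-06-01"));
    }
}
```

A flat key like this is a natural fit for a read-only key-value store such as PalDB, where each lookup returns one serialized Bitmap.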

Naix provides several plugins for Bitmap generation:

Simple plugin: converts HDFS‑derived data to Bitmap.

MapReduce plugin: accelerates generation from hours to minutes on a 4‑node cluster.

Bitmap‑to‑Bitmap plugin: automatically derives periodic Bitmaps (daily → weekly, weekly → monthly).
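The Bitmap‑to‑Bitmap idea can be sketched as a union over stored daily slices, so deriving a weekly Bitmap never rescans raw data. `BitSet` again stands in for RoaringBitmap, and the sample data is illustrative:

```java
import java.util.BitSet;

public class WeeklyRollup {
    // A weekly bitmap is simply the OR of the seven daily bitmaps.
    static BitSet union(BitSet[] days) {
        BitSet weekly = new BitSet();
        for (BitSet day : days) weekly.or(day);
        return weekly;
    }

    public static void main(String[] args) {
        BitSet[] days = new BitSet[7];
        for (int d = 0; d < 7; d++) {
            days[d] = new BitSet();
            days[d].set(d);      // user d active only on day d
            days[d].set(100);    // user 100 active every day
        }
        BitSet weekly = union(days);
        System.out.println(weekly.cardinality()); // 8 distinct weekly-active users
    }
}
```

The same union applied to weekly slices yields the monthly Bitmap, which is why the plugin can run these rollups automatically on a schedule.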

To handle petabyte‑scale data, Naix adopts a sharding strategy: Bitmap data are split into fixed‑width shards, each replicated across multiple nodes. This design distributes storage across the cluster, enables parallel computation, reduces data copying and serialization overhead, and works around the 32‑bit integer range limit of a single Bitmap.
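The fixed-width mapping can be sketched as integer division: a 64-bit user ID splits into a shard number and an offset within that shard. The shard width below is an assumed value, chosen so every offset fits the 32-bit int range of a single RoaringBitmap:

```java
public class Sharding {
    // Assumed width: 2^31 bit positions per shard, so offsets fit in an int.
    static final long SHARD_WIDTH = 1L << 31;

    static int shardOf(long userId)  { return (int) (userId / SHARD_WIDTH); }
    static int offsetOf(long userId) { return (int) (userId % SHARD_WIDTH); }

    public static void main(String[] args) {
        long userId = 5_000_000_000L;          // beyond the 32-bit int range
        System.out.println(shardOf(userId));   // 2
        System.out.println(offsetOf(userId));  // 705032704
    }
}
```

Because shards cover disjoint ID ranges, a set operation decomposes into independent per-shard operations that different nodes can execute in parallel.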

Replication is performed at the index‑group level, ensuring fault tolerance (see Figure 8).

Space‑saving optimizations include switching from EWAH compression to RoaringBitmap, achieving a 67.3% reduction in storage and a 58% reduction in processing time. Additional storage‑layer experiments evaluated Redis, HBase, RocksDB, and PalDB, with PalDB delivering the best read‑only performance for Naix's workload.

Query execution follows an RPC model built on Netty and Protocol Buffers. Clients send requests to Transport nodes, which dispatch them to the appropriate shards; results are aggregated and returned in milliseconds for simple queries and in seconds to minutes for full cross‑dimensional analyses (see Figures 10‑12).
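The fan-out-and-aggregate step can be sketched as below, with local `BitSet` shards and a thread pool standing in for the Netty/Protobuf RPC layer; all names are illustrative, not Naix's actual interface:

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FanOut {
    // Dispatch the cardinality query to every shard in parallel and sum
    // the partial results (shards hold disjoint ID ranges, so sums are exact).
    static long totalCardinality(List<BitSet> shards) {
        ExecutorService pool = Executors.newFixedThreadPool(shards.size());
        try {
            List<Future<Integer>> parts = new ArrayList<>();
            for (BitSet shard : shards) parts.add(pool.submit(shard::cardinality));
            long total = 0;
            for (Future<Integer> part : parts) total += part.get();
            return total;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<BitSet> shards = List.of(new BitSet(), new BitSet(), new BitSet());
        shards.get(0).set(1); shards.get(0).set(2);  // shard 0: {1, 2}
        shards.get(1).set(9);                        // shard 1: {9}
        shards.get(2).set(4); shards.get(2).set(5); shards.get(2).set(6);

        System.out.println(totalCardinality(shards)); // 6
    }
}
```

In the real system the Transport node plays the aggregator role, and each `submit` would be an RPC to the data node holding that shard's replica.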

Future work includes expanding operational tools, further query‑performance optimizations, and adding SQL‑style query support to lower the learning curve for new users.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: Big Data, Bitmap, data storage, Naix
Written by

Meitu Technology

Curating Meitu's technical expertise, valuable case studies, and innovation insights. We deliver quality technical content to foster knowledge sharing between Meitu's tech team and outstanding developers worldwide.
