
Choosing Appropriate Redis Data Structures for Large‑Scale Statistics: Cardinality, Sorting, and Aggregation

This article explains how to choose among Redis data structures (Bitmap, HyperLogLog, Set, List, Sorted Set, and Hash) for large-scale statistics such as user login status, UV counting, ranking, and set aggregation, with concrete command examples and best-practice recommendations.

Code Ape Tech Column

Cardinality Statistics

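For counting unique elements (e.g., UV, unique visitors), a plain Set stores every member, so its memory grows linearly with the number of users and becomes expensive at millions of users. Redis therefore offers HyperLogLog, a probabilistic structure that estimates cardinality with a standard error of about 0.81% while using at most 12 KB per key, regardless of how many elements are added.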

Typical workflow: PFADD key userID1 userID2 ... adds IDs, and PFCOUNT key returns the approximate unique count. Multiple HyperLogLog structures can be merged with PFMERGE dest source1 source2 ...; the merged key estimates the cardinality of the union of the source sets.
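The PFADD / PFCOUNT / PFMERGE workflow can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: it assumes `r` is a redis-py style client (for example `redis.Redis`), and the helper names and any key names passed in are hypothetical.

```python
from typing import Iterable


def record_visits(r, day_key: str, user_ids: Iterable[str]) -> None:
    """Add user IDs to the HyperLogLog stored at day_key (PFADD)."""
    ids = list(user_ids)
    if ids:  # skip the call when there is nothing to add
        r.pfadd(day_key, *ids)


def unique_visitors(r, day_key: str) -> int:
    """Return the approximate number of distinct IDs seen (PFCOUNT)."""
    return r.pfcount(day_key)


def merged_unique_visitors(r, dest_key: str, day_keys: Iterable[str]) -> int:
    """Merge daily HyperLogLogs into dest_key (PFMERGE), then count the union."""
    r.pfmerge(dest_key, *day_keys)
    return r.pfcount(dest_key)
```

Because the counts are estimates, callers should treat the returned numbers as approximate (for example for dashboards), not as exact totals.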

Sorted Statistics

Redis provides ordered collections: List (insertion order) and Sorted Set (score‑based order). Lists are suitable for simple recent‑item feeds, while Sorted Sets are ideal for leaderboards where scores (e.g., play counts) change frequently.

Examples:

Insert a comment at the head of a list with LPUSH key comment and retrieve a range with LRANGE key start stop (both indexes are zero-based and inclusive).

Maintain a music ranking by adding songs with ZADD musicTop score song, incrementing scores via ZINCRBY musicTop 1 song, and fetching the top N with ZREVRANGE musicTop 0 N-1 WITHSCORES (since Redis 6.2, ZRANGE musicTop 0 N-1 REV WITHSCORES is the preferred equivalent).
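Both examples can be sketched in Python as follows. This is a rough sketch under stated assumptions: `r` is a redis-py style client (for example `redis.Redis`), the helper names are hypothetical, and the key names are whatever the caller chooses.

```python
def add_comment(r, key: str, comment: str) -> int:
    """LPUSH the comment onto the head of the list; returns the new list length."""
    return r.lpush(key, comment)


def latest_comments(r, key: str, count: int) -> list:
    """Return up to `count` newest comments via LRANGE key 0 count-1."""
    if count <= 0:
        # LRANGE key 0 -1 would return the whole list, so guard explicitly.
        return []
    return r.lrange(key, 0, count - 1)


def add_song(r, board: str, song: str, plays: float = 0) -> None:
    """ZADD board plays song: insert the song or overwrite its score."""
    r.zadd(board, {song: plays})


def record_play(r, board: str, song: str) -> float:
    """ZINCRBY board 1 song: bump the play count and return the new score."""
    return r.zincrby(board, 1, song)


def top_songs(r, board: str, n: int) -> list:
    """ZREVRANGE board 0 n-1 WITHSCORES: the n highest-scored songs, best first."""
    if n <= 0:
        return []
    return r.zrevrange(board, 0, n - 1, withscores=True)
```

The guards on `count` and `n` matter because both range commands treat the stop index as inclusive and interpret -1 as "last element". With a real client, creating it with `decode_responses=True` typically yields strings instead of bytes in the results.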

Aggregation Statistics

Redis Set operations enable intersection, union, and difference calculations, useful for scenarios such as finding common friends, daily new users, or total new users across days.

Examples:

Common friends: SINTERSTORE dest setA setB .

Daily new users: SDIFFSTORE newUsers day2Set day1Set (stores the members of day2Set that are not in day1Set; the argument order matters).

Total new users over two days: SUNIONSTORE totalNew day1Set day2Set .
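The three aggregation commands above can be sketched in Python as follows. This is a minimal sketch, assuming `r` is a redis-py style client (for example `redis.Redis`); the helper names are hypothetical and the destination keys are overwritten by each call.

```python
def common_friends(r, dest: str, set_a: str, set_b: str) -> int:
    """SINTERSTORE dest set_a set_b: store the intersection, return its size."""
    return r.sinterstore(dest, set_a, set_b)


def daily_new_users(r, dest: str, today_key: str, previous_key: str) -> int:
    """SDIFFSTORE dest today previous: store members only present today."""
    # Order matters: the result is today_key minus previous_key.
    return r.sdiffstore(dest, today_key, previous_key)


def total_users(r, dest: str, first_key: str, *more_keys: str) -> int:
    """SUNIONSTORE dest key [key ...]: store the union, return its size."""
    return r.sunionstore(dest, first_key, *more_keys)
```

Each helper returns the number of members written to `dest`, so the caller can read the count without fetching the resulting set.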

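Because set aggregation commands run in time that grows with the size of the sets involved and, since Redis executes commands on a single thread, can block other requests on large datasets, it is recommended to offload these calculations to a dedicated Redis cluster or to perform them on the client side after fetching the raw data.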
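The client-side option might look like the sketch below. It assumes `r` is a redis-py style client (for example `redis.Redis`) and reads each set incrementally with SSCAN (`sscan_iter` in redis-py) so that no single command has to return the whole set at once; the helper names and the batch size of 1000 are hypothetical choices.

```python
def fetch_set(r, key: str, batch: int = 1000) -> set:
    """Read a Redis Set incrementally with SSCAN instead of one large SMEMBERS."""
    # SSCAN may return a member more than once; building a set removes duplicates.
    return set(r.sscan_iter(key, count=batch))


def new_users_client_side(r, today_key: str, previous_key: str) -> set:
    """Compute 'today minus previous' in the application, not on the Redis server."""
    return fetch_set(r, today_key) - fetch_set(r, previous_key)
```

The trade-off is extra network transfer and client memory in exchange for keeping the heavy set arithmetic off the Redis server.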

Additional Data Types

The article also covers Bitmap for bit-level statistics (such as user login status) and HyperLogLog for approximate cardinality, both of which extend beyond the five basic Redis types (String, List, Hash, Set, and Sorted Set).
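For the Bitmap case, a login-status tracker could be sketched as follows. This is only an illustration under stated assumptions: `r` is a redis-py style client (for example `redis.Redis`), user IDs are small non-negative integers that can serve directly as bit offsets, and the helper names and the one-key-per-day layout are hypothetical.

```python
def mark_logged_in(r, day_key: str, user_id: int) -> None:
    """SETBIT day_key user_id 1: flag the user as logged in for that day."""
    r.setbit(day_key, user_id, 1)


def is_logged_in(r, day_key: str, user_id: int) -> bool:
    """GETBIT day_key user_id: unset or missing bits read as 0."""
    return r.getbit(day_key, user_id) == 1


def logged_in_count(r, day_key: str) -> int:
    """BITCOUNT day_key: number of users whose bit is set."""
    return r.bitcount(day_key)
```

Each user costs one bit, but the string grows to cover the highest offset written, so sparse or very large IDs would need to be mapped to compact offsets first.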

Tags: HyperLogLog, Redis, Bitmap, data structures, sorting, Aggregation, Cardinality
Written by

Code Ape Tech Column

Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
