
Mastering Redis Set Operations for Scalable Statistics and Aggregations

This article explains how to leverage Redis data structures such as Set, Sorted Set, Bitmap, and HyperLogLog to perform aggregation, sorted, binary‑state, and cardinality statistics efficiently in large‑scale applications, while addressing performance considerations and practical implementation details.

dbaplus Community

1. Aggregation Statistics

Redis Sets are well suited to aggregating multiple collections because they support intersection, difference, and union. In a social app, for example, intersection finds the friends two users have in common, difference finds the friends newly added on a given day, and union totals the friends added across several days. Commands demonstrated:

SINTERSTORE userid:new userid:20002 userid:20003

This stores the intersection of two user‑friend sets into userid:new.

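SDIFFSTORE userid:new userid:20201102 userid:20201101

The difference keeps the members of the first set (2 November) that are absent from the second (1 November), that is, the friends newly added on the later day. Argument order matters.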

SUNIONSTORE userid:new userid:20201102 userid:20201101

Union combines new friends from two days into a single set.

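Because Set intersection, difference, and union are expensive on large sets and run on Redis's main thread, they can block other requests. The article suggests off-loading aggregation to a replica or reading the members and aggregating on the client side. Note that the *STORE variants are write commands, so on a read-only replica use SINTER, SDIFF, or SUNION instead.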
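
As a minimal sketch of the client-side option, the three aggregations map directly onto Python's built-in set operators. The member lists and the function name aggregate_friend_sets are hypothetical; in practice the members would be read from Redis first (for instance with SMEMBERS or SSCAN), which is not shown here.

```python
def aggregate_friend_sets(earlier_day, later_day):
    """Client-side equivalents of SINTER, SDIFF and SUNION for two member lists."""
    earlier, later = set(earlier_day), set(later_day)
    return {
        "common": earlier & later,         # like SINTER earlier later
        "new_on_later_day": later - earlier,  # like SDIFF later earlier (order matters)
        "all": earlier | later,            # like SUNION earlier later
    }


# Hypothetical members of userid:20201101 and userid:20201102.
friends_20201101 = ["u1", "u2", "u3"]
friends_20201102 = ["u2", "u3", "u4", "u5"]

result = aggregate_friend_sets(friends_20201101, friends_20201102)
print(sorted(result["common"]))            # ['u2', 'u3']
print(sorted(result["new_on_later_day"]))  # ['u4', 'u5']
print(sorted(result["all"]))               # ['u1', 'u2', 'u3', 'u4', 'u5']
```

Doing the work on the client trades network transfer of the members for keeping the Redis instance responsive, so it tends to pay off only when the sets are large enough for server-side aggregation to cause noticeable blocking.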

2. Sorted Statistics

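When order matters, Redis provides two ordered collections: List (insertion order) and Sorted Set (score-based order). Lists are paginated by index with LRANGE, so elements pushed to the head between requests shift every position and a later page can repeat items. Sorted Sets are queried by score with ZRANGEBYSCORE, which is unaffected by insertions outside the requested range, and the score can carry any custom weight, making them more flexible for scenarios beyond simple time-based ordering.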
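
The difference between index-based and score-based paging can be sketched in plain Python. This is only a model of the semantics, not Redis client code: list_page mimics LRANGE's inclusive index range, zset_range_by_score mimics ZRANGEBYSCORE, and the comment IDs and integer scores are made-up stand-ins for real data and timestamps.

```python
from bisect import bisect_left, bisect_right


def list_page(items, start, stop):
    """Model of LRANGE: inclusive index range over a newest-first list."""
    return items[start:stop + 1]


def zset_range_by_score(entries, lo, hi):
    """Model of ZRANGEBYSCORE: entries is a list of (score, member) sorted by score."""
    scores = [score for score, _ in entries]
    return [member for _, member in entries[bisect_left(scores, lo):bisect_right(scores, hi)]]


# List: newest comment sits at the head, as it would after LPUSH.
comments = ["c4", "c3", "c2", "c1"]
page1 = list_page(comments, 0, 1)
comments.insert(0, "c5")            # a new comment arrives between two page requests
page2 = list_page(comments, 2, 3)
print(page1, page2)                 # ['c4', 'c3'] ['c3', 'c2']  (c3 is repeated)

# Sorted set: the score is a hypothetical timestamp.
scored = [(1, "c1"), (2, "c2"), (3, "c3"), (4, "c4")]
z_page1 = zset_range_by_score(scored, 3, 4)
scored.append((5, "c5"))            # the new comment gets a higher score
z_page2 = zset_range_by_score(scored, 1, 2)
print(z_page1, z_page2)             # ['c3', 'c4'] ['c1', 'c2']  (no overlap)
```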

3. Binary State Statistics

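For true/false flags such as signed in (1) or not (0), a Redis Bitmap (a bit array stored in a String) is memory-efficient. Offsets start at 0, so with one Bitmap per user per month, day N of the month maps to offset N - 1. Example workflow for user 10001's attendance in November 2020:

SETBIT userid:10001:202011 1 1

This marks day 2 as signed in. To check the flag:

GETBIT userid:10001:202011 1

To count total sign-ins in the month:

BITCOUNT userid:10001:202011

For continuous-day analysis across all users, keep one Bitmap per day with each user mapped to a bit offset, combine the daily Bitmaps with BITOP (AND, OR, XOR), and run BITCOUNT on the result. An AND across the days, for example, leaves a 1 only for users who signed in on every one of them. The article estimates 1 × 10⁸ bits ≈ 12 MB per day for 100 million users, so 20 days ≈ 240 MB.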
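
To make the bit layout and the memory estimate concrete, here is a small in-memory model written for illustration. The Bitmap class and bitop_and function are hypothetical helpers, not part of any Redis client; they follow Redis's convention that offset 0 is the most significant bit of the first byte and that BITOP treats shorter inputs as zero-padded. The user offsets are made up.

```python
class Bitmap:
    """Tiny in-memory model of a Redis bitmap."""

    def __init__(self):
        self.data = bytearray()

    def setbit(self, offset, value):
        """Set one bit, growing the array as needed; returns the previous bit."""
        byte, bit = divmod(offset, 8)
        if byte >= len(self.data):
            self.data.extend(b"\x00" * (byte + 1 - len(self.data)))
        mask = 0x80 >> bit
        old = 1 if self.data[byte] & mask else 0
        if value:
            self.data[byte] |= mask
        else:
            self.data[byte] &= ~mask & 0xFF
        return old

    def getbit(self, offset):
        byte, bit = divmod(offset, 8)
        if byte >= len(self.data):
            return 0  # bits beyond the end read as 0
        return 1 if self.data[byte] & (0x80 >> bit) else 0

    def bitcount(self):
        return sum(bin(b).count("1") for b in self.data)


def bitop_and(*bitmaps):
    """Model of BITOP AND: shorter bitmaps are zero-padded to the longest."""
    length = max(len(b.data) for b in bitmaps)
    out = Bitmap()
    out.data = bytearray(b"\xff" * length)
    for b in bitmaps:
        padded = bytes(b.data).ljust(length, b"\x00")
        out.data = bytearray(x & y for x, y in zip(out.data, padded))
    return out


# One user's month: offsets 1 and 2 are days 2 and 3.
month = Bitmap()
month.setbit(1, 1)
month.setbit(2, 1)
print(month.getbit(1), month.getbit(0), month.bitcount())  # 1 0 2

# One bitmap per day, one bit per (hypothetical) user offset.
day1, day2 = Bitmap(), Bitmap()
for user in (3, 7, 9):
    day1.setbit(user, 1)
for user in (7, 9, 12):
    day2.setbit(user, 1)
both_days = bitop_and(day1, day2)
print(both_days.bitcount())  # 2 (users 7 and 9 signed in on both days)

# Memory estimate: 10**8 users at one bit each.
bytes_per_day = 10**8 // 8
print(bytes_per_day / 1e6)  # 12.5 (MB per day, which the article rounds to 12)
```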

4. Cardinality Statistics

Counting unique elements (e.g., page UV) can be done with Sets, but memory usage grows with every distinct member. Redis HyperLogLog instead returns an approximate cardinality, using at most about 12 KB per key to count up to 2⁶⁴ elements, with a standard error of about 0.81%. Example commands:

PFADD p1:uv 10001 10002 10003 10004
PFCOUNT p1:uv

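HyperLogLog deduplicates automatically: adding an element that has already been counted does not change the estimate. When an exact count is required, use a Set instead.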
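
The 12 KB and 0.81% figures can be reproduced with a little arithmetic, assuming Redis's dense HyperLogLog encoding of 16384 registers at 6 bits each and the usual HyperLogLog standard-error estimate of 1.04 / sqrt(m) for m registers. The one-million-visitor page is a made-up example.

```python
import math

REGISTERS = 16384       # 2**14 registers in the dense encoding
BITS_PER_REGISTER = 6

# Size of the register array, excluding the small per-key header.
dense_bytes = REGISTERS * BITS_PER_REGISTER // 8
print(dense_bytes)      # 12288 bytes, i.e. 12 KB

# Standard error of the estimate for m registers.
standard_error = 1.04 / math.sqrt(REGISTERS)
print(f"{standard_error:.4%}")  # 0.8125%

# For a hypothetical page with 1,000,000 true unique visitors,
# a typical deviation is on the order of:
typical_deviation = round(1_000_000 * standard_error)
print(typical_deviation)        # 8125
```

The size stays fixed no matter how many elements are added, which is why the structure suits UV counting where a deviation of this order is acceptable.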

Conclusion

The article summarizes the strengths and limitations of each Redis data type for statistical use cases: Sets support intersection, difference, and union; Sorted Sets add score-based ordering; Bitmaps excel at binary state tracking with minimal memory; and HyperLogLog provides ultra-compact approximate cardinality. Choosing the right structure means balancing accuracy, performance, and memory consumption.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
