Mastering Redis Set Operations for Scalable Statistics and Aggregations

This article explains how to leverage Redis data structures such as Set, Sorted Set, Bitmap, and HyperLogLog to perform aggregation, sorted, binary‑state, and cardinality statistics efficiently in large‑scale applications, while addressing performance considerations and practical implementation details.

dbaplus Community

Aug 17, 2023

1. Aggregation Statistics

Redis Sets are well suited to aggregating multiple collections because they support intersection, difference, and union directly. In a social app, typical uses are finding the friends two users have in common (intersection), the friends added on a given day (difference), and all friends added across two days (union). Commands demonstrated:

SINTERSTORE userid:new userid:20002 userid:20003

This stores the intersection of two user‑friend sets into userid:new.

SDIFFSTORE userid:new userid:20201102 userid:20201101

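The difference keeps the members of the first set (userid:20201102) that are absent from the second (userid:20201101), that is, the friends newly added on the second day.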

SUNIONSTORE userid:new userid:20201102 userid:20201101

Union combines new friends from two days into a single set.

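Set intersection, difference, and union can be expensive on large sets and may block the Redis instance while they run. To avoid this, run the aggregation on a dedicated replica, or read the members to the client and aggregate there. The *STORE variants are write commands, so on a read-only replica use SINTER, SDIFF, and SUNION instead, which return the result without storing it.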
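As a minimal sketch of the client-side option, assuming the set members have already been read from Redis (for example with SMEMBERS), the three aggregations map onto Python's built-in set operators. The member names and the `aggregate` helper are hypothetical sample data for illustration, not part of the original article.

```python
# In-memory stand-ins for two Redis Sets already fetched by the client.
# The member names are made-up sample data.
friends_20201101 = {"u1", "u2", "u3"}
friends_20201102 = {"u2", "u3", "u4", "u5"}


def aggregate(day1, day2):
    """Compute the three aggregations on the client instead of in Redis."""
    return {
        "common": day1 & day2,  # same members SINTER would return
        "new": day2 - day1,     # same members SDIFF day2 day1 would return
        "all": day1 | day2,     # same members SUNION would return
    }


result = aggregate(friends_20201101, friends_20201102)
print(sorted(result["common"]))
print(sorted(result["new"]))
print(sorted(result["all"]))
```

Note that the argument order matters only for the difference: swapping the two days would return the friends present on the first day but missing on the second.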

2. Sorted Statistics

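When order matters, Redis provides two ordered collections: List (insertion order) and Sorted Set (score-based order). Lists paginate by position with LRANGE, so the contents of a page shift whenever new elements are pushed between two reads. Sorted Sets paginate by score with ZRANGEBYSCORE, so a given score range returns the same elements even after inserts outside that range. The score can also be any custom weight, not just a timestamp, which makes Sorted Sets the more flexible choice for frequently updated rankings and feeds.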
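The difference between position-based and score-based paging can be illustrated with a small in-memory model. This is a sketch under stated assumptions: the helpers `lrange` and `zrangebyscore` are hypothetical Python functions that mimic the inclusive range semantics of the Redis commands of the same name, and the comment IDs and scores are made-up sample data.

```python
def lrange(lst, start, stop):
    """Mimic LRANGE for non-negative indexes: positions start..stop, inclusive."""
    return lst[start:stop + 1]


def zrangebyscore(zset, lo, hi):
    """Mimic ZRANGEBYSCORE: members with lo <= score <= hi, ordered by score."""
    hits = [(score, member) for member, score in zset.items() if lo <= score <= hi]
    return [member for _, member in sorted(hits)]


# List model: newest comment at the head, as LPUSH would leave it.
comments_list = ["c3", "c2", "c1"]
list_page1 = lrange(comments_list, 0, 1)
comments_list.insert(0, "c4")             # a new comment arrives between reads
list_page2 = lrange(comments_list, 2, 3)  # positions have shifted by one

# Sorted Set model: member -> score (for example a timestamp).
comments_zset = {"c1": 1, "c2": 2, "c3": 3}
zset_page1 = zrangebyscore(comments_zset, 2, 3)
comments_zset["c4"] = 4                   # the same new comment arrives
zset_page2 = zrangebyscore(comments_zset, 0, 1)

print(list_page1, list_page2)  # "c2" shows up on both list pages
print(zset_page1, zset_page2)  # no overlap between score ranges
```

In the list model the second page repeats an element the reader has already seen, while the score ranges stay disjoint regardless of later inserts.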

3. Binary State Statistics

For true/false flags such as signed in (1) or not (0), a Redis Bitmap (a bit array stored in a String) is memory-efficient. Example workflow for one user's monthly attendance, with one bit per day and offsets starting at 0:

SETBIT userid:10001:202011 1 1

This marks day 2 as signed in. To check the flag:

GETBIT userid:10001:202011 1

To count total sign-ins in the month:

BITCOUNT userid:10001:202011

To find users who signed in on several consecutive days, keep one Bitmap per day with one bit per user, combine the daily Bitmaps with BITOP AND (OR and XOR are also available), and run BITCOUNT on the result. For 100 million users, each daily Bitmap holds 1 × 10⁸ bits ≈ 12 MB, so 20 days take about 240 MB.
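The bit-level behavior and the memory estimate above can be checked with a small in-memory model. This is a sketch, not Redis client code: the `Bitmap` class and `bitop_and` function are hypothetical helpers that mimic SETBIT, GETBIT, BITCOUNT, and BITOP AND, including the Redis convention that offset 0 is the most significant bit of the first byte. The user offsets are made-up sample data.

```python
class Bitmap:
    """Tiny model of a Redis string used as a bit array."""

    def __init__(self):
        self.buf = bytearray()

    def setbit(self, offset, value):
        byte, bit = divmod(offset, 8)
        if byte >= len(self.buf):
            # Like Redis, grow the string with zero bytes as needed.
            self.buf.extend(bytes(byte + 1 - len(self.buf)))
        mask = 0x80 >> bit  # offset 0 is the most significant bit
        old = 1 if self.buf[byte] & mask else 0
        if value:
            self.buf[byte] |= mask
        else:
            self.buf[byte] &= ~mask & 0xFF
        return old  # SETBIT returns the previous bit value

    def getbit(self, offset):
        byte, bit = divmod(offset, 8)
        if byte >= len(self.buf):
            return 0  # bits beyond the end read as 0
        return 1 if self.buf[byte] & (0x80 >> bit) else 0

    def bitcount(self):
        return sum(bin(b).count("1") for b in self.buf)


def bitop_and(*bitmaps):
    """Model of BITOP AND: shorter inputs are treated as zero-padded."""
    out = Bitmap()
    out.buf = bytearray(max(len(b.buf) for b in bitmaps))
    for i in range(len(out.buf)):
        acc = 0xFF
        for b in bitmaps:
            acc &= b.buf[i] if i < len(b.buf) else 0
        out.buf[i] = acc
    return out


# Per-user monthly bitmap: offset 1 is day 2.
november = Bitmap()
november.setbit(1, 1)

# Per-day bitmaps: one bit per user offset (hypothetical users 3, 7, 9).
day1, day2 = Bitmap(), Bitmap()
for user in (3, 7, 9):
    day1.setbit(user, 1)
for user in (3, 9):
    day2.setbit(user, 1)
both_days = bitop_and(day1, day2)

# Memory estimate: one bit per user, 10**8 users.
bytes_per_day = 10**8 // 8          # 12,500,000 bytes, about 12 MB
bytes_for_20_days = 20 * bytes_per_day
print(november.getbit(1), november.bitcount(), both_days.bitcount())
```

The estimate works out to 12.5 million bytes per day, which is where the rounded figures of about 12 MB per day and about 240 MB for 20 days come from.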

4. Cardinality Statistics

Counting unique elements (e.g., page UV) can be done with Sets, but memory usage grows with every distinct member. Redis HyperLogLog instead estimates the cardinality in a fixed amount of memory: about 12 KB per key for up to 2⁶⁴ elements, with a standard error of about 0.81%. Example commands:

PFADD p1:uv 10001 10002 10003 10004

PFCOUNT p1:uv

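PFADD records the user IDs and PFCOUNT returns the estimated number of distinct ones. Adding an ID that was already recorded does not increase the count, so no separate deduplication is needed. Because the result is approximate, use a Set when an exact count is required.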
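To show why a fixed number of small registers is enough to estimate a count, here is a simplified HyperLogLog sketch in Python. It is an illustration under stated assumptions, not Redis's implementation: the class `HyperLogLogSketch` is hypothetical, it hashes with SHA-1 truncated to 64 bits, and it uses the classic estimator with a linear-counting correction for small counts. It does share one parameter with Redis, 16384 registers, which at 6 bits each is where the 12 KB figure comes from.

```python
import hashlib
import math


class HyperLogLogSketch:
    """Simplified HyperLogLog with 2**p registers (illustrative only)."""

    def __init__(self, p=14):
        self.p = p
        self.m = 1 << p                    # 16384 registers for p=14
        self.registers = bytearray(self.m)

    def add(self, item):
        digest = hashlib.sha1(str(item).encode()).digest()
        h = int.from_bytes(digest[:8], "big")      # 64-bit hash value
        index = h >> (64 - self.p)                 # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank           # keep the maximum only

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: linear counting on empty registers.
            return round(m * math.log(m / zeros))
        return round(raw)


hll = HyperLogLogSketch()
for i in range(10000):
    hll.add(f"user:{i}")
first_estimate = hll.count()

# Re-adding the same IDs cannot change any register, so the estimate is stable.
for i in range(10000):
    hll.add(f"user:{i}")
second_estimate = hll.count()
print(first_estimate, second_estimate)
```

Each register only remembers the largest rank it has seen, which is why duplicates have no effect and why memory stays constant no matter how many elements are added.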

Conclusion

Each Redis data type has its own strengths and limits for statistical use cases: Sets support intersection, difference, and union; Sorted Sets add score-based ordering and stable range queries; Bitmaps track binary state with minimal memory; and HyperLogLog provides a very compact approximate cardinality. Choosing the right structure is a trade-off between accuracy, performance, and memory consumption.
