How to Use Redis Sets for Powerful Intersection, Union, and Difference Statistics
This article explains how to leverage Redis Set, Sorted Set, Bitmap, and HyperLogLog data structures for various statistical scenarios such as user sign‑in tracking, friend recommendations, comment ranking, and UV counting, while addressing performance and memory considerations.
Introduction
Redis is often used merely as a cache, but its native data structures, especially Set, Sorted Set, Bitmap, and HyperLogLog, are powerful tools for large-scale statistical analysis in sign-in systems, e-commerce, and social networking applications.
Aggregation Statistics
Aggregating multiple sets allows you to compute intersections, unions, and differences, which are essential for tasks like finding common friends, daily new friends, or recent comments.
Intersection
To find common friends between two users, store each user's friend IDs in a set and use SINTERSTORE:
SINTERSTORE userid:new userid:20002 userid:20003
After execution, the key userid:new contains the intersection of userid:20002 and userid:20003.
Difference
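For daily new-friend counts, compute the difference between the sets recorded on two consecutive days:
SDIFFSTORE userid:new userid:20201102 userid:20201101
The resulting set userid:new holds the friends present on 2020-11-02 that were not present on 2020-11-01. Argument order matters: SDIFFSTORE starts from the first source set and removes the members of every set listed after it.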
Union
To collect every friend recorded across two days, union the daily sets:
SUNIONSTORE userid:new userid:20201102 userid:20201101
The key userid:new now contains every friend ID that appears in either daily set, and the command's integer reply is the size of that set.
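The three commands above can be modeled in a few lines of plain Python, which makes the semantics easy to check without a server. This is only an in-memory sketch: a dict stands in for the Redis keyspace, Python sets stand in for Redis Sets, and the member IDs (u1 to u5) and the destination keys other than userid:new are hypothetical.

```python
# In-memory model of SINTERSTORE / SDIFFSTORE / SUNIONSTORE.
# Assumption: a dict plays the role of the keyspace; nothing here talks to Redis.
store = {
    "userid:20201101": {"u1", "u2", "u3"},
    "userid:20201102": {"u2", "u3", "u4", "u5"},
}


def sinterstore(dest, *keys):
    """Keep only members present in every source set; return the result size."""
    store[dest] = set.intersection(*(store.get(k, set()) for k in keys))
    return len(store[dest])


def sdiffstore(dest, first, *others):
    """Start from the first set and remove members of all later sets."""
    store[dest] = store.get(first, set()).difference(
        *(store.get(k, set()) for k in others)
    )
    return len(store[dest])


def sunionstore(dest, *keys):
    """Collect members that appear in at least one source set."""
    store[dest] = set().union(*(store.get(k, set()) for k in keys))
    return len(store[dest])


common = sinterstore("userid:common", "userid:20201101", "userid:20201102")
added = sdiffstore("userid:new", "userid:20201102", "userid:20201101")
# Swapping the source keys answers a different question: who disappeared.
gone = sdiffstore("userid:gone", "userid:20201101", "userid:20201102")
total = sunionstore("userid:all", "userid:20201101", "userid:20201102")
```

With this sample data, userid:new ends up as {u4, u5} while the swapped call yields {u1}, which shows why the order of the source keys matters for a difference but not for an intersection or union.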
Sorting Statistics
Ordered collections are needed for scenarios like displaying the latest comments first. Redis provides two ordered types:
List: preserves insertion order; pagination via LRANGE.
Sorted Set: orders by a user-defined score; pagination via ZRANGEBYSCORE.
Lists are simple but limited to chronological order, while Sorted Sets allow flexible ranking based on any weight (e.g., timestamp, popularity).
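To illustrate score-based pagination, here is a small in-memory sketch of fetching comments newest first by score, roughly what a reverse range by score with a LIMIT does on a Sorted Set. It assumes each comment's score is a unique Unix timestamp; the comment IDs and the helper name page_before are hypothetical, and a plain dict stands in for the Sorted Set.

```python
# Model of a Sorted Set as {member: score}; scores are assumed to be
# unique Unix timestamps, so a score can serve as a paging cursor.
comments = {
    "c1": 1604275200,
    "c2": 1604275260,
    "c3": 1604275320,
    "c4": 1604275380,
}


def page_before(zset, before_score, count):
    """Return up to `count` (member, score) pairs with score < before_score,
    ordered from highest score to lowest."""
    older = [(m, s) for m, s in zset.items() if s < before_score]
    older.sort(key=lambda ms: (ms[1], ms[0]), reverse=True)
    return older[:count]


# First page: the two newest comments.
first = page_before(comments, float("inf"), 2)

# A new comment arrives between the two page requests.
comments["c5"] = 1604275440

# Second page: continue below the last score already shown.
second = page_before(comments, first[-1][1], 2)
```

Because the second request is anchored to a score rather than to a position, the newly added c5 does not shift the page boundary: the second page is still c2 and c1. Position-based paging over a List would repeat c3 in the same situation.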
Binary State Statistics
When a value is simply 0 or 1—such as sign‑in status—Redis Bitmap (implemented on top of String) offers extreme space efficiency. Each bit represents a user’s daily sign‑in.
Typical commands:
SETBIT key offset 1 – mark a day as signed in.
GETBIT key offset – check sign-in status.
BITCOUNT key – count total signed-in days.
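Key design example: a key of the form userid:<id>:<yyyyMM> stores one month of sign-ins for a specific user, one bit per day.
SETBIT userid:10001:202011 1 1
GETBIT userid:10001:202011 1
BITCOUNT userid:10001:202011
Offsets are zero-based, so if offset 0 stands for the 1st of the month, offset 1 above is 2 November.
For continuous-day analysis across all users (e.g., counting users who signed in for 20 consecutive days), use a second layout: one bitmap per day, with each user mapped to a fixed bit offset. Combine the 20 daily bitmaps with BITOP AND and then BITCOUNT the result.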
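The per-day layout can be sketched in plain Python to show what the AND followed by a count produces. This is an in-memory model, not Redis client code: bytearrays stand in for the String values, the helper names mirror the commands, and the three days and the small user offsets are hypothetical. The bit order follows Redis, where offset 0 is the most significant bit of the first byte.

```python
def setbit(buf, offset, value):
    """Set one bit and return its previous value, growing the buffer if needed."""
    byte, bit = divmod(offset, 8)
    if byte >= len(buf):
        buf.extend(b"\x00" * (byte + 1 - len(buf)))
    mask = 0x80 >> bit  # offset 0 is the most significant bit of byte 0
    old = 1 if buf[byte] & mask else 0
    if value:
        buf[byte] |= mask
    else:
        buf[byte] &= ~mask & 0xFF
    return old


def getbit(buf, offset):
    """Read one bit; offsets past the end of the buffer read as 0."""
    byte, bit = divmod(offset, 8)
    if byte >= len(buf):
        return 0
    return 1 if buf[byte] & (0x80 >> bit) else 0


def bitcount(buf):
    """Count the bits set to 1."""
    return sum(bin(b).count("1") for b in buf)


def bitop_and(*bufs):
    """AND several bitmaps; shorter inputs are treated as zero-padded."""
    size = max(len(b) for b in bufs)
    out = bytearray(b"\xff" * size)
    for b in bufs:
        padded = bytes(b) + b"\x00" * (size - len(b))
        out = bytearray(x & y for x, y in zip(out, padded))
    return out


# One bitmap per day; each bit offset is a (hypothetical) user ID.
signins = {"day1": [1, 2, 5], "day2": [2, 5, 9], "day3": [2, 3, 5]}
days = {}
for day, users in signins.items():
    days[day] = bytearray()
    for user in users:
        setbit(days[day], user, 1)

every_day = bitop_and(*days.values())
streak_users = bitcount(every_day)  # users present on all three days
```

Only users 2 and 5 appear in all three daily bitmaps, so the count is 2; the same idea extends to 20 daily keys.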
Cardinality Statistics
Counting unique elements (UV) with a plain Set can be memory‑intensive for massive traffic. Redis HyperLogLog provides approximate cardinality with a fixed ~12 KB footprint, at the cost of a small error (~0.81%).
PFADD p1:uv 10001 10002 10003 10004
PFCOUNT p1:uv
When exact counts are required, fall back to a Set.
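The 12 KB and 0.81% figures follow from the structure's parameters, and a short calculation makes the trade-off against a Set concrete. The sketch below assumes the commonly documented Redis layout of 16384 registers of 6 bits each and the usual HyperLogLog error estimate of 1.04 / sqrt(m). The Set figure is a deliberately rough lower bound that counts only 8 bytes of raw payload per ID and ignores all per-element overhead; the function name is hypothetical.

```python
import math

REGISTERS = 16384          # 2**14 registers (assumed Redis HyperLogLog layout)
BITS_PER_REGISTER = 6

# Size of the register array in bytes: 16384 * 6 / 8.
dense_bytes = REGISTERS * BITS_PER_REGISTER // 8

# Standard error of the estimate: 1.04 / sqrt(number of registers).
standard_error = 1.04 / math.sqrt(REGISTERS)


def set_payload_bytes(unique_visitors, bytes_per_id=8):
    """Rough lower bound for an exact Set: raw ID bytes only, no overhead."""
    return unique_visitors * bytes_per_id


hll_kib = dense_bytes / 1024                       # 12.0
error_percent = standard_error * 100               # 0.8125
set_mb = set_payload_bytes(10_000_000) / 1_000_000  # 80.0
```

For ten million unique visitors, even this optimistic Set estimate is about 80 MB per page, while the HyperLogLog stays at roughly 12 KB regardless of how many IDs are added.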
Overall Summary
Redis offers a rich set of data structures for statistical workloads. Set and Sorted Set both support intersection and union. Difference is available on Set through SDIFF and SDIFFSTORE; Sorted Set gained ZDIFF and ZDIFFSTORE only in Redis 6.2, so on older versions only Set supports it. Bitmap can perform bitwise AND/OR/XOR across multiple keys with BITOP, which suits binary state tracking. List and Sorted Set both enable ordered retrieval, but Sorted Set provides flexible scoring.
For high‑cardinality, low‑precision counting, use HyperLogLog; for precise counting, use Set.
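When dealing with very large datasets, consider running the aggregation on a dedicated replica or on the client side so that it does not block the primary Redis instance. The STORE variants (SINTERSTORE, SDIFFSTORE, SUNIONSTORE) write a key, which a read-only replica rejects, so on a replica use the read-only forms SINTER, SDIFF, and SUNION.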
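A client-side aggregation might look like the following minimal sketch. It assumes a client object that exposes an sscan_iter(name, match=None, count=None) method in the style of the redis-py library; the helper names, the batch size, and the keys are illustrative, and the code does not import or require any particular client.

```python
def read_members(client, key, batch=1000):
    """Read a Set incrementally with SSCAN-style iteration rather than one
    large SMEMBERS call. Wrapping in set() also drops any duplicates the
    scan may return."""
    return set(client.sscan_iter(key, count=batch))


def common_friends(client, key_a, key_b):
    """Intersect two Sets in the application instead of on the server."""
    return read_members(client, key_a) & read_members(client, key_b)


def new_friends(client, today_key, yesterday_key):
    """Members of today's Set that are absent from yesterday's Set."""
    return read_members(client, today_key) - read_members(client, yesterday_key)
```

With redis-py, for example, the client passed in could be a connection to a replica, and a call such as common_friends(client, "userid:20002", "userid:20003") would then keep the set arithmetic off the primary. The cost is network transfer of both Sets, so this fits best when the Sets are large enough to block the server but still small enough to ship to the application.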
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.