
How to Count Website Visits with Redis: Hash, Bitset, and HyperLogLog

This article explains three Redis-based techniques (hashes, bitsets, and the probabilistic HyperLogLog structure) for counting the daily unique visitors of each page, covering the required commands, implementation steps, advantages, and limitations for high-traffic sites.

ITPUB

Background

Pinduoduo, with hundreds of millions of users, needs an efficient way to count the daily unique visitors of each page. Redis, a fast in-memory data store, offers several data structures that can be used for this purpose.

Method 1: Using a Hash

A Redis hash can be stored under a composite key (e.g., URI plus date), with one field per visiting user. When a user visits:

If the user is logged in, use their user ID as the field.

If not logged in, generate a random identifier for the field and keep it on the client (for example, in a cookie) so that repeat visits reuse the same field.

Store a constant value (e.g., 1) with the HSET command. To obtain the daily unique visitor count for a page, call HLEN on the hash key.
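The steps above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: it assumes a redis-py-style client object exposing hset and hlen, and the key prefix "uv:hash:" and all function names are hypothetical choices made for illustration.

```python
from datetime import date


def hash_key(uri: str, day: date) -> str:
    """Build the composite key from URI and date (the key layout is an assumption)."""
    return f"uv:hash:{uri}:{day.isoformat()}"


def record_visit_hash(client, uri: str, day: date, visitor_id: str) -> None:
    # HSET key field 1: setting an existing field again does not add a new one,
    # so repeat visits by the same visitor leave the count unchanged.
    client.hset(hash_key(uri, day), visitor_id, 1)


def unique_visitors_hash(client, uri: str, day: date) -> int:
    # HLEN returns the number of fields, i.e. the number of distinct visitors.
    return client.hlen(hash_key(uri, day))
```

With redis-py installed and a server running, client could be created with redis.Redis(); any object offering the same two methods works equally well.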

Pros: Simple to implement, fast queries, high accuracy.

Cons: Memory grows with the number of fields, i.e., one entry per unique visitor per page per day; on sites with hundreds of millions of users the hashes can become too large.

Method 2: Using a Bitset

A bitset stores one bit per possible user ID: the user's numeric ID is the bit offset, and a set bit means that user has visited. Compared with storing each ID as a 32-bit integer, this reduces memory usage by up to 32×.

Redis provides the SETBIT command to set a user's bit and GETBIT to check whether a given user has visited. BITCOUNT returns the number of set bits, i.e., the unique visitor count, and can be run at any point during the day.
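A minimal Python sketch of this workflow is shown below. It assumes a redis-py-style client with setbit, getbit, and bitcount methods, and that user IDs are non-negative integers small enough to serve as bit offsets (Redis limits offsets to below 2^32). The key prefix "uv:bits:" and the function names are hypothetical.

```python
from datetime import date


def bitmap_key(uri: str, day: date) -> str:
    """Build the per-page, per-day key (the key layout is an assumption)."""
    return f"uv:bits:{uri}:{day.isoformat()}"


def record_visit_bitmap(client, uri: str, day: date, user_id: int) -> None:
    if user_id < 0:
        raise ValueError("user_id must be a non-negative integer offset")
    # SETBIT key offset 1: the user's numeric ID is used as the bit offset.
    client.setbit(bitmap_key(uri, day), user_id, 1)


def has_visited(client, uri: str, day: date, user_id: int) -> bool:
    # GETBIT returns 1 if this user's bit is set, otherwise 0.
    return client.getbit(bitmap_key(uri, day), user_id) == 1


def unique_visitors_bitmap(client, uri: str, day: date) -> int:
    # BITCOUNT returns the number of set bits, i.e. distinct visitors.
    return client.bitcount(bitmap_key(uri, day))
```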

Pros: Very low memory footprint; suitable for massive traffic.

Cons: The bitset's size is determined by the largest user ID, not by the number of visitors, so with sparse user IDs it may consume more memory than a hash; anonymous users also need an extra mapping to integer offsets.
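To make the sparse-ID drawback concrete, the following back-of-the-envelope calculation estimates the bitmap's payload size from the highest offset that has been set. It is a rough sketch that ignores Redis's per-key and allocator overhead; the function name bitmap_bytes is hypothetical.

```python
def bitmap_bytes(max_user_id: int) -> int:
    """Bytes needed to hold a bitmap whose highest set offset is max_user_id."""
    # The bitmap must extend to the byte containing the highest offset.
    return max_user_id // 8 + 1


# A page whose visitors include a user with ID 1,000,000,000:
print(bitmap_bytes(1_000_000_000))  # 125000001 bytes, about 125 MB

# A page whose visitors all have dense IDs in the range 0..999:
print(bitmap_bytes(999))            # 125 bytes
```

The first case costs about 125 MB for that one key no matter how few users actually visited, which is where a hash can end up cheaper.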

Method 3: Using a Probabilistic Algorithm (HyperLogLog)

When exact counts are not required, Redis’s built‑in HyperLogLog offers an approximate cardinality estimator with minimal memory usage (≈12 KB per key). The workflow is:

On each visit, execute PFADD with a unique identifier (e.g., user ID or random token).

To retrieve the estimated unique visitor count, run PFCOUNT.
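The two steps above might look like this in Python. As before, this is only a sketch: it assumes a redis-py-style client with pfadd and pfcount methods, and the key prefix "uv:hll:" and the function names are hypothetical.

```python
from datetime import date


def hll_key(uri: str, day: date) -> str:
    """Build the per-page, per-day key (the key layout is an assumption)."""
    return f"uv:hll:{uri}:{day.isoformat()}"


def record_visit_hll(client, uri: str, day: date, visitor_id: str) -> None:
    # PFADD key element: adding the same element again does not change the estimate.
    client.pfadd(hll_key(uri, day), visitor_id)


def estimated_unique_visitors(client, uri: str, day: date) -> int:
    # PFCOUNT returns the approximate number of distinct elements added so far.
    return client.pfcount(hll_key(uri, day))
```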

The algorithm has a standard error of about 0.81%, which is acceptable for large-scale analytics.
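Both the error figure and the 12 KB size can be checked with a short calculation. It uses the usual HyperLogLog standard-error approximation and assumes Redis's layout of m = 16384 registers of 6 bits each:

```latex
\sigma \approx \frac{1.04}{\sqrt{m}} = \frac{1.04}{\sqrt{16384}} = \frac{1.04}{128} \approx 0.0081 = 0.81\%

\text{memory} = 16384 \times 6 \text{ bits} = 98304 \text{ bits} = 12288 \text{ bytes} = 12 \text{ KB}
```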

Pros: Extremely small memory consumption; ideal for sites with hundreds of millions of users.

Cons: It cannot tell whether a particular user has visited; the count is an estimate, not an exact figure.

Conclusion

Redis provides three practical approaches for counting unique page visitors: hashes for exact counts at a memory cost that grows with the number of visitors, bitsets for memory-efficient exact counts when user IDs are dense integers, and HyperLogLog for approximate counts in about 12 KB per key. The choice depends on traffic volume, accuracy requirements, and available resources.

