How Bloom Filters Supercharge Redis: From BitSet to Redisson
This article explains the Bloom filter data structure, its hash‑based design and false‑positive behavior, then demonstrates practical implementations using Java BitSet, Guava, and Redisson, covering initialization, configuration, usage tips, and performance considerations for Redis bitmap storage.
Bloom Filter Overview
A Bloom filter is a probabilistic data structure invented by Bloom that quickly determines whether an element is definitely not in a set or possibly in it, using multiple hash functions and a fixed‑size bit array.
When all bits for a hashed element are set to 1, the element may exist; if any bit is 0, it definitely does not exist. This yields fast queries with a controllable false‑positive rate.
Design Origin
The filter relies on a limited‑length binary bitmap where each hash function maps an element to a bit index. Only when all corresponding bits are 1 is the element considered possibly present.
Java BitSet Implementation
import java.util.BitSet;
public class BloomFilter {
private final BitSet bitSet;
private final int numHashFunctions;
public BloomFilter(int size, int numHashFunctions) {
this.bitSet = new BitSet(size);
this.numHashFunctions = numHashFunctions;
}
public void add(String value) {
for (int i = 0; i < numHashFunctions; i++) {
int hash = hash(value, i);
bitSet.set(hash);
}
}
public boolean contains(String value) {
for (int i = 0; i < numHashFunctions; i++) {
int hash = hash(value, i);
if (!bitSet.get(hash)) {
return false;
}
}
return true;
}
private int hash(String value, int i) {
int hash = value.hashCode();
return Math.abs((hash + i) % bitSet.size());
}
public static void main(String[] args) {
BloomFilter bloom = new BloomFilter(1 << 24, 3);
bloom.add("apple");
bloom.add("banana");
bloom.add("cherry");
System.out.println("apple in filter: " + bloom.contains("apple"));
System.out.println("banana in filter: " + bloom.contains("banana"));
System.out.println("cherry in filter: " + bloom.contains("cherry"));
System.out.println("grape in filter: " + bloom.contains("grape"));
}
}Guava Implementation
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
public class BloomFilterDemo {
public static void main(String[] args) {
int expectedInsertions = 100_000_000;
double fpp = 0.01;
BloomFilter<Integer> bloom = BloomFilter.create(Funnels.integerFunnel(), expectedInsertions, fpp);
for (int i = 0; i < expectedInsertions; i++) {
bloom.put(i);
}
int falsePositives = 0;
for (int i = expectedInsertions; i < expectedInsertions * 2; i++) {
if (bloom.mightContain(i)) {
falsePositives++;
}
}
System.out.println(falsePositives);
}
}Redisson Implementation
// Create a Bloom filter with 100 million expected elements and 1% false‑positive rate
RBloomFilter<String> bloom = redissonClient.getBloomFilter("myBloomFilter");
bloom.tryInit(100_000_000, 0.01);
// Add elements
bloom.add("apple");
bloom.add("banana");
// Query elements
System.out.println("apple exists? " + bloom.contains("apple"));
System.out.println("orange exists? " + bloom.contains("orange"));Configuration Bean (Spring)
@Bean
public RBloomFilter<String> bloomFilter(RedissonClient redissonClient,
@Value("${bloomFilter.dynamic.expectedInsertions}") long expectedInsertions) {
RBloomFilter<String> filter = redissonClient.getBloomFilter("dynamicBloomFilter");
filter.tryInit(expectedInsertions, 0.01);
return filter;
}Practical Tips
Initialize the bitmap size and false‑positive rate according to expected element count; for 100 M items with 1% error the bitmap is about 959 MB and requires ~7 hash functions.
Rebuilding historic data requires batch insertion via scheduled tasks.
When a single key grows too large, split the filter into multiple sub‑filters (e.g., by month) to maintain performance.
Redis Storage Details
Redisson stores a Bloom filter using two Redis keys: {name}:config for filter parameters and {name} for the binary bitmap.
Example Redis entries:
Conclusion
Bloom filters provide a space‑efficient, fast‑lookup solution for massive data sets with an acceptable false‑positive rate. Properly tuning size and error probability, and using tools like Guava or Redisson, can significantly improve system throughput in Redis‑backed applications.
Lin is Dream
Sharing Java developer knowledge, practical articles, and continuous insights into computer engineering.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
