Backend Development 9 min read

How MX Player Cut Memory Use by 96% with Bloom Filters

This article explains how MX Player replaced a Redis‑based user‑history list with per‑user Bloom Filters, dramatically reducing memory consumption and improving filtering efficiency for its recommendation system.

MXPlayer Technical Team

Aug 22, 2019

How MX Player Cut Memory Use by 96% with Bloom Filters

Introduction

MX Player, one of India’s top video platforms, needs to filter videos a user has already watched from recommendation results to ensure diversity and improve experience. Storing and efficiently filtering massive user‑history data became a challenge, which was addressed by using Bloom Filters.

Previous Solution (Solution 1)

Initially, user history (item‑id list) was stored in a Redis sorted set (zset). Because histories could be large, only the most recent 1,400 items were kept. Each request fetched the entire history, performed filtering, and then pushed the updated list back to Redis. This approach consumed a lot of Redis memory, required two large data transfers per request, and increased server memory usage.

Bloom Filter History Filtering (Solution 2)

Bloom Filter Overview

A Bloom Filter is a space‑efficient probabilistic data structure. When an element is added, K hash functions map it to K bits in a bit array, setting those bits to 1. To test membership, the same K bits are checked: if any is 0, the element is definitely absent; if all are 1, the element is probably present.

Bloom Filter in MX Player

MX Player’s recommendation system uses Google Guava’s BloomFilter implementation. Each user gets a Bloom Filter with capacity 10,000 and a false‑positive rate of 0.01. Because candidate recommendation lists are abundant and heavy users may accumulate large histories, a relatively large capacity and acceptable false‑positive rate were chosen. For users with few history items, the actual false‑positive rate is far lower than 0.01.

History Storage Process

Before returning recommendation results, each item’s ID is added to the user’s Bloom Filter. The Bloom Filter is then exported as a byte array and stored in Redis with a 7‑day TTL, while the raw item IDs are asynchronously stored in a Pika zset for later reference.

History Filtering Process

When filtering, the byte array is retrieved from Redis, reconstructed into a Guava BloomFilter object, and used to test whether candidate items have been seen before.

Comparison of the Two Solutions

Storage

Storing the same amount of history, a Bloom Filter occupies at least an order of magnitude less memory than an item list. A Guava Bloom Filter with capacity 10,000 and false‑positive rate 0.01 produces a byte array of 11,990 bytes (~12 KB). In contrast, storing 10,000 item IDs as 32‑character hex strings requires about 312 KB, a 96 % memory saving. The raw data is kept in Pika (an SSD‑based KV store), which is cheaper than in‑memory Redis.

Efficiency

Solution 1 writes to Redis either by looping async adds or by batching a single ZADD command; both have similar performance. Solution 2 simply stores the Bloom Filter byte array with a single SET command. From a coding perspective, Bloom Filter handling is far simpler; network transfer is smaller; and Redis executes fewer commands, making Solution 2 considerably faster overall.

Extensions

Choosing a Bloom Filter Implementation

Since Redis 4.0, modules are supported, and RedisBloom provides a native Bloom Filter implementation. Using RedisBloom eliminates the need to export Guava Bloom Filters. However, if the system uses session‑affinity load balancing, keeping a Guava Bloom Filter locally and caching it can greatly improve performance, which RedisBloom cannot replicate.

Bloom Filter False‑Positive Rate

Assuming hash functions choose bits uniformly and independently, with bit array size m, number of hash functions k, and inserted elements n, the false‑positive probability is approximately (1‑e^{‑kn/m})^{k}. By adjusting m and k based on expected n, an acceptable false‑positive rate can be achieved, as described in the Bloom Filter chapter of "The Beauty of Mathematics, Second Edition".

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Memory Optimization redis bloom-filter

Written by

MXPlayer Technical Team

Technical articles and experience sharing. MXPLAYER is the top-ranked online video content platform in India, and also the world's largest player app, with 100M+ DAU and hundreds of millions of MAU.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.