
How MX Player’s Recommendation Indexing System Boosts Performance and Stability

This article explains MX Player’s recommendation indexing architecture, detailing batch and incremental build processes, real‑time and Falcon monitoring, Redis storage design, and how these components together improve system stability, reduce latency, and enhance user experience.

MXPlayer Technical Team

Overview

MX Player offers a massive video library managed by a CMS, while its recommendation system delivers personalized content to users. To keep recommendations up‑to‑date under heavy traffic, a dedicated indexing service (referred to as the build‑index system) was created, and this article outlines its architecture and supporting components.

System Overview

The build‑index system consists of three main parts: the build service, real‑time monitoring, and the Falcon monitoring platform. The Recommend Build Index Database (RBIDB) uses Redis as its storage layer.

Part One – Build Module

The core module includes three sub‑modules:

Batch build – runs nightly during low-traffic hours, pulling data from CMS Elasticsearch (ES) and loading it into RBIDB using the same transformation logic as the online service, which reduces online compute overhead and improves stability.

Incremental build – updates RBIDB in near real time in response to CMS changes, implemented with AWS Lambda.

HTTP interface – provides operational endpoints for manually rebuilding a single resource or an entire resource type.
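As a rough illustration of the batch build path, the sketch below mimics a scroll-read from ES followed by writes into a standby index key. In-memory dicts stand in for Elasticsearch and Redis, and every name here (`fetch_cms_pages`, `build_entry`, `batch_build`) is hypothetical, not the production code:

```python
# Minimal sketch of the nightly batch build (all names are assumptions;
# the real system reads CMS Elasticsearch and writes into Redis).

def fetch_cms_pages(cms_docs, page_size=2):
    """Yield documents page by page, mimicking an ES scroll query."""
    for i in range(0, len(cms_docs), page_size):
        yield cms_docs[i:i + page_size]

def build_entry(doc):
    """Apply the same transformation logic the online service would use."""
    return {"id": doc["id"], "title": doc["title"].strip().lower()}

def batch_build(cms_docs, rbidb, standby_key):
    """Load all CMS documents into the standby index key."""
    index = rbidb.setdefault(standby_key, {})
    index.clear()
    for page in fetch_cms_pages(cms_docs):
        for doc in page:
            index[doc["id"]] = build_entry(doc)
    return len(index)

docs = [{"id": "v1", "title": " Movie A "}, {"id": "v2", "title": "Show B"}]
rbidb = {}
print(batch_build(docs, rbidb, "key_two"))  # 2
```

Because the batch writes only to a standby key, online reads are never served a half-built index; the pointer switch described later makes the new data visible in one step.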

Part Two – Real‑Time Monitoring Module

This module ensures data consistency between RBIDB and online ES results. It collects SNS messages triggered by CMS changes and processes them through SQS and Celery workers, which compare the responses returned by RBIDB and ES. On a mismatch, the module automatically triggers a rebuild via the HTTP interface and records the diff count; when failure thresholds are exceeded, it removes the problematic data from the support list and raises email and DingTalk alerts.

Part Three – Falcon Monitoring Module

Separate from real‑time monitoring, Falcon periodically compares full data sets from CMS and RBIDB, sending email alerts and auto‑triggering rebuilds if discrepancies are found.
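Conceptually, the Falcon check is a periodic full-set diff between the identifiers in CMS and those in RBIDB. A toy version, with all names assumed:

```python
def full_diff(cms_ids, rbidb_ids):
    """Return resources missing from RBIDB and stale entries absent from CMS."""
    cms, rbidb = set(cms_ids), set(rbidb_ids)
    return {"missing": sorted(cms - rbidb), "stale": sorted(rbidb - cms)}

diff = full_diff(["v1", "v2", "v3"], ["v2", "v3", "v4"])
print(diff)  # {'missing': ['v1'], 'stale': ['v4']}
# A non-empty diff would trigger an email alert and a rebuild via the HTTP API.
```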

RBIDB Storage Structure

The RBIDB architecture comprises four components:

Current key map – provides data pointers and manages dual‑buffer index switching.

Key one and Key two – store the actual index data; the current key map determines which of the two is active at any time.

Support resource – maintains the list of valid data under real‑time monitoring; problematic data are removed automatically.

Parameter store – keeps runtime parameters, monitoring data, retry counts, etc.
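A simplified way to picture the dual-buffer arrangement: the current key map records which of the two index keys is live, reads always follow that pointer, and a batch build writes to the standby key before flipping the pointer. The key names below are illustrative, not the production Redis schema:

```python
# Illustrative dual-buffer index switching (key names are assumptions).

class CurrentKeyMap:
    def __init__(self):
        # Pointer to the live index key; "key_one"/"key_two" hold real data.
        self.active = "key_one"

    def standby(self):
        return "key_two" if self.active == "key_one" else "key_one"

    def switch(self):
        """Flip the pointer after a batch build completes."""
        self.active = self.standby()

key_map = CurrentKeyMap()
rbidb = {"key_one": {"v1": "old entry"}, "key_two": {}}

# The batch build writes into the standby buffer; readers still see key_one.
rbidb[key_map.standby()]["v1"] = "new entry"

key_map.switch()  # promote the freshly built index in one step
print(rbidb[key_map.active]["v1"])  # new entry
```

In production the pointer flip would be a single Redis write, so readers never observe a partially built index.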

Build Process Flow

The core workflow combines batch and incremental builds. Batch builds run nightly, pulling data from CMS ES, while incremental builds run continuously, triggered by SNS messages from CMS changes. Together they ensure robustness and strong consistency.

Build Sequence Diagram

Scheduled nightly batch build starts, marking status as running and fetching the current index version.

Data are read from CMS ES, parsed, and loaded into a secondary index.

If SQS contains CMS change messages, they are processed and merged into the secondary index.

Lambda services handle incremental updates in real time, mirroring batch logic.

During incremental updates, the system checks batch status; if batch is not running, updates go directly to the active index, otherwise they are also queued in SQS.

After batch completion, accumulated SQS messages are consumed to synchronize any late updates.

Data pointers are switched, finalizing the batch build.

Notification mechanisms report build status and details.
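The routing rule in the steps above (write directly when no batch is running, otherwise also queue the change for replay after the batch finishes) could be sketched as follows. `batch_status`, the queue, and the index layout are all stand-ins for the real parameter store, SQS, and Redis keys:

```python
from collections import deque

def apply_incremental(update, rbidb, key_map, params, sqs_queue):
    """Route a CMS change: always update the active index; if a batch build
    is in progress, also queue the change so the new index can catch up."""
    rbidb[key_map["active"]][update["id"]] = update["entry"]
    if params.get("batch_status") == "running":
        sqs_queue.append(update)

def drain_after_batch(rbidb, key_map, params, sqs_queue):
    """After the batch finishes, replay queued changes into the fresh index
    before the data pointer is switched."""
    standby = "key_two" if key_map["active"] == "key_one" else "key_one"
    while sqs_queue:
        update = sqs_queue.popleft()
        rbidb[standby][update["id"]] = update["entry"]
    params["batch_status"] = "idle"

rbidb = {"key_one": {}, "key_two": {}}
key_map = {"active": "key_one"}
params = {"batch_status": "running"}
queue = deque()

apply_incremental({"id": "v9", "entry": "fresh"}, rbidb, key_map, params, queue)
drain_after_batch(rbidb, key_map, params, queue)
print(rbidb["key_two"]["v9"])  # fresh
```

The replay step is what lets the article claim consistency: changes that arrive mid-batch reach both the old index (immediately) and the new one (before the pointer switch).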

Real‑Time Monitoring Sequence

This diagram shows how the monitoring module automatically validates data consistency and triggers alerts.

CMS updates fire SNS messages, which are captured by the monitoring system.

Celery workers invoke a simulated client to request data from both RBIDB and ES.

Responses are compared; mismatches trigger the HTTP interface to synchronize data and re‑queue the message.

Failure counts are recorded; exceeding thresholds removes the data from the support resource and sends email/DingTalk alerts.
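The compare-and-escalate loop above can be condensed into one function. The threshold value, the state layout, and the return labels are all illustrative; in production the "rebuild" branch would call the HTTP interface and the "alert" branch would send email/DingTalk notifications:

```python
def check_consistency(resource_id, rbidb_resp, es_resp, state, threshold=3):
    """Compare responses for one resource. On mismatch, trigger a rebuild and
    count the failure; past the threshold, drop the resource from the support
    set and signal an alert. Names and threshold are assumptions."""
    if rbidb_resp == es_resp:
        state["failures"].pop(resource_id, None)  # healthy again, reset count
        return "ok"
    state["failures"][resource_id] = state["failures"].get(resource_id, 0) + 1
    if state["failures"][resource_id] >= threshold:
        state["support"].discard(resource_id)  # stop serving this resource
        return "alert"                         # email / DingTalk in production
    return "rebuild"                           # call the HTTP rebuild endpoint

state = {"failures": {}, "support": {"v1", "v2"}}
print(check_consistency("v1", ["a"], ["a"], state))  # ok
for _ in range(3):
    result = check_consistency("v2", ["a"], ["b"], state)
print(result)             # alert
print(state["support"])   # {'v1'}
```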

Conclusion

The MX Player recommendation indexing service decouples the recommendation engine from the data layer, reducing online compute load and improving stability. Using the indexed data yields more consistent performance compared to direct ES queries, shortens cache lifetimes from minutes to seconds, and keeps user‑perceived latency within a few seconds even under high traffic.

Direct ES request performance versus indexed data request performance demonstrates that the indexed approach provides a more stable service with reduced latency.

Written by MXPlayer Technical Team

Technical articles and experience sharing. MX Player is the top-ranked online video content platform in India and the world's largest player app, with 100M+ DAU and hundreds of millions of MAU.