Inside Mogujie's Search Engine: Architecture and Real‑Time Ranking Flow

This article describes Mogujie's end-to-end search architecture. It covers the offline and online components (Topn, ABTest, QR, UPS, the search engine, precision ranking, and feature management) and walks through a concrete online request from query to final ranked results.

21CTO
Preface

Mogujie's vision is to make half of humanity happier, and the goal of its search system is to help female users easily find the products they want. As a key traffic entry point, the search system shapes how traffic is allocated among merchants and improves the user experience by ranking the most relevant and personalized items first. As the group pursues its quality-upgrading strategy, upgrades to algorithmic ranking place higher demands on the system.

Existing Search Architecture

The overall architecture consists of two major parts: online and offline. The online part handles real‑time requests (including the operation platform and ranking platform) and includes the business layer, placement layer, precision ranking layer, and engine layer. The offline part covers algorithm training and data pipelines (e.g., ACM data collection, dump).

Core Systems

Topn is the unified entry point that abstracts different search businesses via a common interface and routing protocol. It forwards requests to various search engines and ranking systems, integrates ABTest traffic splitting, and provides a configuration backend for parallel algorithm testing and deployment.

ABTest supports multiple split rules (UUID, hash, user tags) and custom conditions, with layered strategies to avoid interference. A unified console combined with ACM logs enables real‑time effect statistics for online algorithm evaluation, covering over 90% of traffic entries.
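A minimal sketch of hash-based splitting with layers, to make the idea concrete. It assumes that each layer salts the hash with its own name so that bucket assignments in different layers do not interfere; the function names, the bucket count of 100, and the use of SHA-256 are illustrative assumptions, not details from Mogujie's ABTest.

```python
import hashlib


def bucket(user_id: str, layer: str, num_buckets: int = 100) -> int:
    """Map a user to a stable bucket within one experiment layer."""
    # Salting with the layer name (an assumption of this sketch) makes
    # assignments in different layers independent of each other.
    digest = hashlib.sha256(f"{layer}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


def assign(user_id: str, layer: str, arms: list[tuple[str, int]]) -> str:
    """Pick an experiment arm; `arms` is [(name, percent_of_traffic), ...]."""
    b = bucket(user_id, layer)
    upper = 0
    for name, percent in arms:
        upper += percent
        if b < upper:
            return name
    return "default"  # traffic not covered by any arm


# The same user always lands in the same arm of a given layer.
arm = assign("user-42", "ranking", [("control", 50), ("treatment", 50)])
```

Because the assignment depends only on the user ID and the layer name, it needs no stored state and can be recomputed anywhere in the request path.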

QR (Query Rewrite) expands user queries through tokenization, synonym expansion, category relevance prediction, brand weighting, etc. It offers a plugin architecture for flexible algorithm development.
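One way to picture the plugin architecture is as a chain of small functions that each enrich a shared query object. The sketch below is only an illustration under that assumption: the `RewrittenQuery` fields, the plugin names, the tiny `SYNONYMS` and `BRANDS` tables, and the boost value of 2.0 are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RewrittenQuery:
    raw: str
    tokens: list[str] = field(default_factory=list)
    synonyms: dict[str, list[str]] = field(default_factory=dict)
    brand_boosts: dict[str, float] = field(default_factory=dict)


Plugin = Callable[[RewrittenQuery], RewrittenQuery]

# Hypothetical lookup tables; a real system would load these from data.
SYNONYMS = {"nike": ["耐克"]}
BRANDS = {"nike"}


def tokenize(q: RewrittenQuery) -> RewrittenQuery:
    q.tokens = q.raw.lower().split()
    return q


def expand_synonyms(q: RewrittenQuery) -> RewrittenQuery:
    for token in q.tokens:
        if token in SYNONYMS:
            q.synonyms[token] = list(SYNONYMS[token])
    return q


def weight_brands(q: RewrittenQuery) -> RewrittenQuery:
    for token in q.tokens:
        if token in BRANDS:
            q.brand_boosts[token] = 2.0  # illustrative boost value
    return q


def rewrite(raw: str, plugins: list[Plugin]) -> RewrittenQuery:
    """Run the query through each plugin in order."""
    q = RewrittenQuery(raw=raw)
    for plugin in plugins:
        q = plugin(q)
    return q


result = rewrite("Nike shoes", [tokenize, expand_synonyms, weight_brands])
```

Adding a new rewrite step then means writing one more function and appending it to the plugin list, without touching the existing steps.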

Precision Ranking System performs personalized ranking using richer features and complex models, supporting frequent AB experiments and dynamic configuration.

Search Engine is a high‑performance C++ engine built on the proprietary ZIndex framework, supporting retrieval, filtering, aggregation, and multi‑stage ranking with a plugin‑based ranking framework.
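The real engine is written in C++ on ZIndex, whose API is not shown in this article. The Python sketch below only illustrates the general shape of retrieval, filtering, aggregation, and a pluggable ranking step; it shows a single ranking stage rather than the multi-stage ranking the engine supports, and every name in it (`RANK_PLUGINS`, `register`, `search`, the document fields) is a hypothetical stand-in.

```python
from typing import Callable

Doc = dict
Scorer = Callable[[Doc], float]

# Registry of ranking plugins, looked up by the sort name in the request.
RANK_PLUGINS: dict[str, Scorer] = {}


def register(name: str):
    def decorator(fn: Scorer) -> Scorer:
        RANK_PLUGINS[name] = fn
        return fn
    return decorator


@register("by_sales")
def by_sales(doc: Doc) -> float:
    return float(doc.get("sales", 0))


@register("by_price_asc")
def by_price_asc(doc: Doc) -> float:
    return -float(doc["price"])  # negate so cheaper items score higher


def search(docs: list, term: str, sort: str, in_stock_only: bool = True, top_n: int = 10):
    # Retrieval: keep documents whose title contains the term.
    hits = [d for d in docs if term.lower() in d["title"].lower()]
    # Filtering: drop out-of-stock items if requested.
    if in_stock_only:
        hits = [d for d in hits if d.get("stock", 0) > 0]
    # Aggregation: count hits per brand.
    facets: dict[str, int] = {}
    for d in hits:
        facets[d["brand"]] = facets.get(d["brand"], 0) + 1
    # Ranking: score with the plugin selected by name.
    ranked = sorted(hits, key=RANK_PLUGINS[sort], reverse=True)
    return ranked[:top_n], facets


CATALOG = [
    {"title": "Nike running shoes", "brand": "nike", "sales": 120, "price": 499.0, "stock": 3},
    {"title": "Nike cap", "brand": "nike", "sales": 300, "price": 99.0, "stock": 0},
    {"title": "Nike socks", "brand": "nike", "sales": 80, "price": 29.0, "stock": 10},
    {"title": "Canvas bag", "brand": "other", "sales": 500, "price": 59.0, "stock": 5},
]
top, brand_counts = search(CATALOG, "nike", sort="by_sales")
```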

UPS (User Profile System) stores offline and real‑time user behavior (clicks, adds to cart, purchases) to provide personalized signals for precision ranking, handling up to 100k QPS with sub‑3 ms latency.
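As a rough illustration of how offline and real-time behavior could be served together, the sketch below keeps a per-user offline profile next to a bounded window of recent events. It is an in-memory toy under that assumption and says nothing about how UPS actually reaches its throughput and latency; the class name, method names, and the window size are hypothetical.

```python
from collections import defaultdict, deque


class UserProfileStore:
    """Toy store that merges an offline profile with recent events."""

    def __init__(self, max_recent: int = 50):
        self._offline: dict[str, dict] = {}
        # A bounded deque drops the oldest event once the window is full.
        self._recent = defaultdict(lambda: deque(maxlen=max_recent))

    def load_offline(self, user_id: str, profile: dict) -> None:
        self._offline[user_id] = dict(profile)

    def record_event(self, user_id: str, action: str, item_id: str) -> None:
        self._recent[user_id].append((action, item_id))

    def get_signals(self, user_id: str) -> dict:
        return {
            "offline": dict(self._offline.get(user_id, {})),
            # .get avoids creating an empty entry for unknown users.
            "recent": list(self._recent.get(user_id, ())),
        }


store = UserProfileStore(max_recent=2)
store.load_offline("u42", {"preferred_category": "sneakers"})
store.record_event("u42", "click", "item-1")
store.record_event("u42", "add_to_cart", "item-2")
store.record_event("u42", "purchase", "item-2")
signals = store.get_signals("u42")
```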

Engine Operation Platform offers full lifecycle management (instance configuration, deployment, index building, monitoring) using Docker‑based containerization and a web console.

Algorithm Sorting Platform provides a visual backend for creating algorithm scenes, models, ranking strategies, scripts, and offline evaluation, streamlining algorithm deployment across recommendation and search scenarios.

Dump System standardizes data flow from upstream sources to downstream storage, supporting incremental, full, and small‑full loads to ensure data reliability.

Feature Management Platform centralizes feature definition, generation, storage, publishing, validation, and monitoring, allowing algorithm engineers to focus on model training.

ACM Data Collection System captures, cleans, and tracks user behavior data, delivering reliable inputs for model training and real‑time statistics.

Online Search Flow

A user typing “nike” in the Mogujie app triggers the following chain:

Topn receives the query and determines the ABTest and routing configuration (whether to call UPS, which engine to use, whether to invoke precision ranking, etc.).

QR rewrites the query, adding brand weighting, category relevance, synonyms (e.g., “耐克”), and other expansions.

UPS fetches the user's historical and real‑time behavior (e.g., recent clicks) to supply personalization signals.

Search Engine combines the rewritten query and user signals, performs recall and coarse ranking using configured plugins (e.g., sort=ltr_test_antispam).

Precision Ranking System applies personalized re‑ranking and business‑specific rules, then returns the final top‑K results to the front end.
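The chain above can be sketched as a single orchestration function that calls each stage in order. This is a simplified illustration, not Topn's actual code: the stage functions are passed in as callables, UPS and precision ranking are optional to mirror the routing decision in the first step, and all `demo_*` stubs and their data are invented for the example.

```python
from typing import Callable, Optional


def handle_search(
    query: str,
    user_id: str,
    rewrite: Callable[[str], dict],
    engine_search: Callable[[dict, dict], list],
    fetch_profile: Optional[Callable[[str], dict]] = None,
    rerank: Optional[Callable[[list, dict], list]] = None,
    top_k: int = 10,
) -> list:
    rewritten = rewrite(query)                               # QR
    profile = fetch_profile(user_id) if fetch_profile else {}  # UPS (optional)
    candidates = engine_search(rewritten, profile)           # recall + coarse ranking
    if rerank:                                               # precision ranking (optional)
        candidates = rerank(candidates, profile)
    return candidates[:top_k]


# Hypothetical stubs standing in for the real services.
def demo_rewrite(q: str) -> dict:
    term = q.lower()
    return {"terms": [term], "synonyms": ["耐克"] if term == "nike" else []}


def demo_profile(user_id: str) -> dict:
    return {"recent_clicks": ["item-2"]}


def demo_engine(rewritten: dict, profile: dict) -> list:
    return [{"id": "item-1", "score": 0.9}, {"id": "item-2", "score": 0.8}]


def demo_rerank(candidates: list, profile: dict) -> list:
    clicked = set(profile.get("recent_clicks", []))
    # Recently clicked items first, then by engine score.
    return sorted(candidates, key=lambda c: (c["id"] in clicked, c["score"]), reverse=True)


results = handle_search("nike", "u42", demo_rewrite, demo_engine, demo_profile, demo_rerank, top_k=2)
```

With the re-ranking stage enabled, the recently clicked item moves ahead of the item with the higher engine score; without it, the engine order is returned unchanged.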

The configuration for a typical Topn request includes sorting code, ABTest settings, search engine selection, QR and UPS toggles, and precision ranking plugin/model specifications.
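The article does not show the actual configuration format, so the following is only a guess at its shape, expressed as a Python dataclass. Every field name and every value except the sort code `ltr_test_antispam` (which appears in the flow above) is hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TopnRequestConfig:
    sort_code: str          # ranking plugin selected in the engine
    abtest_layer: str       # hypothetical: which ABTest layer applies
    engine: str             # hypothetical: which search engine instance to query
    use_qr: bool = True     # call Query Rewrite?
    use_ups: bool = True    # call the User Profile System?
    rank_plugin: str = ""   # empty string means skip precision ranking
    rank_model: str = ""    # model used by the precision ranking plugin

    def stages(self) -> list[str]:
        """List the stages this configuration would trigger, in order."""
        plan = []
        if self.use_qr:
            plan.append("qr")
        if self.use_ups:
            plan.append("ups")
        plan.append(f"engine:{self.engine}")
        if self.rank_plugin:
            plan.append(f"rank:{self.rank_plugin}")
        return plan


config = TopnRequestConfig(
    sort_code="ltr_test_antispam",
    abtest_layer="search_ranking",
    engine="main_search",
    rank_plugin="personalized_rerank",
    rank_model="model_v1",
)
serialized = json.dumps(asdict(config), indent=2)
```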

Conclusion

This article introduced Mogujie's current search architecture and walked through the online request pipeline. The system has already gone through many iterations and will keep evolving to support business and algorithmic needs efficiently.
