Xiaohongshu Recommendation Engineering Architecture: Graph‑Based Design and Hot‑Deployment Practices
This article presents Xiaohongshu's evolving recommendation system architecture, detailing the challenges of massive user‑generated content, the adoption of a graph‑based Ark framework for modular and scalable business logic, and the implementation of hot‑deployment techniques to accelerate algorithm iteration and reduce downtime.
With the rapid growth of the mobile internet, personalized recommendation has become essential to user experience. Xiaohongshu, a lifestyle platform for young users, faces massive user‑generated content and complex business logic; this article shares its graph‑based recommendation architecture in the hope that the design proves useful to teams facing similar problems.
The existing recommendation pipeline handles diverse content types (text, video, products, live streams, comments) through multi‑stage recall, extensive feature engineering, and layered ranking (coarse ranking, fine ranking, re‑ranking). It processes millions of candidates per request and confronts challenges in scalability, maintainability, and long deployment cycles.
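The funnel structure described above can be sketched in a few lines of Java. This is an illustrative toy, not Xiaohongshu's actual code: the class, method names, and scoring functions are hypothetical, and real stages would call out to index, vector, and ranking services.

```java
import java.util.*;
import java.util.stream.*;

// Hypothetical sketch of the recall -> coarse rank -> fine rank funnel.
public class RankingFunnel {
    record Candidate(long id, double score) {}

    // Recall: merge candidates from several sources, deduplicating by id.
    static List<Candidate> recall(List<List<Candidate>> sources) {
        return sources.stream().flatMap(List::stream)
                .collect(Collectors.toMap(Candidate::id, c -> c, (a, b) -> a))
                .values().stream().toList();
    }

    // Each ranking stage re-scores candidates and keeps only the top-k survivors,
    // so cheaper models cut the pool wide and expensive models cut it narrow.
    static List<Candidate> rankStage(List<Candidate> in,
                                     java.util.function.ToDoubleFunction<Candidate> model,
                                     int k) {
        return in.stream()
                .map(c -> new Candidate(c.id(), model.applyAsDouble(c)))
                .sorted(Comparator.comparingDouble(Candidate::score).reversed())
                .limit(k).toList();
    }

    public static void main(String[] args) {
        List<Candidate> pool = recall(List.of(
                List.of(new Candidate(1, 0), new Candidate(2, 0)),
                List.of(new Candidate(2, 0), new Candidate(3, 0))));
        List<Candidate> coarse = rankStage(pool, c -> c.id() * 0.1, 2); // cheap model, wide cut
        List<Candidate> fine   = rankStage(coarse, c -> c.id() * 0.3, 1); // costly model, narrow cut
        System.out.println(fine.get(0).id()); // prints 3
    }
}
```

The key design property is that each stage has the same shape (candidates in, fewer candidates out), which is what makes the pipeline composable and lets new stages be slotted in.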
To address these issues, Xiaohongshu rebuilt the system on a hybrid‑cloud foundation, introducing a data platform for core business data, an engine layer offering inverted‑index, vector, feature, and ranking services (implemented in C++ for performance), and the Ark graph‑computation framework (Java‑based) that abstracts common infrastructure for search and recommendation scenarios.
The Ark framework comprises an API gateway for traffic control and routing, and a container layer with datasets and operators; it provides parallel processing, dynamic routing, and sub‑graph nesting, enabling rapid construction of new recommendation scenes while isolating business logic from low‑level concerns.
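To make the operator/dataset model concrete, here is a minimal sketch of the idea, not Ark's real API: operators are nodes that read and write a shared dataset context, independent branches run in parallel, and a join operator merges their output. All names are assumptions for illustration.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.function.*;

// Toy graph runtime: operators consume a shared key-value dataset context.
public class MiniGraph {
    interface Operator extends Consumer<Map<String, Object>> {}

    // Run independent operator branches in parallel, then run the join node.
    static void runParallel(Map<String, Object> ctx, List<Operator> branches, Operator join) {
        ExecutorService pool = Executors.newFixedThreadPool(branches.size());
        try {
            CompletableFuture.allOf(branches.stream()
                    .map(op -> CompletableFuture.runAsync(() -> op.accept(ctx), pool))
                    .toArray(CompletableFuture[]::new)).join();
        } finally {
            pool.shutdown();
        }
        join.accept(ctx);
    }

    public static void main(String[] args) {
        Map<String, Object> ctx = new ConcurrentHashMap<>();
        Operator recallA = c -> c.put("a", List.of(1, 2)); // e.g. inverted-index recall
        Operator recallB = c -> c.put("b", List.of(3));    // e.g. vector recall
        Operator merge = c -> {
            List<Integer> all = new ArrayList<>((List<Integer>) c.get("a"));
            all.addAll((List<Integer>) c.get("b"));
            c.put("merged", all);
        };
        runParallel(ctx, List.of(recallA, recallB), merge);
        System.out.println(ctx.get("merged")); // [1, 2, 3]
    }
}
```

Sub‑graph nesting falls out naturally from this shape: because a whole graph run is itself just a function over the context, it can be wrapped as a single `Operator` and embedded as a node in a larger graph.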
Hot‑deployment is achieved via Spring‑based class‑loader isolation and a plugin mechanism; independent class loaders allow versioned business code to run side‑by‑side, with AB routing for traffic shifting, pre‑warming, and seamless switch‑over without full service restarts, while addressing challenges such as cache duplication, middleware resource release, and class conflicts.
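The AB routing and switch‑over flow can be sketched as follows. This is a hedged simplification: in the real system each business‑code version is loaded by its own isolated class loader, while here plain lambdas stand in for loaded plugins; the class and method names are hypothetical.

```java
import java.util.*;
import java.util.function.*;

// Toy version router: two plugin versions registered side by side,
// traffic split by user-id bucket, atomic promotion once warmed up.
public class HotDeployRouter {
    private final Map<String, Function<Long, String>> versions = new HashMap<>();
    private volatile String stable = "v1";
    private volatile String canary = null;
    private volatile int canaryPercent = 0;

    // In the real system this would load a versioned plugin in a fresh class loader.
    void register(String version, Function<Long, String> handler) {
        versions.put(version, handler);
    }

    // Shift a percentage of users to the new version, keyed by user-id hash,
    // so each user sees a consistent version (and caches can pre-warm).
    void startCanary(String version, int percent) {
        canary = version;
        canaryPercent = percent;
    }

    // Promote the canary to stable: all subsequent traffic hits the new code,
    // with no service restart.
    void promote() {
        stable = canary;
        canary = null;
        canaryPercent = 0;
    }

    String serve(long userId) {
        String v = (canary != null
                && Math.floorMod(Long.hashCode(userId), 100) < canaryPercent)
                ? canary : stable;
        return versions.get(v).apply(userId);
    }
}
```

The sketch deliberately omits the hard parts the talk calls out: releasing the old version's middleware resources, avoiding duplicated caches across class loaders, and resolving class conflicts between co‑resident versions.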
Future directions include extending hot‑deployment to production, moving toward serverless architectures for lower operational cost, and implementing elastic scaling to handle traffic spikes, especially for high‑demand scenarios like live‑stream recommendation.
The session concludes with a Q&A covering hot‑load complexities, serialization choices (Thrift vs. Protobuf), and strategies for handling sudden traffic surges through dynamic degradation and fast‑path computation.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.