Decentralized Distribution in Xiaohongshu: Strengthening Sideinfo, Multimodal Fusion, and Interest Exploration

This article details Xiaohongshu's technical approaches to decentralized content distribution, covering business background, core challenges, high‑frequency recommendation pipelines, link‑level analysis, sideinfo decoupling, graph‑model integration, multimodal signal fusion, explicit interest exploration, interest protection, and future research directions.

DataFunSummit

Oct 29, 2024

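Xiaohongshu is a rapidly growing UGC community in China that generates billions of content exposures daily across a dual-column feed and an immersive video flow. At that scale, decentralized distribution, meaning exposure that reaches long-tail content instead of concentrating on a small set of popular notes, is a critical problem.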

The core issues are twofold: the system must learn faster to capture long‑tail content and user interests, and it must learn better to ensure these signals are effectively propagated through recall, coarse ranking, fine ranking, and post‑ranking stages.

To accelerate learning, Xiaohongshu upgraded its recommendation pipeline from daily to minute‑level updates, introducing GPU‑heterogeneous training for early‑fine ranking models.

Link-level analysis of the recommendation pipeline revealed three problems: over-reliance on note IDs, insufficient use of content signals, and self-reinforcing feedback loops. The solutions strengthen the use of side information (sideinfo), incorporate multimodal signals, and protect user interests.

Sideinfo decoupling separates sideinfo modeling from note ID embeddings, applying dedicated attention mechanisms and residual connections in both the recall and ranking modules. Sideinfo is further integrated into graph models through CF-based edges, upgrades to the Swing algorithm, and causal-inference bias correction.
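As a reference point for the graph component, the sketch below implements the basic published Swing item-to-item similarity, not Xiaohongshu's upgraded variant, whose details this summary does not give. The function name `swing_similarity`, the smoothing parameter `alpha`, and the toy click log are illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations


def swing_similarity(user_items, alpha=1.0):
    """Return {(i, j): score} for item pairs with i < j, using basic Swing.

    user_items maps each user to the set of items that user engaged with.
    """
    # Invert the log: item -> set of users who engaged with it.
    item_users = defaultdict(set)
    for user, items in user_items.items():
        for item in items:
            item_users[item].add(user)

    scores = defaultdict(float)
    for i, j in combinations(sorted(item_users), 2):
        common_users = sorted(item_users[i] & item_users[j])
        # Every pair of users who both engaged with i and j contributes.
        # The more items the two users share overall, the smaller the
        # contribution, which discounts pairs of very similar heavy users.
        for u, v in combinations(common_users, 2):
            overlap = len(user_items[u] & user_items[v])
            scores[(i, j)] += 1.0 / (alpha + overlap)
    return dict(scores)


# Hypothetical click log, for illustration only.
clicks = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},
    "u3": {"b", "c"},
}
sims = swing_similarity(clicks, alpha=1.0)
```

Item pairs that share fewer than two users receive no score, so they would not produce an edge in a graph built from these similarities.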

Multimodal signal fusion progresses from contrastive learning to the AlignRec framework, which aligns text and image embeddings with item embeddings. Multimodal cross-feature interactions are then applied in the recall, coarse ranking, and fine ranking stages.
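To make the contrastive starting point concrete, here is a minimal sketch of an InfoNCE-style loss that pulls each note's content embedding toward its own item embedding and away from the other notes in the batch. It is a generic illustration, not the AlignRec objective; the function names, the `temperature` value, and the toy vectors are assumptions.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(content_embs, item_embs, temperature=0.1):
    """Mean InfoNCE loss; row k of both lists describes the same note."""
    content = [l2_normalize(v) for v in content_embs]
    items = [l2_normalize(v) for v in item_embs]
    total = 0.0
    for k, c in enumerate(content):
        # Cosine similarity to every item in the batch, scaled by temperature.
        logits = [sum(a * b for a, b in zip(c, it)) / temperature for it in items]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the matching item.
        total += log_denominator - logits[k]
    return total / len(content)


# Toy batch: content vectors that match their items vs. a shuffled pairing.
content_vectors = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = info_nce(content_vectors, [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = info_nce(content_vectors, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the correctly paired embeddings yield a much lower loss than the shuffled pairing, which is the signal that drives the two embedding spaces together during training.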

Interest exploration shifts from implicit multi-vector methods to explicit global interest sets: the system selects the top-k interests based on recent user behavior and predicts the next interest. For exploration-exploitation (EE), Gaussian noise is added to recall vectors, and an evolution strategy drives the regeneration of high-potential items.
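The Gaussian-noise part of EE can be sketched as follows: the unperturbed user vector supplies the exploit results, and several noisy copies of it probe nearby regions of the embedding space. This is a simplified illustration under assumed names (`explore_recall`, `sigma`, `n_probes`) and toy vectors, with brute-force dot-product search standing in for a production ANN index.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def recall_top_k(query, item_vectors, k):
    """Brute-force top-k retrieval by dot product."""
    ranked = sorted(
        item_vectors,
        key=lambda item: dot(query, item_vectors[item]),
        reverse=True,
    )
    return ranked[:k]


def explore_recall(user_vec, item_vectors, k, sigma, n_probes, seed=0):
    """Exploit results first, then items surfaced by noisy copies of the query."""
    rng = random.Random(seed)
    results = recall_top_k(user_vec, item_vectors, k)  # exploitation
    seen = set(results)
    for _ in range(n_probes):
        # Exploration: add independent Gaussian noise to every dimension.
        noisy = [x + rng.gauss(0.0, sigma) for x in user_vec]
        for item in recall_top_k(noisy, item_vectors, k):
            if item not in seen:
                seen.add(item)
                results.append(item)
    return results


# Hypothetical item embeddings and user vector.
item_vectors = {
    "hiking": [1.0, 0.0],
    "camping": [0.9, 0.1],
    "baking": [0.0, 1.0],
    "coffee": [0.1, 0.9],
}
user_vec = [1.0, 0.0]
explored = explore_recall(user_vec, item_vectors, k=2, sigma=1.0, n_probes=5)
no_noise = explore_recall(user_vec, item_vectors, k=2, sigma=0.0, n_probes=5)
```

With `sigma` set to zero the function reduces to plain top-k recall, so `sigma` acts as the knob that trades exploitation against exploration.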

Interest protection introduces a white‑box multi‑queue framework for the intermediate ranking stages, allowing customized ordering and automatic cutoff optimization via evolution strategies.
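One way to picture the multi-queue idea is a merge step that takes a configurable number of items from each named queue in a fixed order, plus a small search that tunes those cutoffs. The sketch below uses a simple elitist (1+λ) evolution strategy with Gaussian mutation; the queue names, the `value` table, and the reward function are hypothetical, and a real system would score cutoffs with online or logged feedback.

```python
import random


def merge_queues(queues, order, cutoffs):
    """Take the first cutoffs[name] items of each queue, in queue order."""
    merged, seen = [], set()
    for name in order:
        for item in queues[name][: cutoffs[name]]:
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged


def evolve_cutoffs(queues, order, budget, reward_fn,
                   sigma=1.5, generations=40, offspring=10, seed=0):
    """Elitist (1+lambda) evolution strategy over real-valued cutoffs."""
    rng = random.Random(seed)

    def decode(theta):
        # Round to integers and clip to each queue's length.
        return {n: max(0, min(len(queues[n]), round(t)))
                for n, t in zip(order, theta)}

    def fitness(theta):
        return reward_fn(merge_queues(queues, order, decode(theta))[:budget])

    parent = [budget / len(order)] * len(order)  # start from an even split
    parent_fit = fitness(parent)
    for _ in range(generations):
        children = [[t + rng.gauss(0.0, sigma) for t in parent]
                    for _ in range(offspring)]
        best_child = max(children, key=fitness)
        child_fit = fitness(best_child)
        if child_fit >= parent_fit:  # keep the parent unless a child matches or beats it
            parent, parent_fit = best_child, child_fit
    return decode(parent), parent_fit


# Hypothetical queues (already ranked) and per-item values.
queues = {
    "fresh": ["n1", "n2", "n3", "n4"],
    "follow": ["n5", "n6", "n7", "n8"],
    "popular": ["n9", "n10", "n11", "n12"],
}
order = ["fresh", "follow", "popular"]
value = {"n1": 1, "n2": 1, "n3": 0, "n4": 0, "n5": 5, "n6": 4,
         "n7": 3, "n8": 0, "n9": 2, "n10": 2, "n11": 0, "n12": 0}


def reward_fn(items):
    return sum(value[i] for i in items)


budget = 6
baseline = reward_fn(merge_queues(queues, order, {n: 2 for n in order})[:budget])
best_cutoffs, best_reward = evolve_cutoffs(queues, order, budget, reward_fn)
```

Because the queue order and cutoffs are explicit parameters, the merge stays inspectable: each slot in the output can be traced back to the queue and cutoff that produced it.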

Future work includes generative recommendation, interactive search with large models, multimodal user profiling, end‑to‑end multimodal‑behavior joint training, and full‑domain signal modeling across the entire app.

Overall, the presentation showcases Xiaohongshu's ongoing efforts to achieve decentralized distribution through algorithmic innovations and large‑model techniques.
