Xiaohongshu Search Engine Innovations Presented at SIGIR-AP 2023
At SIGIR-AP 2023 in Beijing, Xiaohongshu's technical team presented four key innovations for its platform of 260 million monthly active users: advanced user-intent analysis via multi-stage LLM pre-training, multimodal vector retrieval, generative inverted-index enhancements, and a three-stage relevance-ranking pipeline with knowledge distillation, all aimed at high multi-intent, long-tail, and multimodal search challenges.
From November 26-28, 2023, the ACM-sponsored SIGIR-AP conference was held in Beijing, jointly organized by Tsinghua University and the University of Melbourne. As the first regional Information Retrieval (IR) conference held in China and a CCF A-class event, it drew more than 100 researchers from academia and industry to discuss frontier IR technologies and trends.
Prominent speakers included Maarten de Rijke (member of the Royal Netherlands Academy of Arts and Sciences), Gerard Salton Award laureate ChengXiang Zhai, Charles Clarke, Tetsuya Sakai, and many other distinguished scholars from institutions such as Tsinghua University, Renmin University of China, the Chinese Academy of Sciences, Waseda University, and NUS.
Representing industry, the Xiaohongshu technical team delivered a talk titled “Xiaohongshu Search Engine Innovation Practice.” With over 260 million monthly active users and nearly 300 million daily search queries, Xiaohongshu’s search engine aims to provide “ordinary‑person perspective, experienced insight” and “useful intelligence.” The presentation highlighted challenges such as high multi‑intent query ratios, long‑tail queries lacking statistical signals, and the need to understand multimodal content (text, images, video, live streams).
To address these challenges, the team described four major technical advances:
1. User Intent Analysis – Short‑text understanding is enhanced through multi‑stage continuous pre‑training of large language models, including unsupervised domain pre‑training, weakly supervised log pre‑training, and fine‑tuning on manually annotated data. For long‑tail queries, knowledge‑base and entity‑linking techniques enrich the model; for head queries, log mining and system simulation provide posterior data.
2. Vector Retrieval – Multimodal representation learning is applied to both long‑tail and head queries. Cross‑modal alignment aligns note images and text with query embeddings; multimodal fusion incorporates Masked Language Modeling (MLM) and Masked Image Modeling (MIM); hard negative samples are constructed by masking, rewriting, and replacing query‑image pairs.
3. Inverted Index – Three practices improve traditional recall: (a) generating queries for low‑exposure notes using generative models; (b) converting video content to transcribed text for indexing; (c) extracting chapter‑level tags from notes, filtering irrelevant hashtags, and using weak‑supervised training to enhance multimodal understanding.
4. Relevance Ranking – A three‑stage training pipeline (unsupervised pre‑training on internal text, continuous supervised training on search logs, fine‑tuning on annotated data) is employed. Model efficiency is boosted via knowledge distillation (48‑layer BERT → 12‑layer/4‑layer), Faster Transformer, dynamic padding, and dynamic summarization. Multimodal relevance is further modeled with contrastive losses on queries vs. core words and notes vs. similar notes.
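The head vs. long-tail split in step 1 can be sketched as a simple routing decision: frequent queries lean on posterior log statistics, while rare queries fall back to knowledge-base entity linking. This is a minimal, hypothetical illustration; the query log, threshold, and knowledge base here are toy stand-ins, not Xiaohongshu's actual data or APIs.

```python
# Hypothetical sketch of head vs. long-tail intent routing.
# QUERY_LOG, HEAD_THRESHOLD, and KNOWLEDGE_BASE are all assumed toy values.
from collections import Counter

QUERY_LOG = ["lipstick swatch", "lipstick swatch", "lipstick swatch",
             "beijing brunch", "hiking trail gansu"]
FREQ = Counter(QUERY_LOG)
HEAD_THRESHOLD = 2  # assumed cutoff; a real system tunes this on traffic stats

KNOWLEDGE_BASE = {"gansu": "province", "beijing": "city"}  # toy entity store

def analyze_intent(query: str) -> dict:
    """Head queries use posterior log statistics; long-tail queries
    fall back to knowledge-base entity linking."""
    if FREQ[query] >= HEAD_THRESHOLD:
        return {"query": query, "strategy": "log_mining",
                "support": FREQ[query]}
    entities = [(tok, KNOWLEDGE_BASE[tok]) for tok in query.split()
                if tok in KNOWLEDGE_BASE]
    return {"query": query, "strategy": "entity_linking",
            "entities": entities}
```

In production the "log mining" branch would consume aggregated click and conversion signals rather than a raw counter, but the control flow is the same.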
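The hard-negative training described in step 2 typically optimizes a contrastive objective such as InfoNCE: the query embedding is pulled toward its matching note embedding and pushed away from the constructed negatives. The sketch below assumes that setup; it is a generic NumPy illustration, not the team's published loss.

```python
import numpy as np

def info_nce(query, pos, negs, temperature=0.07):
    """InfoNCE loss for one query: pull the positive note embedding
    close, push (hard) negatives away. Vectors are L2-normalized."""
    def norm(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)
    q, p, n = norm(query), norm(pos), norm(negs)
    logits = np.concatenate([[q @ p], n @ q]) / temperature
    logits -= logits.max()                       # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                     # positive sits at index 0
```

Negatives built by masking or rewriting the query-image pair land close to the positive in embedding space, which raises this loss and forces the encoder to learn finer distinctions than random negatives would.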
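Practice (a) in step 3 amounts to enriching a note's posting lists with terms from model-generated queries, so low-exposure notes become reachable for searches their own text never mentions. Here is a minimal sketch under that assumption; the note text, generated queries, and whitespace tokenizer are illustrative stand-ins.

```python
# Minimal inverted-index sketch: low-exposure notes additionally get
# terms from model-generated queries (precomputed here as a toy dict).
from collections import defaultdict

def build_index(notes, generated_queries):
    """Map each term to the set of note ids containing it, including
    terms drawn from generated queries for the note."""
    index = defaultdict(set)
    for note_id, text in notes.items():
        for term in text.lower().split():
            index[term].add(note_id)
        for gq in generated_queries.get(note_id, []):
            for term in gq.lower().split():
                index[term].add(note_id)
    return index

notes = {"n1": "homemade hotpot recipe"}
gen = {"n1": ["sichuan dinner ideas"]}   # queries a generative model might emit
index = build_index(notes, gen)
```

The same mechanism covers practice (b): transcribed video text is just another text field whose terms are folded into the note's posting lists.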
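The 48-layer-to-12/4-layer distillation in step 4 is conventionally trained with a KL-divergence loss between temperature-softened teacher and student distributions. The following is a generic sketch of that standard loss, not the team's exact training code.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax; higher T flattens the distribution."""
    z = np.asarray(z, dtype=float) / T
    z -= z.max()                  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distill_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2
    as in standard knowledge distillation."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(T * T * np.sum(p * np.log(p / q)))
```

The loss is zero when the student reproduces the teacher's softened distribution and grows as the two diverge; in practice it is mixed with the hard-label relevance loss when fine-tuning the smaller model.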
The conference also featured a keynote by Maarten de Rijke on “Simulation for Recommendations in Dynamic and Interactive Environments,” discussing how simulators can help evaluate recommendation systems under changing user preferences, bias, and noise.
SIGIR‑AP 2023 concluded successfully, fostering academic‑industry exchange and setting the stage for future research and innovation in information retrieval and AI‑driven search technologies.
Xiaohongshu Tech REDtech
The official account of the Xiaohongshu technology team, sharing technical innovations and engineering insights.