Enterprise Knowledge Recommendation System at Alibaba: Architecture, Challenges, and Large Model Applications
This article presents Alibaba's enterprise knowledge recommendation system and its role in digital transformation. It covers the challenges of long‑document recommendation; the multi‑layer architecture spanning the feature, engine, ranking, and function layers; the recall strategies and progressively more sophisticated ranking models; and the integration and evaluation of large language models for improved recommendation performance.
The presentation begins with an overview of enterprise digital transformation, describing three stages—informationization, digitization, and intelligence—and emphasizing the crucial support provided by technical communities such as Alibaba's internal ATA platform.
It then outlines the value and challenges of enterprise knowledge recommendation, highlighting short content lifecycles, long document length, high knowledge barriers, and cold‑start problems caused by sparse user behavior.
The core architecture of Alibaba's knowledge recommendation system is described in four layers:
Feature layer: content, attribute, context, and collaborative‑filtering features.
Engine layer: offline processing on DataWorks, OSS storage, machine‑learning training, and online services (RTP, TPP, graph storage).
Ranking layer: multi‑stage recall, coarse ranking, fine ranking, and re‑ranking, with six recall methods (collaborative, tag‑based, semantic, rule‑based, etc.).
Function layer: homepage recommendation, technical‑person recommendation, search‑term recommendation, and related‑article recommendation.
Recall methods include collaborative (Swing and relationship graphs), tag‑based (category and skill tags), semantic (BERT embeddings), and popularity‑based approaches, with tag‑based recall showing the best click‑through performance.
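The semantic recall described above reduces to nearest‑neighbor search over precomputed article embeddings. A minimal sketch, assuming embeddings have already been produced by a BERT‑style encoder (the function name, toy vectors, and ids here are illustrative, not from the talk):

```python
import numpy as np

def semantic_recall(user_vec, doc_vecs, doc_ids, k=2):
    """Return the top-k document ids by cosine similarity to the user vector.

    doc_vecs: (N, d) array of precomputed article embeddings (e.g. from BERT).
    """
    user = user_vec / np.linalg.norm(user_vec)
    docs = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = docs @ user                # cosine similarity per document
    top = np.argsort(-scores)[:k]       # indices of the k highest scores
    return [doc_ids[i] for i in top]

# Toy example: three 4-dimensional vectors standing in for BERT embeddings.
doc_ids = ["a1", "a2", "a3"]
doc_vecs = np.array([[1.0, 0.0, 0.0, 0.0],
                     [0.9, 0.1, 0.0, 0.0],
                     [0.0, 1.0, 0.0, 0.0]])
user_vec = np.array([1.0, 0.05, 0.0, 0.0])
print(semantic_recall(user_vec, doc_vecs, doc_ids, k=2))  # ['a1', 'a2']
```

In production this brute‑force scan would be replaced by an approximate nearest‑neighbor index, but the scoring logic is the same.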
The fine‑ranking stage evolves from linear FTRL models to increasingly sophisticated deep models such as Wide&Deep, DeepFM, DCN, AFM, NFM, DIN, AutoInt, FiBiNet, and reinforcement‑learning‑based DRN, ultimately adopting a Wide&DCN model that improves CTR by about 10% over FTRL.
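The DCN half of the adopted Wide&DCN model builds explicit feature crosses layer by layer. A minimal sketch of one cross layer, following the standard DCN formulation x_{l+1} = x0 · (x_l ⊤ w) + b + x_l (the dimensions and random weights here are placeholders, not the production configuration):

```python
import numpy as np

def cross_layer(x0, x, w, b):
    """One DCN cross layer: combines the input x0 with the current layer x
    to form explicit bounded-degree feature interactions."""
    return x0 * (x @ w) + b + x

d = 4                                   # hypothetical feature dimension
rng = np.random.default_rng(0)
x0 = rng.normal(size=d)                 # embedded+concatenated input features
x = x0.copy()
for _ in range(3):                      # three stacked cross layers
    w = rng.normal(size=d)              # per-layer weight vector
    b = rng.normal(size=d)              # per-layer bias
    x = cross_layer(x0, x, w, b)
print(x.shape)                          # still (4,): crossing preserves dimension
```

Stacking l such layers yields interactions up to degree l+1 with only O(d) parameters per layer, which is why DCN is a cheap complement to the wide linear part.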
To address long‑document and user‑interest challenges, a large‑model recommendation framework is introduced, comprising an Article Encoder (feature embedding, BERT title encoder, LLM content encoder using ChatGLM‑6B), an Instant User Interest (IUI) module with cross‑attention over recent browsing sequences, and a Constant User Interest (CUI) module that incorporates user profiles via LLM encoders.
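The IUI module's cross‑attention can be sketched as standard scaled dot‑product attention, with the candidate article embedding as the query and the recently browsed articles as keys and values (dimensions and data below are illustrative assumptions, not the paper's):

```python
import numpy as np

def cross_attention(query, keys, values):
    """Scaled dot-product attention: the candidate article attends over
    the user's recent browsing sequence to produce an interest vector."""
    d = query.shape[-1]
    scores = keys @ query / np.sqrt(d)   # one score per browsed item
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()             # softmax over the sequence
    return weights @ values              # weighted sum, shape (d,)

rng = np.random.default_rng(1)
query = rng.normal(size=8)               # candidate article embedding
history = rng.normal(size=(5, 8))        # embeddings of 5 recently read articles
interest = cross_attention(query, history, history)
print(interest.shape)                    # (8,)
```

The resulting vector summarizes which parts of the recent history are relevant to the candidate, which is then combined with the CUI profile representation downstream.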
Extensive offline and online evaluations show that the proposed large‑model approach outperforms six SOTA baselines on AUC, MRR, and nDCG metrics, and ablation studies confirm the importance of both constant and instant user interest components.
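For reference, the two ranking metrics named above are straightforward to compute from per‑query relevance lists. A minimal sketch of MRR and nDCG (the example relevance lists are invented for illustration):

```python
import math

def mrr(ranked_relevance_lists):
    """Mean reciprocal rank: average of 1/rank of the first relevant item."""
    total = 0.0
    for rels in ranked_relevance_lists:
        rr = 0.0
        for i, rel in enumerate(rels, start=1):
            if rel:
                rr = 1.0 / i
                break
        total += rr
    return total / len(ranked_relevance_lists)

def ndcg(rels, k=None):
    """nDCG: DCG of the given ranking divided by DCG of the ideal ranking."""
    k = k or len(rels)
    dcg = sum(r / math.log2(i + 1) for i, r in enumerate(rels[:k], start=1))
    ideal = sorted(rels, reverse=True)
    idcg = sum(r / math.log2(i + 1) for i, r in enumerate(ideal[:k], start=1))
    return dcg / idcg if idcg > 0 else 0.0

print(mrr([[0, 1, 0], [1, 0, 0]]))       # (1/2 + 1) / 2 = 0.75
print(round(ndcg([0, 1, 1], k=3), 3))    # 0.693
```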
The article concludes by summarizing the system’s architecture, multi‑recall implementation, progressive ranking models, and the ongoing large‑model research aimed at better representing long technical documents and dynamic user interests.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.