JD.com’s Personalized Recommendation System: Architecture, Models, and Future Directions
The article explains how JD.com leverages big‑data and personalized recommendation algorithms across PC and mobile platforms, detailing its recall and ranking models, efficiency analysis, weekly algorithm iterations, and future AI‑driven optimizations that together contribute about 10% of its orders.
Using big data and personalized recommendation algorithms, JD.com shows different content to different users on both PC and mobile platforms, and these recommendations contribute roughly 10% of its orders. To explore the algorithms behind the platform’s “a thousand users, a thousand faces” approach, CSDN interviewed Liu Shangkun, Director of the Recommendation Search Department.
JD.com Recommendation System Three‑Step Process
At a high level, JD.com’s recommendation algorithm follows three steps: building recall models, analyzing the efficiency of those recall models, and applying a ranking model. Each step is considerably more complex in its actual implementation.
Recall Model
The recall model, which generates the candidate sets, works along three dimensions: behavior‑based recall, preference‑based recall, and region‑based recall.
Behavior‑Based Recall: Recommends related or similar items based on a user’s purchase behavior. For example, a Kindle buyer receives suggestions for Kindle cases rather than unrelated products, while consumables such as soap are recommended according to their purchase cycle.
Preference‑Based Recall: Uses user profiles built from cross‑device data (PC, mobile app, WeChat, QQ). The profiles combine brand, target audience, price, and interaction signals (clicks, purchases, follows, favorites) to determine which categories can be recommended to a user over the long term.
Region‑Based Recall: Divides the map into grid cells and uses purchase statistics for each cell. For example, users in Sanlitun, Beijing, show higher interest in playing cards and water, while users at suburban schools favor socks and drying racks. This approach mainly serves new users with limited behavior history.
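The region-based idea can be illustrated with a small sketch. This is not JD.com's implementation: the grid size `CELL_DEGREES`, the function names, and the `(lat, lon, item_id)` order format are all assumptions made for illustration, and the coordinates in the usage note are invented.

```python
from collections import Counter, defaultdict

# Hypothetical grid size in degrees (roughly 1 km); the article gives no value.
CELL_DEGREES = 0.01


def grid_cell(lat, lon, cell=CELL_DEGREES):
    """Map a coordinate to an integer grid cell."""
    return (int(lat // cell), int(lon // cell))


def build_region_index(orders):
    """Count purchases per grid cell.

    orders: iterable of (lat, lon, item_id) tuples (assumed format).
    Returns a dict mapping grid cell -> Counter of item ids.
    """
    index = defaultdict(Counter)
    for lat, lon, item in orders:
        index[grid_cell(lat, lon)][item] += 1
    return index


def region_recall(index, lat, lon, k=3):
    """Return up to k of the most purchased items in the user's grid cell."""
    counts = index.get(grid_cell(lat, lon), Counter())
    return [item for item, _ in counts.most_common(k)]
```

A new user with no history would be served from the statistics of whatever cell they fall in, for example `region_recall(index, 39.9335, 116.4552)`; a cell with no orders simply returns an empty list, at which point some other fallback would be needed.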
Recall Model Efficiency Analysis
Each recall sub‑model (online relevance, online similarity, offline relevance, offline similarity, recent hot‑sale brands, etc.) competes to supply recommendations for a given user. Model efficiency is judged by click‑through rate, share rate, GMV, and other metrics.
Liu notes that models built on real‑time user behavior (e.g., online relevance and similarity) generally perform better. JD.com therefore adopts a multi‑model fusion strategy, selecting the optimal combination per user to maximize overall traffic value.
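One simple way to picture efficiency-driven fusion is to order the recall models by an efficiency metric and fill the recommendation slots from the best models first. The sketch below assumes click-through rate as the only metric and uses additive smoothing; the function names, the smoothing priors, and the round-robin fill are illustrative choices, not the fusion strategy the article describes in detail.

```python
def smoothed_ctr(clicks, impressions, prior_clicks=1.0, prior_impressions=20.0):
    """CTR with additive smoothing so low-traffic models are not over-trusted."""
    return (clicks + prior_clicks) / (impressions + prior_impressions)


def fuse(candidates_by_model, stats_by_model, slots=4):
    """Fill `slots` positions, visiting models from highest to lowest smoothed CTR.

    candidates_by_model: model name -> ordered list of item ids
    stats_by_model: model name -> (clicks, impressions), e.g. for one user segment
    """
    ranked_models = sorted(
        candidates_by_model,
        key=lambda m: smoothed_ctr(*stats_by_model.get(m, (0, 0))),
        reverse=True,
    )
    result, seen = [], set()
    depth = 0
    # Round-robin over models in efficiency order, skipping duplicate items.
    while len(result) < slots and any(
        depth < len(candidates_by_model[m]) for m in ranked_models
    ):
        for m in ranked_models:
            items = candidates_by_model[m]
            if depth < len(items) and items[depth] not in seen and len(result) < slots:
                seen.add(items[depth])
                result.append(items[depth])
        depth += 1
    return result
```

Keeping the statistics per user or per user segment, rather than globally, is what would let the chosen combination differ from one user to the next.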
Recent efficiency gains stem from two models: “recent click” (recommending recently viewed items) and “cart abandonment” (recommending items added to the cart but not purchased), achieving conversion improvements of up to 100% and 5‑10% respectively.
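Both of these recall sources can be derived from a plain event log. The following is a minimal sketch under assumed conventions: events are `(timestamp, action, item_id)` tuples and the action names `"click"`, `"add_to_cart"`, and `"purchase"` are hypothetical.

```python
def recent_click_recall(events, k=5):
    """Return up to k distinct items, most recently clicked first.

    events: list of (timestamp, action, item_id) tuples (assumed format).
    """
    clicks = sorted(
        (e for e in events if e[1] == "click"), key=lambda e: e[0], reverse=True
    )
    out = []
    for _, _, item in clicks:
        if item not in out:
            out.append(item)
    return out[:k]


def cart_abandonment_recall(events):
    """Return items added to the cart but never purchased within the event window."""
    purchased = {item for _, action, item in events if action == "purchase"}
    out = []
    for _, action, item in sorted(events, key=lambda e: e[0]):
        if action == "add_to_cart" and item not in purchased and item not in out:
            out.append(item)
    return out
```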
Ranking Model
On top of model fusion, JD.com applies learning‑to‑rank (L2R) techniques, converting the ranking problem into a classification task. Feature weights are learned from interaction logs, and the L2R algorithm yields a 20% conversion lift.
Experiments with various approaches (logistic regression, Vowpal Wabbit, PMI, etc.) showed that logistic regression yielded only a 1% gain, whereas learning‑to‑rank delivered a 20% increase.
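The article does not say which learning-to-rank formulation JD.com uses. One common way to turn ranking into classification is the pairwise transform: within a session, each (clicked, unclicked) item pair becomes a difference vector labeled 1, and its negation is labeled 0. The sketch below applies that transform and fits a linear scorer with stochastic gradient descent on the logistic loss; the feature layout and hyperparameters are assumptions.

```python
import math


def make_pairs(sessions):
    """Build pairwise training examples from interaction logs.

    sessions: list of sessions, each a list of (features, clicked) tuples,
    where features is a list of floats (assumed format).
    """
    pairs = []
    for items in sessions:
        pos = [f for f, clicked in items if clicked]
        neg = [f for f, clicked in items if not clicked]
        for p in pos:
            for n in neg:
                diff = [a - b for a, b in zip(p, n)]
                pairs.append((diff, 1))                # clicked beats unclicked
                pairs.append(([-d for d in diff], 0))  # mirrored example
    return pairs


def train(pairs, dim, epochs=100, lr=0.1):
    """Learn feature weights by SGD on the logistic loss over the pairs."""
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in pairs:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for i in range(dim):
                w[i] += lr * (y - p) * x[i]
    return w


def rank(w, items):
    """Sort candidate feature vectors by learned score, best first."""
    return sorted(
        items, key=lambda f: sum(wi * xi for wi, xi in zip(w, f)), reverse=True
    )
```

The learned weights score individual items, so at serving time each candidate is scored once and sorted; the pairs are needed only during training.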
Weekly Iteration of Seven New Algorithms
Continuous iteration is essential as user preferences evolve. JD.com launches about seven new algorithm experiments each week, requiring a robust recommendation infrastructure built on HBase, Storm, Spark, and MapReduce.
Liu highlights Spark’s strong support for big‑data processing and its MLlib library, which simplifies the implementation of collaborative‑filtering algorithms commonly used in recommendation systems.
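To make the collaborative-filtering idea concrete without depending on a Spark cluster, here is a small item-based variant in plain Python. It is not Spark MLlib code and not JD.com's implementation; it assumes implicit feedback stored as a mapping from user id to the set of items that user interacted with, and uses cosine similarity over those sets.

```python
import math
from collections import defaultdict


def item_similarity(user_items):
    """Cosine similarity between items, computed from implicit feedback.

    user_items: user id -> set of item ids the user interacted with.
    """
    item_users = defaultdict(set)
    for user, items in user_items.items():
        for item in items:
            item_users[item].add(user)
    sims = defaultdict(dict)
    all_items = list(item_users)
    for i, a in enumerate(all_items):
        for b in all_items[i + 1:]:
            overlap = len(item_users[a] & item_users[b])
            if overlap:
                s = overlap / math.sqrt(len(item_users[a]) * len(item_users[b]))
                sims[a][b] = s
                sims[b][a] = s
    return sims


def recommend(user, user_items, sims, k=3):
    """Score unseen items by their summed similarity to the user's items."""
    seen = user_items[user]
    scores = defaultdict(float)
    for item in seen:
        for other, s in sims.get(item, {}).items():
            if other not in seen:
                scores[other] += s
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]
```

The pairwise loop is quadratic in the number of items, which is the kind of computation that distributed frameworks such as Spark are used to scale.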
Experience and Future Optimization
Because the personalized homepage launched less than a month ago, Liu believes there is still room for a 40‑50% improvement.
Key takeaways include:
Accumulate and refresh data to support iteration: JD.com leverages massive user and product data for both user‑based and item‑based collaborative filtering, and plans to incorporate Tencent’s social data for richer profiling.
Utilize open‑source tools: Real‑time behavior models and ranking algorithms rely heavily on Spark, and JD.com customizes open‑source algorithms to fit its business scenarios.
Efficiency as the benchmark: Models are evaluated using a combination of prior predictions and posterior analysis to identify the most effective model mix for each user.
Deep learning as the future direction: Feature selection remains challenging; enriching ranking features and applying DNNs are seen as the primary avenues for future gains. JD.com’s DNN Lab already contributes to the intelligent customer service robot.