New Precise Matching Techniques from JD’s SIGIR 2025 Papers
JD's retail technology team has five papers at SIGIR 2025. They cover graph neural networks, causal optimal transport, domain‑oriented relevance, multi‑objective bidword generation, and hierarchical user behavior modeling, all aimed at more precise matching in e‑commerce search, recommendation, and advertising.
Tech Insight is a JD retail technology column that continuously shares the latest research papers and technical findings. Recently, five JD retail‑tech papers were accepted to SIGIR 2025, an A‑class international conference on information retrieval with a 21.5% acceptance rate. These papers focus on the problem of “precise matching” in e‑commerce search, recommendation, and ad placement, aiming to help users find desired items faster and to improve ad click‑through‑rate (CTR) prediction.
Graph Isomorphism Network‑Based Cohort Modeling in Click‑Through Rate Prediction
Chinese Title: 基于图同构网络的群组建模在点击率预测中的应用
Download URL: https://dl.acm.org/doi/10.1145/3726302.3731936
Authors: Xuan Ma, Hao Peng, Jia Duan, Zhanhao Ye, Langlang Ye, Zehua Zhang, Jie He, Changping Peng, Zhangang Lin
Abstract: CTR prediction often suffers from cold‑start users lacking historical behavior. Existing encoder‑decoder approaches generate virtual behavior representations but are limited by oversimplified encoding of active users and restricted interest expression. This work proposes a Graph Isomorphism Network (GIN)‑based cohort modeling method that captures high‑order user‑item interactions, reduces embedding bias, and improves generalization. Experiments on public and industrial datasets show significant gains for both active and cold‑start users.
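To make the GIN idea concrete, here is a minimal sketch of the standard GIN update, h_v' = MLP((1 + eps) * h_v + sum of neighbor features), applied to a toy user‑item click graph. It is not the paper's model: the graph, the node names, the 2‑dimensional features, and the single identity‑weight layer standing in for the MLP are all assumptions made for illustration.

```python
from typing import Dict, List

Vector = List[float]


def gin_aggregate(features: Dict[str, Vector],
                  neighbors: Dict[str, List[str]],
                  eps: float = 0.0) -> Dict[str, Vector]:
    """GIN sum aggregation: (1 + eps) * h_v plus the sum of neighbor features."""
    out: Dict[str, Vector] = {}
    for node, h in features.items():
        agg = [(1.0 + eps) * x for x in h]
        for nb in neighbors.get(node, []):
            for i, x in enumerate(features[nb]):
                agg[i] += x
        out[node] = agg
    return out


def relu_linear(h: Vector, weight: List[Vector], bias: Vector) -> Vector:
    """One linear layer followed by ReLU, standing in for the GIN MLP."""
    return [max(0.0, sum(w * x for w, x in zip(row, h)) + b)
            for row, b in zip(weight, bias)]


def gin_layer(features: Dict[str, Vector],
              neighbors: Dict[str, List[str]],
              weight: List[Vector],
              bias: Vector,
              eps: float = 0.0) -> Dict[str, Vector]:
    """Aggregate over neighbors, then transform every node."""
    aggregated = gin_aggregate(features, neighbors, eps)
    return {node: relu_linear(h, weight, bias) for node, h in aggregated.items()}


# Hypothetical toy graph: a cold-start user with one click and no features of
# their own shares item_a with an active user.
features = {
    "u_active": [1.0, 0.0],
    "u_cold": [0.0, 0.0],
    "item_a": [0.0, 1.0],
    "item_b": [1.0, 1.0],
}
neighbors = {
    "u_active": ["item_a", "item_b"],
    "u_cold": ["item_a"],
    "item_a": ["u_active", "u_cold"],
    "item_b": ["u_active"],
}
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder weights, not learned
zero_bias = [0.0, 0.0]

layer1 = gin_layer(features, neighbors, identity, zero_bias)
layer2 = gin_layer(layer1, neighbors, identity, zero_bias)
print(layer1["u_cold"])  # [0.0, 1.0]
print(layer2["u_cold"])  # [1.0, 2.0]
```

After one layer the cold‑start user only reflects the clicked item. After two layers its representation also carries signal from the active user who clicked the same item, which is the kind of higher‑order user‑item interaction the abstract refers to.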
Post‑event Modeling via Causal Optimal Transport for CTR Prediction
Chinese Title: 基于因果最优传输的后验信息建模用于CTR预测
Download URL: https://dl.acm.org/doi/10.1145/3726302.3731942
Authors: Yizhou Sang, Congcong Liu, Yuying Chen, Zhiwei Fang, Xue Jiang, Changping Peng, Zhangang Lin, Ching Law, Jingping Shao
Abstract: Accurate CTR prediction relies on post‑event features (e.g., dwell time) that are unavailable at inference time, causing training‑inference mismatch and low coverage. The proposed Causal Optimal Transport (COT) framework generates pseudo post‑event features via semi‑supervised labeling, shapes their distribution with a Causal Distribution Shaper, and aligns distributions using optimal transport. Experiments on real data demonstrate that COT improves user interest modeling and mitigates bias, leading to superior CTR performance.
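The distribution‑alignment step can be illustrated with entropy‑regularized optimal transport solved by Sinkhorn iterations. This is a generic sketch, not the COT framework itself: the abstract does not say which solver the paper uses, and the normalized dwell‑time values, the squared‑difference cost, and the regularization strength below are assumptions.

```python
import math
from typing import List

Matrix = List[List[float]]


def sinkhorn(cost: Matrix, a: List[float], b: List[float],
             reg: float = 0.1, n_iters: int = 500) -> Matrix:
    """Entropy-regularized OT plan between histograms a (rows) and b (columns)."""
    n, m = len(a), len(b)
    # Gibbs kernel: cheaper pairs get larger weights.
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Hypothetical normalized dwell times: generated (pseudo) vs. observed (real).
pseudo = [0.2, 0.5, 0.9]
real = [0.3, 0.8]
cost = [[(p - r) ** 2 for r in real] for p in pseudo]
a = [1.0 / len(pseudo)] * len(pseudo)  # uniform mass on pseudo samples
b = [1.0 / len(real)] * len(real)      # uniform mass on real samples

plan = sinkhorn(cost, a, b)
# Transport cost under the plan: a scalar measure of how far apart the two
# sample sets are, which a training loop could use as an alignment loss.
ot_cost = sum(plan[i][j] * cost[i][j]
              for i in range(len(a)) for j in range(len(b)))
print(round(sum(map(sum, plan)), 6))
print(ot_cost)
```

Each entry `plan[i][j]` is the share of mass moved from pseudo sample `i` to real sample `j`. The row and column sums of the plan recover `a` and `b`, so the plan is a soft matching between the two distributions rather than a hard assignment.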
ADORE: Autonomous Domain‑Oriented Relevance Engine for E‑commerce
Chinese Title: 基于领域自适应的电商相关性判别系统
Download URL: https://dl.acm.org/doi/10.1145/3726302.3731944
Authors: Ming Pang, Chunyuan Yuan, Xiaoyu He, Zheng Fang, Donghao Xie, Fanyi Qu, Xue Jiang, Changping Peng, Zhangang Lin, Zheng Luo, Jingping Shao
Abstract: To address the data scarcity and weak inference ability of shallow online relevance models, ADORE introduces a chain‑of‑thought large model that automatically generates domain‑specific hard samples and aligns them with online user behavior via a KTO reinforcement‑learning algorithm. It also builds an error‑type‑aware generator for adversarial samples. Knowledge is transferred to shallow models through key attribute extraction, yielding significant improvements in relevance metrics and ad revenue in large‑scale online A/B tests.
Multi‑objective Aligned Bidword Generation Model for E‑commerce Search Advertising
Chinese Title: 多目标对齐广告买词生成模型用于电商搜索广告
Download URL: https://arxiv.org/abs/2506.03827
Authors: Zhenhui Liu, Chunyuan Yuan, Ming Pang, Zheng Fang, Li Yuan, Xue Jiang, Changping Peng, Zhangang Lin, Zheng Luo, Jingping Shao
Abstract: Search advertising suffers from long‑tail queries that lack matching keywords, reducing retrieval efficiency. The proposed MoBGM combines a discriminator, generator, and preference‑alignment module to jointly optimize relevance, authenticity, and platform revenue. Extensive offline and online experiments show MoBGM outperforms state‑of‑the‑art methods and delivers substantial commercial value.
Hierarchical User Long‑term Behavior Modeling for Click‑Through Rate Prediction
Chinese Title: 层次化用户长期行为建模在点击率预估中的应用
Download URL: https://dl.acm.org/doi/10.1145/3726302.3730207
Authors: Mao Pan, Xuanhua Yang, Nan Qiao, Dongyue Wang, Feng Mei, Xiwei Zhao, Sulong Xu
Abstract: Transformer‑based CTR models struggle with long user behavior sequences under strict latency constraints. This work proposes an end‑to‑end hierarchical behavior modeling network (HBM) that routes long‑term actions into multiple interest clusters, selects top‑k interests via a fine‑grained interest network, and refines them with a Transformer. Online A/B tests on JD’s recommendation platform show large performance gains.
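The first two stages described above (routing behaviors into interest clusters, then keeping the top‑k interests for a target item) can be sketched as follows. This is only an illustration under stated assumptions: the fixed centroids, dot‑product routing, mean pooling, and dot‑product scoring are hypothetical stand‑ins for HBM's learned components, and the final Transformer refinement is omitted.

```python
from typing import List

Vector = List[float]


def dot(x: Vector, y: Vector) -> float:
    return sum(a * b for a, b in zip(x, y))


def route_to_clusters(behaviors: List[Vector],
                      centroids: List[Vector]) -> List[Vector]:
    """Assign each behavior to its best-matching centroid, then mean-pool."""
    buckets: List[List[Vector]] = [[] for _ in centroids]
    for behavior in behaviors:
        best = max(range(len(centroids)),
                   key=lambda c: dot(behavior, centroids[c]))
        buckets[best].append(behavior)
    dim = len(centroids[0])
    pooled: List[Vector] = []
    for members in buckets:
        if members:
            pooled.append([sum(m[d] for m in members) / len(members)
                           for d in range(dim)])
        else:
            pooled.append([0.0] * dim)  # empty cluster contributes nothing
    return pooled


def select_top_k(interests: List[Vector], target: Vector, k: int) -> List[int]:
    """Return indices of the k interests that score highest against the target."""
    ranked = sorted(range(len(interests)),
                    key=lambda i: dot(interests[i], target),
                    reverse=True)
    return ranked[:k]


# Hypothetical 2-d behavior embeddings from a long user history.
behaviors = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9], [-1.0, 0.0]]
centroids = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]  # assumed, not learned here
target_item = [0.0, 1.0]

interests = route_to_clusters(behaviors, centroids)
print(select_top_k(interests, target_item, k=2))  # [1, 0]
```

The latency benefit comes from the shape of the computation: the expensive sequence model only sees `k` pooled interest vectors instead of the full behavior sequence, so its cost no longer grows with the length of the user's history.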
All five papers demonstrate how cutting‑edge AI techniques can overcome technical bottlenecks in complex e‑commerce scenarios and have been validated to drive growth in real‑world JD business environments.
JD Cloud Developers
JD Cloud Developers (Developer of JD Technology) is a JD Technology Group platform offering technical sharing and communication for AI, cloud computing, IoT and related developers. It publishes JD product technical information, industry content, and tech event news. Embrace technology and partner with developers to envision the future.
