New Precise Matching Techniques from JD’s SIGIR 2025 Papers

JD's retail technology team presents five SIGIR 2025 papers covering graph‑neural‑network cohort modeling, causal optimal transport, domain‑oriented relevance, multi‑objective bidword generation, and hierarchical user behavior modeling, all aimed at more precise matching in e‑commerce search, recommendation, and advertising.

JD Cloud Developers

Tech Insight is a JD retail technology column that continuously shares the latest research papers and technical findings. Recently, five JD retail‑tech papers were accepted to SIGIR 2025, an A‑class international conference on information retrieval with a 21.5% acceptance rate. These papers focus on the problem of “precise matching” in e‑commerce search, recommendation, and ad placement, aiming to help users find desired items faster and to improve ad click‑through‑rate (CTR) prediction.

Graph Isomorphism Network‑Based Cohort Modeling in Click‑Through Rate Prediction

Chinese Title: 基于图同构网络的群组建模在点击率预测中的应用

Download URL: https://dl.acm.org/doi/10.1145/3726302.3731936

Authors: Xuan Ma, Hao Peng, Jia Duan, Zhanhao Ye, Langlang Ye, Zehua Zhang, Jie He, Changping Peng, Zhangang Lin

Abstract: CTR prediction often suffers from cold‑start users lacking historical behavior. Existing encoder‑decoder approaches generate virtual behavior representations but are limited by oversimplified encoding of active users and restricted interest expression. This work proposes a Graph Isomorphism Network (GIN)‑based cohort modeling method that captures high‑order user‑item interactions, reduces embedding bias, and improves generalization. Experiments on public and industrial datasets show significant gains for both active and cold‑start users.
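The idea of capturing high‑order user‑item interactions with a GIN can be illustrated on a toy graph. The sketch below applies the standard GIN update, h_v' = MLP((1 + eps) * h_v + sum of neighbor features), to a hypothetical three‑node user‑item graph. The node names, feature values, and the single linear‑plus‑ReLU layer standing in for the MLP are assumptions for illustration, not the paper's architecture.

```python
from typing import Dict, List

Vector = List[float]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def linear(x: Vector, weight: List[Vector], bias: Vector) -> Vector:
    # weight has shape (out_dim, in_dim); returns weight @ x + bias
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def gin_layer(
    features: Dict[str, Vector],
    neighbors: Dict[str, List[str]],
    weight: List[Vector],
    bias: Vector,
    eps: float = 0.0,
) -> Dict[str, Vector]:
    """One GIN update: MLP((1 + eps) * h_v + sum_{u in N(v)} h_u)."""
    updated = {}
    for node, h in features.items():
        agg = [(1.0 + eps) * v for v in h]
        for nb in neighbors.get(node, []):
            agg = [a + b for a, b in zip(agg, features[nb])]
        # A single linear + ReLU layer stands in for the MLP (assumption).
        updated[node] = relu(linear(agg, weight, bias))
    return updated


# Hypothetical toy graph: two users who both interacted with one item.
features = {"user_a": [1.0, 0.0], "user_b": [0.0, 1.0], "item_x": [0.5, 0.5]}
neighbors = {
    "user_a": ["item_x"],
    "user_b": ["item_x"],
    "item_x": ["user_a", "user_b"],
}
identity = [[1.0, 0.0], [0.0, 1.0]]
zero_bias = [0.0, 0.0]

hidden_1 = gin_layer(features, neighbors, identity, zero_bias)
hidden_2 = gin_layer(hidden_1, neighbors, identity, zero_bias)
```

After the second layer, the representation of user_a includes information from user_b through their shared item, which is the two‑hop (high‑order) signal a cold‑start user could borrow from a cohort of similar users.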

Figure for GIN cohort modeling

Post‑event Modeling via Causal Optimal Transport for CTR Prediction

Chinese Title: 基于因果最优传输的后验信息建模用于CTR预测

Download URL: https://dl.acm.org/doi/10.1145/3726302.3731942

Authors: Yizhou Sang, Congcong Liu, Yuying Chen, Zhiwei Fang, Xue Jiang, Changping Peng, Zhangang Lin, Ching Law, Jingping Shao

Abstract: Accurate CTR prediction relies on post‑event features (e.g., dwell time) that are unavailable at inference time, causing training‑inference mismatch and low coverage. The proposed Causal Optimal Transport (COT) framework generates pseudo post‑event features via semi‑supervised labeling, shapes their distribution with a Causal Distribution Shaper, and aligns distributions using optimal transport. Experiments on real data demonstrate that COT improves user interest modeling and mitigates bias, leading to superior CTR performance.
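The distribution‑alignment step can be illustrated with a small optimal transport problem. The sketch below uses entropic‑regularized Sinkhorn iterations to compute a transport plan between a histogram of pseudo post‑event features and a histogram of observed ones. The choice of Sinkhorn as the solver, the three dwell‑time buckets, the histogram values, and the absolute‑difference cost are all assumptions for illustration; the source does not specify how COT solves the transport problem, and the Causal Distribution Shaper is not modeled here.

```python
import math
from typing import List


def sinkhorn(
    a: List[float],
    b: List[float],
    cost: List[List[float]],
    reg: float = 1.0,
    n_iters: int = 200,
) -> List[List[float]]:
    """Entropic-regularized optimal transport plan between histograms a and b."""
    n, m = len(a), len(b)
    # Gibbs kernel derived from the cost matrix.
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Hypothetical dwell-time buckets (short, medium, long).
pseudo_hist = [0.5, 0.3, 0.2]    # distribution of generated pseudo features
observed_hist = [0.2, 0.3, 0.5]  # distribution of logged post-event features
cost = [[abs(i - j) for j in range(3)] for i in range(3)]

plan = sinkhorn(pseudo_hist, observed_hist, cost)
transport_cost = sum(plan[i][j] * cost[i][j] for i in range(3) for j in range(3))
```

Each entry plan[i][j] is the mass moved from pseudo bucket i to observed bucket j, and transport_cost is the kind of quantity a training loss could penalize to pull the two distributions together.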

Figure for COT framework

ADORE: Autonomous Domain‑Oriented Relevance Engine for E‑commerce

Chinese Title: 基于领域自适应的电商相关性判别系统

Download URL: https://dl.acm.org/doi/10.1145/3726302.3731944

Authors: Ming Pang, Chunyuan Yuan, Xiaoyu He, Zheng Fang, Donghao Xie, Fanyi Qu, Xue Jiang, Changping Peng, Zhangang Lin, Zheng Luo, Jingping Shao

Abstract: To address the data scarcity and weak inference ability of shallow online relevance models, ADORE introduces a chain‑of‑thought large model that automatically generates domain‑specific hard samples and aligns them with online user behavior via a KTO reinforcement‑learning algorithm. It also builds an error‑type‑aware generator for adversarial samples. Knowledge is transferred to shallow models through key attribute extraction, yielding significant improvements in relevance metrics and ad revenue in large‑scale online A/B tests.
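To make the alignment step more concrete, the sketch below computes a simplified per‑sample KTO‑style loss. It reads KTO as Kahneman‑Tversky Optimization, which is an assumption, since the source gives only the acronym. The function name, the default hyperparameters, and the use of a constant reference point (in the published method this is a KL estimate computed over a batch) are also simplifications for illustration, not details from the ADORE paper.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def kto_sample_loss(
    policy_logp: float,
    ref_logp: float,
    desirable: bool,
    kl_ref: float = 0.0,
    beta: float = 0.1,
    lambda_d: float = 1.0,
    lambda_u: float = 1.0,
) -> float:
    """Simplified KTO-style loss for one (query, generated sample) pair.

    policy_logp / ref_logp: log-probability of the sample under the model
    being tuned and under a frozen reference model.
    desirable: True if online user behavior supports the sample's label.
    kl_ref: reference point, held constant here (assumption).
    """
    implied_reward = policy_logp - ref_logp
    if desirable:
        # Loss falls as the policy raises the likelihood of desirable samples.
        return lambda_d * (1.0 - sigmoid(beta * (implied_reward - kl_ref)))
    # Loss falls as the policy lowers the likelihood of undesirable samples.
    return lambda_u * (1.0 - sigmoid(beta * (kl_ref - implied_reward)))


# Hypothetical log-probabilities for one desirable and one undesirable sample.
loss_good_before = kto_sample_loss(-5.0, -5.0, desirable=True)
loss_good_after = kto_sample_loss(-3.0, -5.0, desirable=True)
loss_bad_before = kto_sample_loss(-5.0, -5.0, desirable=False)
loss_bad_after = kto_sample_loss(-8.0, -5.0, desirable=False)
```

Unlike pairwise preference methods, this form only needs a binary desirable or undesirable signal per sample, which fits behavior logs where paired comparisons are rarely available.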

Figure for ADORE system

Multi‑objective Aligned Bidword Generation Model for E‑commerce Search Advertising

Chinese Title: 多目标对齐广告买词生成模型用于电商搜索广告

Download URL: https://arxiv.org/abs/2506.03827

Authors: Zhenhui Liu, Chunyuan Yuan, Ming Pang, Zheng Fang, Li Yuan, Xue Jiang, Changping Peng, Zhangang Lin, Zheng Luo, Jingping Shao

Abstract: Search advertising suffers from long‑tail queries that lack matching keywords, reducing retrieval efficiency. The proposed Multi‑objective aligned Bidword Generation Model (MoBGM) combines a discriminator, a generator, and a preference‑alignment module to jointly optimize relevance, authenticity, and platform revenue. Extensive offline and online experiments show that MoBGM outperforms state‑of‑the‑art methods and delivers substantial commercial value.
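One way to picture how several objectives could feed a preference‑alignment step is a scalarized reward over candidate bidwords. The sketch below combines per‑objective scores with a weighted sum and picks the best and worst candidates as a preference pair. The weighted‑sum form, the weights, the candidate bidwords, and their scores are all hypothetical; the source does not describe how MoBGM actually combines its objectives.

```python
from typing import Dict, List, Tuple

# Hypothetical objective weights (assumption, not from the paper).
WEIGHTS: Dict[str, float] = {"relevance": 0.5, "authenticity": 0.3, "revenue": 0.2}


def combined_reward(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of per-objective scores, each assumed to lie in [0, 1]."""
    return sum(weights[name] * scores[name] for name in weights)


def build_preference_pair(
    candidates: List[Dict], weights: Dict[str, float]
) -> Tuple[str, str]:
    """Return (chosen, rejected) bidwords ranked by combined reward."""
    ranked = sorted(
        candidates, key=lambda c: combined_reward(c["scores"], weights), reverse=True
    )
    return ranked[0]["bidword"], ranked[-1]["bidword"]


# Hypothetical generator outputs for the query "bluetooth earphones",
# with scores such as a discriminator might assign.
candidates = [
    {
        "bidword": "wireless earbuds",
        "scores": {"relevance": 0.9, "authenticity": 0.8, "revenue": 0.6},
    },
    {
        "bidword": "bluetooth headset sale",
        "scores": {"relevance": 0.7, "authenticity": 0.9, "revenue": 0.8},
    },
    {
        "bidword": "cheap phone case",
        "scores": {"relevance": 0.2, "authenticity": 0.9, "revenue": 0.9},
    },
]

chosen, rejected = build_preference_pair(candidates, WEIGHTS)
```

Here the high‑revenue but irrelevant candidate ends up as the rejected example, which shows why revenue cannot be optimized in isolation from relevance.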

Figure for MoBGM

Hierarchical User Long‑term Behavior Modeling for Click‑Through Rate Prediction

Chinese Title: 层次化用户长期行为建模在点击率预估中的应用

Download URL: https://dl.acm.org/doi/10.1145/3726302.3730207

Authors: Mao Pan, Xuanhua Yang, Nan Qiao, Dongyue Wang, Feng Mei, Xiwei Zhao, Sulong Xu

Abstract: Transformer‑based CTR models struggle with long user behavior sequences under strict latency constraints. This work proposes an end‑to‑end hierarchical behavior modeling network (HBM) that routes long‑term actions into multiple interest clusters, selects top‑k interests via a fine‑grained interest network, and refines them with a Transformer. Online A/B tests on JD’s recommendation platform show large performance gains.
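The route‑then‑select pattern described in the abstract can be sketched in a few lines. The example below assigns each behavior embedding to its closest centroid, mean‑pools each cluster into an interest vector, and keeps the top‑k interests by dot‑product similarity to the target item. Fixed centroids, dot‑product routing and scoring, mean pooling, and all embedding values are assumptions for illustration; the paper's learned routing and its Transformer refinement stage are not modeled.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def route_to_clusters(behaviors: List[Vector], centroids: List[Vector]) -> Dict[int, List[Vector]]:
    """Assign each behavior embedding to the centroid with the highest dot product."""
    clusters: Dict[int, List[Vector]] = {i: [] for i in range(len(centroids))}
    for emb in behaviors:
        best = max(range(len(centroids)), key=lambda i: dot(emb, centroids[i]))
        clusters[best].append(emb)
    return clusters


def mean_pool(vectors: List[Vector]) -> Vector:
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def select_top_k_interests(
    behaviors: List[Vector], centroids: List[Vector], target: Vector, k: int
) -> List[Tuple[int, Vector]]:
    """Pool each non-empty cluster and keep the k most similar to the target item."""
    clusters = route_to_clusters(behaviors, centroids)
    interests = {i: mean_pool(members) for i, members in clusters.items() if members}
    ranked = sorted(interests, key=lambda i: dot(interests[i], target), reverse=True)
    return [(i, interests[i]) for i in ranked[:k]]


# Hypothetical 2-d embeddings for a user's long-term behaviors.
centroids = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
behaviors = [[0.9, 0.1], [0.8, 0.0], [0.1, 0.7], [-0.9, 0.2]]
target_item = [1.0, 0.2]

top_interests = select_top_k_interests(behaviors, centroids, target_item, k=2)
```

Only the k selected interest vectors would then be passed to a heavier sequence model, so its cost scales with k rather than with the full length of the behavior history.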

Figure for hierarchical behavior model

All five papers demonstrate how cutting‑edge AI techniques can overcome technical bottlenecks in complex e‑commerce scenarios and have been validated to drive growth in real‑world JD business environments.

Tags: e-commerce, Advertising, CTR prediction, graph neural networks, relevance modeling, causal optimal transport