
Exploring Spatiotemporal Features and Adaptive Context Modeling for Online Food Recommendation (DCAM)

The paper introduces DCAM, a dynamic context‑adaptation model that automatically selects the most effective spatiotemporal features for online food recommendation. It shows that adding more features, or applying naïve self‑attention, does not guarantee gains, and it achieves superior offline AUC and online CTR improvements over existing state‑of‑the‑art methods.

Ele.me Technology

The paper "Exploring the Spatiotemporal Features of Online Food Recommendation Service" has been accepted to the SIGIR 2023 Industry Track. The full paper is available at https://dl.acm.org/doi/abs/10.1145/3539618.3591853.

01 Introduction

Previous works (StEN and BASM) attempted to model spatiotemporal characteristics in local life scenarios, but they either covered too many aspects with a single model or ignored the selection of the most effective spatiotemporal features. This motivated a deeper investigation of which spatiotemporal features are truly beneficial.

We propose a third work, DCAM, which focuses on exploring spatiotemporal features and sequences. Experiments show that more spatiotemporal features do not always lead to better performance; careful selection is required. Moreover, naive use of Self‑Attention on spatiotemporal sequences can amplify noise, harming downstream Target‑Attention performance. Therefore, we design a context‑adaptive model that automatically selects important spatiotemporal features.

02 Motivation

In local‑life recommendation, it is unclear which spatiotemporal features should be activated or reinforced, and whether more features are always better. Self‑Attention performs poorly in this domain compared with other fields, raising the question of whether an adaptive model can address these issues.

03 Spatiotemporal Feature Exploration

We examine six spatiotemporal attributes: "Hour", "Timetype", "Weekday", "Geohash", "CityID", and "AOIID". The baseline model (shown below) combines User, Item, User‑Behavior Sequence, and Context (the six spatiotemporal features). Context is fed into a Bias Net to prevent it from being masked by other features.

Experimental results indicate:

Adding spatiotemporal features does not guarantee performance gains; selective feature choice is effective.

Bias Net improves AUC for context features, but the gain is limited for GAUC.

Temporal features contribute more than spatial ones; some spatial features (e.g., AOIID) underperform due to low coverage.
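The Bias Net idea above can be sketched in a few lines: the context features feed a small side network whose scalar output is added to the main tower's logit, so weak context signals are not drowned out during concatenation. All weights, dimensions, and function names here are illustrative assumptions, not the paper's implementation:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bias_net(context, W, b):
    """Tiny side network over context features: one ReLU hidden layer,
    then a linear readout producing a scalar bias logit."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, context)) + bi)
              for row, bi in zip(W["h"], b["h"])]
    return sum(w * h for w, h in zip(W["out"], hidden)) + b["out"]

def predict_ctr(main_logit, context, W, b):
    # The context-derived bias is ADDED to the main tower's logit,
    # so context still influences the prediction even if its embedding
    # is overwhelmed elsewhere in the network.
    return sigmoid(main_logit + bias_net(context, W, b))

# Toy usage with hand-picked weights
W = {"h": [[0.2, 0.1, -0.3], [0.5, -0.2, 0.4]], "out": [0.3, -0.1]}
b = {"h": [0.1, 0.0], "out": 0.05}
p = predict_ctr(0.0, [1.0, 0.0, 0.5], W, b)
```

The additive-bias design is what keeps the context signal from being "masked": even a small bias logit shifts the final sigmoid directly, independent of the main tower.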

04 Spatiotemporal Sequence Exploration

Various sequence models were tested. Findings include:

Self‑Attention yields the worst results, even worse than GRU, though GRU is computationally expensive.

Target Attention alone improves performance, but stacking Self‑Attention before it degrades results because pairwise matching amplifies noise in rapidly changing spatiotemporal sequences.

Target Attention captures user intent but discards temporal order; adding simple time‑difference features works better than Position Encoding.
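As a hedged sketch of this finding, target attention uses the candidate item as the query over the behavior sequence, and each behavior embedding is augmented with its raw time gap rather than a positional encoding. The dimensions and the exact augmentation scheme are illustrative assumptions, not the paper's formulation:

```python
import math

def target_attention(target, seq, time_diffs):
    """Pool a behavior sequence with the candidate item as the query.
    Appending each behavior's time gap to its embedding restores the
    temporal signal that plain attention pooling throws away."""
    keys = [emb + [dt] for emb, dt in zip(seq, time_diffs)]
    q = target + [0.0]                       # pad query to the augmented width
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(len(q))
              for k in keys]
    m = max(scores)                          # softmax with max-shift for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    pooled = [sum(w * k[i] for w, k in zip(weights, keys))
              for i in range(len(keys[0]))]
    return pooled, weights

# Toy usage: two past behaviors, the first similar to the target
pooled, weights = target_attention([1.0, 0.0],
                                   [[1.0, 0.0], [0.0, 1.0]],
                                   [0.5, 2.0])
```

Because there is a single query (the candidate item) rather than all-pairs matching, this pooling is less prone to the noise amplification observed when Self‑Attention is stacked in front of it.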

05 Spatiotemporal Context Adaptive Model

We introduce the Dynamic Context Adaptation Model (DCAM), which inserts adaptive modules to select spatiotemporal features automatically. Input and spatiotemporal features are passed through an MLP, followed by a sigmoid activation to produce feature weights. Top‑K (K=4, determined by grid search) weights are selected and used to modulate the input via Hadamard product before concatenation. The loss is cross‑entropy.
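The gating step can be sketched as follows. Here the per-feature scores are assumed to come from the MLP described above; the feature dimensions and names are illustrative, and K=4 follows the grid-search result reported in the paper:

```python
import math

def stfam_gate(context_feats, scores, k=4):
    """Spatiotemporal feature adaptation sketch: squash each feature's
    MLP score with a sigmoid, keep the top-k gates, zero out the rest,
    and scale the surviving embeddings by their gate (Hadamard product)."""
    gates = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    topk = set(sorted(range(len(gates)), key=gates.__getitem__, reverse=True)[:k])
    return [[g * x for x in feat] if i in topk else [0.0] * len(feat)
            for i, (feat, g) in enumerate(zip(context_feats, gates))]

# Six context features ("Hour", "Timetype", "Weekday", "Geohash",
# "CityID", "AOIID") with toy 2-d embeddings and toy MLP scores
feats = [[1.0, 1.0]] * 6
scores = [2.0, -1.0, 0.5, 3.0, 0.1, -0.5]
gated = stfam_gate(feats, scores, k=4)
```

The gate-scaled embeddings would then be concatenated with the user, item, and sequence representations before the prediction layers, so low-value context features (here the two zeroed ones) contribute nothing downstream.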

Spatiotemporal Feature Adaptation Module (StFAM) computes weights for each context feature, selects the top‑K, and integrates them with other inputs.

The final architecture combines the weighted context with user, item, and sequence features.

5.2 Offline Experiments

We evaluated DCAM on two datasets (a theme dataset and a shop dataset). DCAM consistently outperforms state‑of‑the‑art spatiotemporal models in AUC on both.

5.3 Online Experiments

Live A/B tests show that DCAM yields a stable click‑through‑rate improvement over other SOTA models.

06 Summary

This work fills a gap in spatiotemporal research for local‑life recommendation. By systematically exploring spatiotemporal features and sequences, we propose DCAM, an adaptive model that selects useful context features and demonstrates superior offline and online performance. Future work includes systematic study of feature cross‑interaction and broader applicability.

Additional related explorations are listed with links to earlier blog posts.

Machine Learning · recommendation · context adaptation · DCAM · online food ordering · Spatiotemporal
Written by

Ele.me Technology

Creating a better life through technology
