
AdaScene: Adaptive Scenario Modeling for Multi‑Scene Recommendation in Meituan DSP

AdaScene is an adaptive scenario modeling framework for Meituan’s DSP that targets negative transfer and extreme data sparsity across heterogeneous display scenes. It combines a knowledge‑transfer network, built on scene‑specific feature adaptation and gated expert sharing, with a gradient‑based scene‑aggregation process that clusters similar scenarios. The reported result is consistent performance gains for both high‑traffic and low‑traffic channels.

Meituan Technology Team

Sep 14, 2023


1 Introduction

Meituan Delivery DSP (Demand‑Side Platform) recommends and delivers ad material on external media. Rapid business expansion has brought a large number of heterogeneous display scenarios (e.g., splash screen, interstitial, feed, pop‑up). This diversity leads to two major problems: (1) negative transfer, caused by the long‑tail distribution of traffic across channels, and (2) extreme data sparsity within each fine‑grained scenario, which hampers model convergence.

2 Adaptive Scenario Modeling

Multi‑scene modeling follows the classic Embedding + Mixture‑of‑Experts (MoE) paradigm, where the overall loss is the weighted sum of losses from all scenarios. To address the above challenges, we propose AdaScene, which consists of two complementary modules: a knowledge‑transfer module and a scene‑aggregation module.
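As a minimal sketch of the weighted‑sum objective described above, the snippet below combines per‑scene losses into one scalar. The use of binary cross‑entropy as the per‑scene loss, the function names, the scene names, and the weight values are all illustrative assumptions rather than details from the original system.

```python
import math

def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean log loss over one scene's batch (assumed per-scene loss)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

def multi_scene_loss(batches, scene_weights):
    """Weighted sum of per-scene losses; batches maps scene -> (labels, probs)."""
    return sum(scene_weights[scene] * binary_cross_entropy(y, p)
               for scene, (y, p) in batches.items())

# Hypothetical scenes, predictions, and weights for illustration only.
batches = {
    "feed": ([1, 0], [0.8, 0.2]),
    "splash": ([1], [0.5]),
}
weights = {"feed": 1.0, "splash": 0.5}
loss = multi_scene_loss(batches, weights)
print(round(loss, 4))
```

With fixed weights, a high‑traffic scene contributes far more examples and therefore dominates the shared gradients, which is one way to see where negative transfer for tail scenes can come from.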

2.1 Adaptive Scenario Knowledge Transfer

We design an Adaptive Knowledge Transfer Network (AKTN). The network contains:

Scene Feature Adaptation: a Squeeze‑and‑Excitation‑based Scene Adaptation Layer that learns per‑scenario weights for raw features, allowing the model to focus on the most informative features for each scenario.

Scene Knowledge Transfer: a GRU‑based Scene Transfer Layer that gates the flow of information between global experts and scenario‑specific experts, thereby controlling the degree of knowledge sharing and mitigating negative transfer.
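The two layers above can be sketched roughly as follows. This is a simplified illustration, not the AKTN implementation: the squeeze step (mean over each field embedding), the bottleneck size, the random parameters, and the use of only a GRU‑style update gate (rather than a full GRU cell) are assumptions, and every function name here is hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def se_scene_adaptation(field_embeddings, w_down, w_up):
    """Reweight feature fields using one scene's SE parameters."""
    # Squeeze: summarize each field embedding by its mean.
    squeezed = [sum(e) / len(e) for e in field_embeddings]
    # Excitation: bottleneck MLP (ReLU, then sigmoid) gives one weight per field.
    hidden = [max(0.0, h) for h in matvec(w_down, squeezed)]
    field_weights = [sigmoid(z) for z in matvec(w_up, hidden)]
    # Rescale every field embedding by its learned weight.
    adapted = [[w * x for x in e] for w, e in zip(field_weights, field_embeddings)]
    return adapted, field_weights

def gated_transfer(global_out, scene_out, w_gate):
    """GRU-style update gate mixing global and scene-specific expert outputs."""
    z = [sigmoid(v) for v in matvec(w_gate, global_out + scene_out)]
    return [zi * g + (1.0 - zi) * s for zi, g, s in zip(z, global_out, scene_out)]

rng = random.Random(0)
num_fields, reduced = 3, 2
# Hypothetical per-scene SE parameters; in a real model these would be learned.
se_params = {
    scene: (random_matrix(reduced, num_fields, rng),
            random_matrix(num_fields, reduced, rng))
    for scene in ("feed", "splash")
}
fields = [[0.2, -0.1], [1.0, 0.5], [-0.3, 0.8]]
adapted, field_weights = se_scene_adaptation(fields, *se_params["splash"])

global_out, scene_out = [0.4, -0.2], [0.1, 0.6]
w_gate = random_matrix(2, 4, rng)
fused = gated_transfer(global_out, scene_out, w_gate)
```

Because the gate value lies strictly between 0 and 1, each fused dimension is an interpolation between the global and the scene‑specific output, so a scene can lean on shared knowledge in some dimensions and ignore it in others.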

In addition, we replace the deterministic expert selection with a sparse expert network that automatically chooses K experts for each scenario via a differentiable gating mechanism. Experiments (Table 1) show consistent gains on both head and tail channels, especially for low‑traffic scenarios.
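To make the idea of choosing K experts per scenario concrete, here is a small sketch of top‑K softmax gating, a scheme commonly used in sparse Mixture‑of‑Experts models. The article does not specify which gating mechanism AdaScene uses, so this is an assumed stand‑in; the logits and expert outputs are made‑up values.

```python
import math

def top_k_gate(logits, k):
    """Keep the k largest logits, softmax over them, and zero out the rest."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - peak) for i in top}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(logits))]

def sparse_moe_output(expert_outputs, gate):
    """Combine only the selected experts, weighted by their gate values."""
    dim = len(expert_outputs[0])
    return [sum(g * out[d] for g, out in zip(gate, expert_outputs) if g > 0.0)
            for d in range(dim)]

# Hypothetical gate logits for one scene over four experts.
scene_logits = [2.0, -1.0, 0.5, 1.0]
gate = top_k_gate(scene_logits, k=2)
expert_outputs = [[1.0, 0.0], [5.0, 5.0], [9.0, 9.0], [0.0, 1.0]]
mixed = sparse_moe_output(expert_outputs, gate)
```

In this sketch gradients flow only through the selected experts' gate values; the hard top‑K selection itself is not differentiable, which is why practical systems often add noise or a relaxation on top of it.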

2.2 Adaptive Scenario Aggregation

Because each fine‑grained scenario is extremely sparse, we cluster scenes to form a manageable number of scene groups. We adopt a two‑stage strategy:

Stage 1: compute pairwise similarity between scenes (using gradient‑based metrics) and find a grouping that maximizes overall similarity.

Stage 2: fine‑tune the model on the grouped scenes with a cross‑entropy loss.
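Stage 1 can be illustrated with a toy example. The sketch below assumes that similarity is the cosine between per‑scene gradients of the shared parameters and that "overall similarity" means the sum of within‑group pairwise similarities; both are assumptions, as are the gradient vectors and the fixed group count. The exhaustive search is only viable for a handful of scenes and stands in for whatever search procedure the original system uses.

```python
import itertools
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def best_grouping(grads, num_groups):
    """Brute-force the partition with the highest within-group similarity."""
    scenes = sorted(grads)
    pairs = list(itertools.combinations(range(len(scenes)), 2))
    sim = {(i, j): cosine(grads[scenes[i]], grads[scenes[j]]) for i, j in pairs}
    best_score, best_labels = -math.inf, None
    for labels in itertools.product(range(num_groups), repeat=len(scenes)):
        if len(set(labels)) < num_groups:
            continue  # require every group to be non-empty
        score = sum(sim[i, j] for i, j in pairs if labels[i] == labels[j])
        if score > best_score:
            best_score, best_labels = score, labels
    groups = {}
    for scene, label in zip(scenes, best_labels):
        groups.setdefault(label, []).append(scene)
    return sorted(groups.values()), best_score

# Hypothetical gradients of each scene's loss w.r.t. two shared parameters.
scene_grads = {
    "feed": [1.0, 0.1],
    "interstitial": [0.9, 0.2],
    "popup": [-0.1, 0.9],
    "splash": [-0.2, 1.0],
}
groups, score = best_grouping(scene_grads, num_groups=2)
print(groups)
```

Scenes whose gradients point in similar directions end up in the same group, so their updates to shared parameters tend to reinforce each other during the Stage 2 fine‑tuning.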

We explore three similarity‑estimation methods:

Gradient Regulation: add a regularization term based on the distance between gradients of scene‑specific losses.

Lookahead Strategy: update shared parameters with the gradient of one scene and measure the gain on other scenes to capture asymmetric influence.

Meta Weights: learn a scene‑wise correlation matrix (meta‑weights) and optimize it with a MAML‑style bi‑level objective.
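Of the three methods, the Lookahead Strategy is the easiest to show in a few lines. The sketch below uses a toy quadratic loss per scene so that gradients have a closed form; the loss, the learning rate, the target vectors, and the definition of gain as relative loss reduction are all assumptions made for illustration.

```python
def loss(theta, target):
    """Toy per-scene loss: half the squared distance to the scene's optimum."""
    return 0.5 * sum((t - c) ** 2 for t, c in zip(theta, target))

def grad(theta, target):
    return [t - c for t, c in zip(theta, target)]

def lookahead_affinity(theta, targets, lr=0.1):
    """affinity[(src, dst)]: relative loss drop on dst after one step on src."""
    affinity = {}
    for src, src_target in targets.items():
        g = grad(theta, src_target)
        stepped = [t - lr * gi for t, gi in zip(theta, g)]  # look ahead one step
        for dst, dst_target in targets.items():
            if dst != src:
                before = loss(theta, dst_target)
                after = loss(stepped, dst_target)
                affinity[(src, dst)] = 1.0 - after / before
    return affinity

# Hypothetical shared parameters and per-scene optima.
theta = [0.0, 0.0]
targets = {
    "feed": [1.0, 0.0],
    "interstitial": [2.0, 0.5],
    "splash": [-1.0, 0.0],
}
aff = lookahead_affinity(theta, targets)
for pair, gain in sorted(aff.items()):
    print(pair, round(gain, 4))
```

A positive entry means a step on the source scene also helps the destination scene, a negative entry signals interference, and the matrix is generally not symmetric, which is the asymmetric influence the strategy is meant to capture.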

Extensive offline experiments (Tables 2‑5, Figures 5‑9) demonstrate that gradient‑based aggregation consistently outperforms rule‑based grouping, and that increasing the number of scene groups improves GAUC up to a saturation point.

3 Conclusion and Outlook

We have presented AdaScene, an adaptive scenario modeling framework that alleviates negative transfer and data sparsity in large‑scale recommendation systems. Future work includes exploring finer scene partitioning (e.g., media × position × time) and developing end‑to‑end traffic‑aware aggregation methods.


