
Search Advertising and Ad Recall: Business Logic, Semantic Relevance, and Deep Learning Models at 360

This article explains the architecture of 360's search advertising system, detailing its ad recall, ranking, and display modules, illustrates exact‑match and semantic recall methods with a case study, and reviews the evolution from feature‑engineered GBDT models to deep learning approaches such as DSSM, ESIM, and BERT, including data preparation, training, and performance evaluation.

DataFunTalk

360's search advertising platform is divided into three logical modules: ad recall, ad ranking, and ad display. The recall module decides which ads to fetch based on index and relevance calculations; the ranking module estimates click‑through rate and runs the bidding mechanism; the display module selects the creative to show.

A concrete case demonstrates the workflow: two e‑commerce advertisers submit keywords and bids for a vacuum cleaner; the recall module matches the query to both ads; the ranking module computes CTR and quality scores to determine eligibility and position; finally, the display module chooses the appropriate creative.
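The three-module flow for the vacuum-cleaner case can be sketched as follows. This is a minimal illustration under assumed mechanics: the eCPM-style auction (CTR times bid), the exact dictionary-based index, and the sample bids and CTRs are all hypothetical, not 360's actual implementation.

```python
def recall(query, index):
    """Recall: fetch ads whose bid keyword matches the query."""
    return index.get(query, [])

def rank(candidates):
    """Ranking: order eligible ads by estimated CTR x bid (eCPM-style score)."""
    return sorted(candidates, key=lambda ad: ad["ctr"] * ad["bid"], reverse=True)

def display(ranked):
    """Display: choose a creative for each winning ad."""
    return [ad["creatives"][0] for ad in ranked]

# Two e-commerce advertisers bidding on the same keyword (illustrative numbers).
index = {
    "vacuum cleaner": [
        {"advertiser": "A", "bid": 2.0, "ctr": 0.03, "creatives": ["A-banner"]},
        {"advertiser": "B", "bid": 1.5, "ctr": 0.05, "creatives": ["B-banner"]},
    ]
}
shown = display(rank(recall("vacuum cleaner", index)))
# B outranks A despite the lower bid: 1.5 * 0.05 = 0.075 > 2.0 * 0.03 = 0.060
```

Note that the lower bidder can still win the top slot when its estimated CTR is high enough, which is exactly why the ranking module combines CTR estimation with the bid rather than sorting on bid alone.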

Recall operates via two strategies. Exact‑match recall performs string matching, including tokenization and re‑ordering, to retrieve ads whose keywords precisely align with the query. Semantic recall uses a lookup table generated by an offline mining funnel that combines various data‑mining techniques (random walk, text retrieval, etc.) to produce candidate query‑bidword pairs, which are then filtered by an online relevance module.
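The tokenization-and-re-ordering step of exact-match recall can be sketched by normalizing both keywords and queries into a canonical token form, so that re-ordered variants of the same phrase collide in the index. The normalization details here (whitespace tokenization, lowercasing, token sorting) are illustrative assumptions.

```python
def normalize(text):
    # Lowercase, tokenize, and sort tokens so that re-ordered
    # queries ("cleaner vacuum" vs "vacuum cleaner") map to one key.
    return tuple(sorted(text.lower().split()))

# Build the keyword index offline from advertisers' bid keywords.
keyword_index = {}
for keyword in ["vacuum cleaner", "robot vacuum"]:
    keyword_index.setdefault(normalize(keyword), []).append(keyword)

def exact_recall(query):
    """Return bid keywords that exactly match the query after normalization."""
    return keyword_index.get(normalize(query), [])

exact_recall("Cleaner vacuum")  # matches "vacuum cleaner" after re-ordering
```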

The second part of the talk covers semantic relevance and deep learning. Early methods relied on feature engineering plus GBDT, using text similarity, embedding similarity, BM25, and other search engine features, but struggled with true semantic understanding. Subsequent deep models include:

DSSM (2013): encodes queries and documents independently with an FNN/CNN/RNN tower and scores similarity with cosine distance passed through a sigmoid.

ESIM (2016): two‑layer bidirectional LSTM with soft attention, widely used in QA and customer service.

BERT (2018): a pre-trained Transformer encoder fine-tuned for the task, achieving state-of-the-art semantic-similarity performance.
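The DSSM scoring scheme above can be sketched in a few lines. The hash-based bag-of-words encoder below is a stand-in for the real FNN/CNN/RNN tower; what the sketch preserves is the twin-tower property (query and document are encoded independently, so document vectors can be precomputed offline) and the cosine-plus-sigmoid scoring head.

```python
import math

def encode(tokens, dim=8):
    # Stand-in encoder: hash tokens into a small bag-of-words vector.
    # A real DSSM tower would be a learned FNN/CNN/RNN.
    vec = [0.0] * dim
    for tok in tokens:
        vec[hash(tok) % dim] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def relevance(query, bidword):
    # Sigmoid over the cosine similarity of the two tower outputs.
    sim = cosine(encode(query.split()), encode(bidword.split()))
    return 1.0 / (1.0 + math.exp(-sim))
```

ESIM and BERT differ precisely in abandoning this independence: they attend across the query-bidword pair jointly, which improves accuracy at the cost of precomputation.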

Model parameters: DSSM and ESIM each have ~2 M parameters, while BERT exceeds 100 M due to 12 transformer layers with 12‑head attention; an extended version adds four more layers, reaching ~130 M parameters.
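The 100 M+ figure for BERT can be checked with back-of-the-envelope arithmetic, assuming a BERT-base-style configuration (12 layers, hidden size 768, FFN size 3072, ~30k WordPiece vocabulary); exact totals vary with vocabulary size and task heads.

```python
# Rough parameter count for a BERT-base-style Transformer encoder.
hidden, layers, ffn, vocab, max_pos = 768, 12, 3072, 30522, 512

embeddings = vocab * hidden + max_pos * hidden + 2 * hidden  # token + position + segment
attention = 4 * (hidden * hidden + hidden)                   # Q, K, V, output projections
feed_forward = hidden * ffn + ffn + ffn * hidden + hidden    # two dense layers
layer_norms = 2 * 2 * hidden                                 # two LayerNorms per layer
per_layer = attention + feed_forward + layer_norms

total = embeddings + layers * per_layer
print(f"{total / 1e6:.0f}M parameters")  # ~109M; each extra layer adds ~7M
```

By this estimate each additional Transformer layer contributes roughly 7 M parameters, which is consistent with the extended-depth variant landing well above the base model.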

Data preparation involves two datasets: a large set (~11 M samples) derived from ad click logs with heuristic filtering, and a small, high‑quality set (~150 k samples) manually labeled. The large set is used for initial training or fine‑tuning, followed by a second round on the small set (4/5 for training, 1/5 for evaluation).
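The 4/5-1/5 split on the small labeled set can be sketched with a deterministic, content-keyed partition; the hashing scheme here is an assumption chosen so that a given query-bidword pair always lands in the same bucket across runs.

```python
import hashlib

def split_4_1(samples):
    # Deterministic 4/5 train, 1/5 evaluation split keyed on sample
    # content, so the partition is reproducible across runs.
    train, evaluation = [], []
    for sample in samples:
        bucket = int(hashlib.md5(sample.encode()).hexdigest(), 16) % 5
        (evaluation if bucket == 0 else train).append(sample)
    return train, evaluation
```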

Performance is measured by AUC. DSSM and ESIM achieve similar AUCs, with ESIM slightly better; BERT reaches 86 % AUC. Adding BERT embeddings to a tree‑boosted model with additional features pushes AUC to 87 %, but the added complexity led to the decision to deploy BERT alone for relevance service.
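AUC as used here is the probability that a randomly chosen relevant pair is scored above a randomly chosen irrelevant one. A minimal pairwise implementation makes the metric concrete (production evaluation would use a library routine, but the definition is the same):

```python
def auc(labels, scores):
    # Rank-based AUC: fraction of (positive, negative) pairs where the
    # positive outscores the negative, counting ties as half a win.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2])  # one inversion in four pairs -> 0.75
```

An AUC of 0.86 therefore means a relevant query-bidword pair outranks an irrelevant one 86 % of the time, which is why the one-point gain from stacking BERT embeddings into a boosted-tree ensemble was judged not worth the extra serving complexity.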

The offline mining pipeline follows a funnel: extract pairs from logs or text retrieval, predict relevance, predict CTR, and finally publish the filtered results to an online KV system, which powers the semantic recall module.
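The funnel shape of the pipeline can be sketched as successive filters, each pruning the survivors of the previous stage before publication to the KV store. The threshold values and the model interfaces below are hypothetical placeholders for the relevance and CTR predictors described above.

```python
def mining_funnel(pairs, relevance_fn, ctr_fn,
                  rel_threshold=0.8, ctr_threshold=0.01):
    # Funnel: each stage prunes the survivors of the previous one;
    # whatever remains is published to the online KV store keyed by query.
    kv = {}
    for query, bidword in pairs:
        if relevance_fn(query, bidword) < rel_threshold:
            continue  # dropped by the relevance model
        if ctr_fn(query, bidword) < ctr_threshold:
            continue  # dropped by the CTR model
        kv.setdefault(query, []).append(bidword)
    return kv
```

At serving time, semantic recall then reduces to a single KV lookup on the incoming query, keeping the expensive model inference entirely offline.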

Overall, 360's ad recall combines exact‑match and semantic‑based strategies, leveraging a sophisticated offline mining process and deep learning models to deliver high‑quality ad relevance.

Tags: deep learning, DSSM, BERT, ESIM, search advertising, semantic relevance, ad recall
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
