
Intelligent Generation of Search Engine Advertising Keywords: Methods, Frameworks, and Future Directions

This article gives an overview of automated techniques for generating high-quality search engine advertising keywords. It covers the background, traditional manual methods, and intelligent keyword expansion built on NLP components (tokenization, POS tagging, BiLSTM-CRF, BERT classification, and semantic matching with DSSM), along with additional approaches such as query suggestion and synonym rewriting.

Ctrip Technology

Apr 30, 2020


The rapid international expansion of Ctrip has led to extensive overseas search‑engine advertising, where effective keyword selection, pricing, and creative design are crucial for improving ROI.

Traditional keyword generation relies on manual brainstorming (expansion) and mining existing search queries (catching), both of which are labor‑intensive and lack fine‑grained targeting.

To address these issues, an intelligent keyword generation pipeline is proposed, consisting of three core modules:

1. Product Information Supply Module – Stores product data (e.g., hotel, flight, city accommodation) and performs cleaning, tokenization, and part-of-speech (POS) tagging. Ambiguous geographic entities are resolved with a Geohash-based structured dictionary, and gaps in dictionary coverage are mitigated with data augmentation and a BiLSTM-CRF model.

2. Search Habit Summarization Module – Analyzes user search queries to extract common search patterns. It employs named‑entity recognition, tokenization, and POS tagging to map queries such as “Shanghai hotel accommodation” or “Hongqiao Airport inn discount” to structured entities (city, hotel name, demand terms).

3. Keyword Generation Module – Generates candidate keywords from product data and the derived rules, then filters ambiguous terms using three strategies: (a) string‑match overlap, (b) click‑through distribution across multiple products, and (c) semantic similarity scores computed by a DSSM model.
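The flow through the three modules can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the article's implementation: the tiny `ENTITY_DICT`, the slot names (`city`, `demand`, `modifier`), the sample queries, and the 0.8 click-share threshold are all invented for the example, and a plain dictionary lookup stands in for tokenization, POS tagging, and the BiLSTM-CRF model. Of the three filters, only (b), click distribution across products, is shown.

```python
from collections import Counter
from itertools import product

# Hypothetical entity dictionary; a stand-in for tokenization, POS tagging, and NER.
ENTITY_DICT = {
    "shanghai": "city",
    "tokyo": "city",
    "hotel": "demand",
    "accommodation": "demand",
    "inn": "demand",
    "discount": "modifier",
}

def query_to_pattern(query):
    """Map a query to a slot pattern, e.g. 'Shanghai hotel' -> ('city', 'demand')."""
    return tuple(ENTITY_DICT.get(tok, "other") for tok in query.lower().split())

def summarize_patterns(queries, min_count=2):
    """Keep patterns seen at least min_count times that contain only known slots."""
    counts = Counter(query_to_pattern(q) for q in queries)
    return [p for p, c in counts.items() if c >= min_count and "other" not in p]

def generate_keywords(product_slots, patterns):
    """Fill each pattern with every combination of the product's slot values."""
    keywords = set()
    for pattern in patterns:
        if not all(slot in product_slots for slot in pattern):
            continue  # the product has no value for some slot in this pattern
        for combo in product(*(product_slots[slot] for slot in pattern)):
            keywords.add(" ".join(combo))
    return sorted(keywords)

def is_ambiguous(click_counts, min_share=0.8):
    """Flag a keyword whose clicks are spread over several products."""
    total = sum(click_counts.values())
    if total == 0:
        return False  # no click evidence either way
    return max(click_counts.values()) / total < min_share

queries = [
    "Shanghai hotel",
    "Tokyo hotel",
    "Shanghai inn discount",
    "Tokyo accommodation discount",
    "cheap flights",
]
patterns = summarize_patterns(queries)
osaka = {"city": ["osaka"], "demand": ["hotel", "inn"], "modifier": ["discount"]}
candidates = generate_keywords(osaka, patterns)
```

A keyword whose clicks concentrate on one product (for example 90 of 100) passes the filter, while one split evenly between two products is flagged as ambiguous. The string-match and DSSM-score filters would slot in alongside `is_ambiguous` as further predicates.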

For the “catching” (keyword mining) scenario, a binary classifier fine-tuned from BERT first determines whether a query is accommodation-related. Intent recognition is then treated as semantic matching between the query and a product, using a two-stage approach: DSSM-based recall computed offline, followed by BERT-based re-ranking of the recalled candidates.
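The recall-then-re-rank structure can be illustrated as follows. This is a sketch under assumptions, not the production system: the two-dimensional vectors are hard-coded toy values standing in for DSSM query and product embeddings, `token_overlap_score` is a hypothetical placeholder for the fine-tuned BERT scorer, and the product IDs and names are invented.

```python
import math

def cosine(u, v):
    """Cosine similarity, the score DSSM uses between query and document vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recall(query_vec, product_vecs, top_k=2):
    """Stage 1: cheap recall over precomputed product vectors."""
    ranked = sorted(
        product_vecs.items(),
        key=lambda kv: cosine(query_vec, kv[1]),
        reverse=True,
    )
    return [pid for pid, _ in ranked[:top_k]]

def rerank(query, candidates, score_fn):
    """Stage 2: re-order the recalled candidates with a costlier scorer."""
    return sorted(candidates, key=lambda pid: score_fn(query, pid), reverse=True)

# Toy data (assumed): precomputed product embeddings and display names.
PRODUCT_VECS = {
    "hotel_bund": [0.9, 0.1],
    "hotel_airport": [0.7, 0.6],
    "flight_pvg": [0.1, 0.9],
}
PRODUCT_NAMES = {
    "hotel_bund": "bund riverside hotel",
    "hotel_airport": "hongqiao airport hotel",
    "flight_pvg": "pudong flight",
}

def token_overlap_score(query, pid):
    """Hypothetical stand-in for a BERT relevance score: shared token count."""
    return len(set(query.lower().split()) & set(PRODUCT_NAMES[pid].split()))

recalled = recall([0.8, 0.3], PRODUCT_VECS, top_k=2)
final = rerank("Hongqiao Airport inn", recalled, token_overlap_score)
```

The design point is that stage 1 only needs vector arithmetic on embeddings computed ahead of time, so it can scan many products, while the expensive pairwise scorer in stage 2 runs on just the few candidates that survive recall.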

Additional explored methods include:

• Query‑suggestion based keyword generation, leveraging popularity, relevance, and diversity of suggested queries.

• Synonym rewriting techniques such as query rewriting, click‑graph based rewriting, and grammatical substitution to expand high‑performing keywords.
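As one illustration of click-graph based rewriting, queries can be treated as near-synonyms when users who issue them click largely the same products. The sketch below is an assumption about how such a method might look, not the article's algorithm: it uses Jaccard overlap of clicked-product sets as the similarity measure, and the click log, product IDs, and 0.5 threshold are invented.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def click_graph_rewrites(seed, clicks, min_sim=0.5):
    """Return queries whose clicked products overlap strongly with the seed's."""
    seed_products = clicks.get(seed, set())
    scored = []
    for query, products in clicks.items():
        if query == seed:
            continue
        sim = jaccard(seed_products, products)
        if sim >= min_sim:
            scored.append((sim, query))
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [query for _, query in scored]

# Toy click log (assumed): query -> set of clicked product IDs.
CLICKS = {
    "shanghai hotel": {"h1", "h2", "h3"},
    "shanghai accommodation": {"h1", "h2", "h3"},
    "shanghai inn": {"h2", "h3"},
    "shanghai flights": {"f1"},
}

rewrites = click_graph_rewrites("shanghai hotel", CLICKS)
```

Here "shanghai accommodation" and "shanghai inn" are returned as rewrite candidates for the seed keyword, while "shanghai flights" is excluded because it shares no clicked products.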

The article concludes with two directions for future work: extending the system to languages beyond Chinese, English, Japanese, and Korean, and investigating keyword generation driven entirely by machine understanding, which may produce terms that are effective even though people cannot readily interpret them.


