Can AI Generate Perfect Short Product Titles? A Multi-Source Pointer Network Approach
This article examines the challenge of generating concise e‑commerce product short titles. It formalizes the task as constrained text summarization, introduces a Multi‑Source Pointer Network (MS‑Pointer) that pairs a title encoder with a background‑knowledge encoder, and reports its superiority in extensive offline and online experiments.
Background
Product titles are a crucial communication medium between sellers and buyers on e‑commerce platforms. In mobile‑first environments like Taobao, overly long titles are truncated, harming user experience and click‑through rates. This work focuses on generating short product titles that retain core attributes.
Problem Formalization
Short title generation is treated as a specialized text‑summarization task with two strict constraints: (1) no irrelevant information may be introduced, i.e., every output word must come from the source; (2) key product information, such as brand and category, must be retained.
Model
Pointer Network
Pointer Networks use attention over the encoder states to copy words directly from the input sequence, which avoids out‑of‑vocabulary issues and guarantees purely extractive output.
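As a toy illustration of this copying mechanism, the NumPy sketch below (made‑up shapes; a single matrix `W` stands in for the usual additive‑attention parameters) turns one decoder step into a distribution over source positions and copies the highest‑scoring word:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def pointer_step(decoder_state, encoder_states, source_tokens, W):
    # Toy additive attention: score each source position by mixing its
    # encoder state with the current decoder state.
    scores = np.array([
        np.tanh(np.concatenate([h, decoder_state]) @ W).sum()
        for h in encoder_states
    ])
    attn = softmax(scores)                  # distribution over source positions
    return source_tokens[int(attn.argmax())], attn

rng = np.random.default_rng(0)
enc = rng.normal(size=(4, 8))               # 4 source tokens, hidden size 8
dec = rng.normal(size=8)
W = rng.normal(size=(16, 8))
tokens = ["Nike", "Air", "running", "shoes"]
word, attn = pointer_step(dec, enc, tokens, W)
```

Because the output word is always picked from `source_tokens`, the model can never emit a token absent from the input, which is exactly the extractive guarantee the task demands.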
Multi‑Source Pointer Network
The proposed MS‑Pointer adds a second encoder that processes background knowledge (brand name and category). At each decoding step, a soft gating weight λ blends the attention distributions of the title encoder and the knowledge encoder into a single copy distribution over source words.
The model is trained by minimizing the cross‑entropy, i.e., the negative log‑likelihood of each reference word under this mixed copy distribution.
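A minimal sketch of the mixing step, assuming (with hypothetical names and toy probabilities) that both attention distributions are already computed: the gate λ blends them into one distribution over the union of source words, and the loss is the negative log‑probability of the reference word:

```python
import numpy as np

def ms_pointer_distribution(attn_title, attn_know, lam,
                            vocab, title_tokens, know_tokens):
    """Mix two copy distributions with gate lam into one distribution
    over the union of source words (a toy version of the idea)."""
    probs = {w: 0.0 for w in vocab}
    for p, w in zip(attn_title, title_tokens):
        probs[w] += lam * p                 # mass copied from the title
    for p, w in zip(attn_know, know_tokens):
        probs[w] += (1.0 - lam) * p         # mass copied from the knowledge
    return probs

title_tokens = ["Nike", "Air", "running", "shoes"]
know_tokens = ["Nike", "shoes"]             # brand + category
vocab = sorted(set(title_tokens + know_tokens))
attn_t = np.array([0.5, 0.1, 0.2, 0.2])     # title-encoder attention
attn_k = np.array([0.7, 0.3])               # knowledge-encoder attention
lam = 0.8                                   # gate: mostly copy from the title
p = ms_pointer_distribution(attn_t, attn_k, lam,
                            vocab, title_tokens, know_tokens)
# Training minimizes the negative log-likelihood of the reference word:
loss = -np.log(p["Nike"])
```

Because both inputs are proper distributions, the mixture sums to 1 for any λ in [0, 1]; words appearing in both sources (here "Nike") accumulate probability mass from both encoders, which is what helps brand retention.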
Experimental Results
Dataset Construction
Training data were collected from Taobao’s "Good Goods" recommendation channel, filtered to keep extractive short titles of roughly 10 characters that retain brand information. Over 500,000 samples were gathered initially and later expanded to more than 5 million.
Data Processing
Most punctuation is retained to preserve brand/model tokens.
All digits appearing in brand or model names are kept.
Characters are processed without word segmentation.
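These rules amount to a very light character‑level pipeline. A hedged sketch (hypothetical helper name, illustrative title) of what such preprocessing might look like:

```python
def to_char_sequence(title: str) -> list:
    """Character-level tokenization with no word segmentation:
    digits and punctuation are kept so brand/model tokens such as
    "Max-270" or "iPhone 8" survive intact; only whitespace is dropped."""
    return [ch for ch in title if not ch.isspace()]

seq = to_char_sequence("Nike Air Max-270 跑鞋")
```

Skipping word segmentation sidesteps segmentation errors on brand names and mixed Chinese/Latin/digit strings, at the cost of longer input sequences.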
Baseline & Model Settings
Baselines include simple truncation, TextRank, deletion‑based seq2seq, standard Pointer Network, and various concatenation or generation variants. All models use LSTM encoders/decoders with hyper‑parameters detailed in the original paper.
Evaluation Metrics
Automatic metrics (BLEU, ROUGE, METEOR) and human evaluation (accuracy of core product words, category completeness, readability, and information completeness) were employed.
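For reference, ROUGE‑1 F1 reduces to unigram overlap between candidate and reference. The toy re‑implementation below (not the official scorer) shows the computation on character sequences, matching the character‑level setup above:

```python
from collections import Counter

def rouge_1_f(candidate, reference):
    """Unigram-overlap ROUGE-1 F1 (minimal re-implementation for illustration)."""
    c, r = Counter(candidate), Counter(reference)
    overlap = sum((c & r).values())         # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

score = rouge_1_f(list("Nike跑鞋"), list("Nike气垫跑鞋"))
```

Here every candidate character appears in the reference (precision 1.0) but the reference has two extra characters (recall 0.75), so F1 lands at 6/7 ≈ 0.857.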
Brand Retention Experiment
MS‑Pointer achieved a brand‑error rate of less than 0.1 % on the test set, outperforming all baselines.
Human Evaluation
300 randomly sampled short titles were judged on four dimensions; MS‑Pointer scores were comparable to manually written titles.
Online Experiment
A/B test comparing truncated two‑line titles with generated 8‑10‑character titles showed a significant CTR lift, especially for electronics where brand and model suffice.
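Whether such a CTR lift is statistically significant is typically checked with a two‑proportion z‑test. The sketch below uses invented counts purely to illustrate the computation (the paper’s actual traffic numbers are not reproduced here):

```python
import math

def ctr_z_test(clicks_a, views_a, clicks_b, views_b):
    """Two-proportion z-test for an A/B CTR comparison.
    Positive z favors variant B; |z| > 1.96 is significant at the 5% level."""
    p_a, p_b = clicks_a / views_a, clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    return (p_b - p_a) / se

z = ctr_z_test(clicks_a=4_800, views_a=100_000,   # A: truncated titles (made up)
               clicks_b=5_300, views_b=100_000)   # B: generated short titles (made up)
```

With these illustrative counts the 4.8% vs. 5.3% CTR difference yields z ≈ 5, far beyond the 1.96 threshold, so a lift of this size on this traffic volume would be significant.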
References
Alexander M. Rush, Sumit Chopra, and Jason Weston. 2015. A Neural Attention Model for Abstractive Sentence Summarization. EMNLP.
Sumit Chopra, Michael Auli, and Alexander M. Rush. 2016. Abstractive Sentence Summarization with Attentive Recurrent Neural Networks. NAACL.
Abigail See, Peter J. Liu, and Christopher D. Manning. 2017. Get To The Point: Summarization with Pointer‑Generator Networks. ACL.
Oriol Vinyals, Meire Fortunato, and Navdeep Jaitly. 2015. Pointer Networks. NIPS.
Katja Filippova et al. 2015. Sentence Compression by Deletion with LSTMs. EMNLP.
Hongyan Jing. 2002. Using Hidden Markov Modeling to Decompose Human‑written Summaries. Computational Linguistics.
Rada Mihalcea and Paul Tarau. 2004. TextRank: Bringing Order into Texts. EMNLP.
Kishore Papineni et al. 2002. BLEU: a Method for Automatic Evaluation of Machine Translation. ACL.
Chin‑Yew Lin. 2004. ROUGE: A Package for Automatic Evaluation of Summaries. ACL Workshop on Text Summarization Branches Out.
Satanjeev Banerjee and Alon Lavie. 2005. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments. ACL Workshop.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
