Can AI Generate Perfect Short Product Titles? A Multi-Source Pointer Network Approach

This article investigates the challenge of generating concise e‑commerce product short titles by formalizing it as a constrained text‑summarization task, proposes a Multi‑Source Pointer Network that leverages both title and background knowledge encoders, and demonstrates its superiority through extensive offline and online experiments.

Alibaba Cloud Developer

Background

Product titles are a crucial communication medium between sellers and buyers on e‑commerce platforms. In mobile‑first environments like Taobao, overly long titles are truncated, harming user experience and click‑through rates. This work focuses on generating short product titles that retain core attributes.

Problem Formalization

Short title generation is treated as a specialized text‑summarization task with two strict constraints: (1) no irrelevant information may be introduced, so the short title must be composed only of words copied from the original title; (2) key product information, such as the brand name and commodity name, must be retained.

Model

Pointer Network

Pointer Networks use an attention mechanism to select words directly from the input sequence rather than generating from a fixed vocabulary, which avoids out‑of‑vocabulary issues and guarantees purely extractive summaries.
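As a minimal illustration of the copy mechanism (a sketch under simplifying assumptions, not the paper's implementation), a pointer decoder normalizes attention scores over the input positions and copies the highest‑scoring token, so the output vocabulary is exactly the input sequence:

```python
import math

def softmax(scores):
    # Normalize raw attention scores into a probability distribution.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pointer_step(input_tokens, attention_scores):
    """One decoding step: the word is copied from the input, so an
    out-of-vocabulary word can never be produced."""
    probs = softmax(attention_scores)
    best = max(range(len(input_tokens)), key=lambda i: probs[i])
    return input_tokens[best], probs

# Hypothetical title tokens and attention scores for one decoding step.
tokens = ["Nike", "Air", "Zoom", "running", "shoes", "2018", "new"]
word, probs = pointer_step(tokens, [2.0, 0.1, 0.3, 0.8, 1.5, 0.2, 0.1])
```

Because the distribution is defined over input positions only, every decoded word is guaranteed to appear in the original title.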

Multi‑Source Pointer Network

The proposed MS‑Pointer adds a second encoder that processes background knowledge (the product's brand name and commodity name). At each decoding step, a soft gating weight λ mixes the attention (copy) distributions of the title encoder and the knowledge encoder, deciding which source to copy from.

The model is trained with the cross‑entropy (negative log‑likelihood) of the reference short‑title words under this combined copy distribution.
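The gating step can be sketched as follows (a simplified illustration with hypothetical attention values, not the paper's exact computation): the final copy distribution is a λ‑weighted mixture of the two encoders' attention distributions.

```python
def combine_distributions(p_title, p_know, lam):
    """Mix two copy distributions with gate lam:
    P(w) = lam * P_title(w) + (1 - lam) * P_know(w)."""
    words = set(p_title) | set(p_know)
    return {w: lam * p_title.get(w, 0.0) + (1 - lam) * p_know.get(w, 0.0)
            for w in words}

# Hypothetical attention distributions at one decoding step.
p_title = {"Nike": 0.5, "running": 0.3, "shoes": 0.2}  # over the title
p_know  = {"Nike": 0.9, "sneakers": 0.1}               # over brand/category
p = combine_distributions(p_title, p_know, lam=0.6)
# "Nike": 0.6 * 0.5 + 0.4 * 0.9 = 0.66
```

Since both inputs are valid probability distributions, the mixture is one as well, so it can be fed directly into the cross‑entropy loss.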

Experimental Results

Dataset Construction

Training data were collected from Taobao’s "Good Goods" recommendation channel, filtered to ensure extractive short titles of about 10 characters that retain brand information. Over 500,000 samples were initially gathered, later expanded to more than 5 million.

Data Processing

Most punctuation is retained to preserve brand/model tokens.

All digits appearing in brand or model names are kept.

Titles are processed at the character level, without word segmentation.
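The three rules above can be sketched as a small preprocessing function (a hypothetical illustration of the stated rules, not the paper's actual pipeline; the set of dropped decorative marks is an assumption):

```python
def preprocess(title):
    """Character-level tokenization: no word segmentation; digits and
    most punctuation are kept so brand/model tokens survive intact."""
    drop = set("~^*")  # assumption: only a few decorative marks removed
    return [ch for ch in title if ch not in drop and not ch.isspace()]

chars = preprocess("Nike Air-Max 270*")
```

Keeping digits and the hyphen preserves model identifiers like "Air-Max 270", which a word segmenter might otherwise split or discard.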

Baseline & Model Settings

Baselines include simple truncation, TextRank, deletion‑based seq2seq, standard Pointer Network, and various concatenation or generation variants. All models use LSTM encoders/decoders with hyper‑parameters detailed in the original paper.

Evaluation Metrics

Automatic metrics (BLEU, ROUGE, METEOR) and human evaluation (accuracy of core product words, category completeness, readability, and information completeness) were employed.
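To make the automatic evaluation concrete, here is a simplified unigram‑overlap ROUGE‑1 F1 in pure Python (a sketch of one metric; the paper uses standard metric implementations, and the example titles are hypothetical):

```python
from collections import Counter

def rouge1_f(candidate, reference):
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    c, r = Counter(candidate), Counter(reference)
    overlap = sum((c & r).values())  # multiset intersection
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

# Character-level sequences, matching the data-processing setup above.
cand = list("Nike") + ["运", "动", "鞋"]
ref = list("Nike") + ["男", "款", "运", "动", "鞋"]
score = rouge1_f(cand, ref)
```

Here precision is 7/7 and recall is 7/9, giving an F1 of 0.875.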

Brand Retention Experiment

MS‑Pointer achieved a brand‑error rate of less than 0.1 % on the test set, outperforming all baselines.
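A brand‑error rate of this kind can be measured with a simple helper (a hypothetical evaluation sketch with made‑up samples, mirroring the experiment's definition: the fraction of generated short titles that dropped the brand word):

```python
def brand_error_rate(samples):
    """samples: list of (brand, generated_title) pairs.
    Returns the fraction of titles missing their brand word."""
    errors = sum(1 for brand, title in samples if brand not in title)
    return errors / len(samples)

# Hypothetical examples: the second title lost its brand.
data = [("Nike", "Nike运动鞋"), ("Adidas", "跑步鞋男款")]
rate = brand_error_rate(data)
```

On the paper's test set, MS‑Pointer keeps this rate below 0.1%.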

Human Evaluation

300 randomly sampled short titles were judged on four dimensions; MS‑Pointer scores were comparable to manually written titles.

Online Experiment

A/B test comparing truncated two‑line titles with generated 8‑10‑character titles showed a significant CTR lift, especially for electronics where brand and model suffice.

References

Alexander M. Rush, Sumit Chopra, and Jason Weston. 2015. A Neural Attention Model for Abstractive Sentence Summarization. EMNLP.

Sumit Chopra, Michael Auli, and Alexander M. Rush. 2016. Abstractive Sentence Summarization with Attentive Recurrent Neural Networks. NAACL.

Abigail See, Peter J. Liu, and Christopher D. Manning. 2017. Get To The Point: Summarization with Pointer‑Generator Networks. ACL.

Oriol Vinyals, Meire Fortunato, and Navdeep Jaitly. 2015. Pointer Networks. NIPS.

Katja Filippova et al. 2015. Sentence Compression by Deletion with LSTMs. EMNLP.

Hongyan Jing. 2002. Using Hidden Markov Modeling to Decompose Human‑written Summaries. Computational Linguistics.

Rada Mihalcea and Paul Tarau. 2004. TextRank: Bringing Order into Texts. EMNLP.

Kishore Papineni et al. 2002. BLEU: a Method for Automatic Evaluation of Machine Translation. ACL.

Chin‑Yew Lin. 2004. ROUGE: A Package for Automatic Evaluation of Summaries.

Satanjeev Banerjee and Alon Lavie. 2005. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments. ACL Workshop.

Tags: e-commerce, text summarization, pointer network, extractive summarization, product title generation
Written by Alibaba Cloud Developer

Alibaba's official tech channel, featuring all of its technology innovations.