How Multi‑Source Pointer Networks Transform E‑Commerce Product Title Generation
This article presents a multi‑source pointer network for generating concise, brand‑preserving short product titles in e‑commerce. It covers the problem formalization, the model architecture, and offline and online experiments, which show significant improvements over truncation and seq2seq baselines.
Product titles are a crucial communication medium between sellers and buyers on e‑commerce platforms such as Taobao. Long, SEO‑optimized titles often exceed display limits on mobile devices, leading to truncated views that degrade user experience and provide noisy signals for recommendation systems.
Background
Typical C2C product titles are overly long (average ~30 characters) and contain redundant keywords for search ranking, causing two main issues: (1) titles are truncated on mobile screens, forcing users to click into detail pages for the full title; (2) many redundant words act as noise, confusing recommendation matching and user decisions.
The goal is to generate short titles that retain core product attributes, improve click‑through rates, and enhance conversion.
Problem Formalization
Short title generation is treated as a specialized text summarization task with two strict constraints: (1) no unrelated information may be introduced, so every word of the short title must come from the original title; (2) essential product information such as brand and category must be preserved.
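The two constraints can be checked mechanically on any candidate short title. The following is a minimal sketch, assuming titles are already tokenized into word lists; the function names and the "acme" example are hypothetical and not taken from the original work.

```python
from collections import Counter


def is_extractive(short_tokens, title_tokens):
    """Constraint 1: every token of the short title occurs in the original title."""
    available = Counter(title_tokens)
    needed = Counter(short_tokens)
    return all(available[tok] >= count for tok, count in needed.items())


def keeps_required(short_tokens, required_tokens):
    """Constraint 2: required tokens (e.g. brand, category) survive in the short title."""
    return set(required_tokens) <= set(short_tokens)


# Made-up example data, for illustration only.
title = ["acme", "air", "zoom", "men", "running", "shoes", "breathable", "sale"]
short = ["acme", "air", "zoom", "running", "shoes"]

print(is_extractive(short, title))               # constraint 1
print(keeps_required(short, ["acme", "shoes"]))  # constraint 2
```

A candidate that fails either check would be rejected under this formalization, however fluent it reads.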
Model
We propose a Multi‑Source Pointer Network (MS‑Pointer) built on the Pointer Network framework. It uses two encoders: a title encoder for the original long title and a knowledge encoder for background information (brand, category). At each decoding step, a gating mechanism computes a soft weight λ that balances attention over the title encoder against attention over the knowledge encoder. Because every output word is copied from one of these two inputs, and brand and category are always available through the knowledge encoder, the model can explicitly satisfy the two constraints.
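The gating step can be sketched as a weighted mixture of two attention distributions. This is an illustrative sketch under stated assumptions, not the authors' implementation: the gate logit is passed in as a plain scalar (in a trained model it would be computed from the decoder state), λ is assumed to weight the title encoder with 1 − λ going to the knowledge encoder, and all names and numbers are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mix_sources(title_tokens, title_attn, know_tokens, know_attn, gate_logit):
    """Blend two pointer distributions into one distribution over words.

    lam = sigmoid(gate_logit) weights the title encoder (assumed convention);
    1 - lam weights the knowledge encoder.
    """
    lam = sigmoid(gate_logit)
    probs = {}
    for tok, a in zip(title_tokens, title_attn):
        probs[tok] = probs.get(tok, 0.0) + lam * a
    for tok, a in zip(know_tokens, know_attn):
        probs[tok] = probs.get(tok, 0.0) + (1.0 - lam) * a
    return lam, probs


# Made-up attention weights for a single decoding step.
title_tokens = ["acme", "wireless", "mouse", "sale"]
title_attn = [0.1, 0.5, 0.3, 0.1]
know_tokens = ["acme", "mouse"]
know_attn = [0.9, 0.1]

lam, probs = mix_sources(title_tokens, title_attn, know_tokens, know_attn, gate_logit=-1.0)
best = max(probs, key=probs.get)
print(round(lam, 3), best)
```

With a negative gate logit the knowledge encoder dominates, so the brand token wins this step even though the title attention favors another word.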
Pointer Networks select words directly from the input via attention, avoiding out‑of‑vocabulary issues and ensuring extractive summaries.
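To make the copy mechanism concrete, the sketch below scores each input position against a decoder state, normalizes the scores with a softmax, and emits the input token with the highest attention weight. It assumes dot‑product scoring for brevity (Pointer Networks are usually described with an additive scoring function), and the vectors are made‑up toy values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def point(decoder_state, encoder_states, tokens):
    """Return the input token with the highest attention weight, plus the weights."""
    scores = [sum(d * h for d, h in zip(decoder_state, hs)) for hs in encoder_states]
    attn = softmax(scores)
    idx = max(range(len(tokens)), key=attn.__getitem__)
    return tokens[idx], attn


# Toy 2-dimensional states, one per input token (illustrative values only).
tokens = ["acme", "wireless", "mouse"]
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
decoder_state = [0.2, 2.0]

chosen, attn = point(decoder_state, encoder_states, tokens)
print(chosen)
```

Because the output is always an index into the input, the decoder cannot emit a word that is absent from the source, which is why out‑of‑vocabulary words never arise.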
Experiments
Data were collected from the “Youhao” recommendation column on Taobao and filtered to retain only extractive short titles of ~10 characters that include brand information. More than 500k samples were assembled and split 80/10/10 into training, validation, and test sets.
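The filtering and splitting steps might look like the sketch below. The record layout (`title`, `short`, `brand`), the exact 10‑character cutoff, and the fixed shuffle seed are assumptions made for illustration; the source only states the filtering criteria and the split ratio.

```python
import random


def keep(record, max_chars=10):
    """Keep a pair if the short title is extractive, short enough, and has the brand."""
    title, short, brand = record["title"], record["short"], record["brand"]
    extractive = all(tok in title for tok in short)
    return extractive and brand in short and len("".join(short)) <= max_chars


def split_80_10_10(records, seed=0):
    """Shuffle deterministically, then cut into train / validation / test."""
    items = list(records)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]


# Synthetic records, for illustration only.
records = [
    {"title": ["acme", "wireless", "mouse", str(i)], "short": ["acme", "mouse"], "brand": "acme"}
    for i in range(20)
]
bad = {"title": ["acme", "wireless", "mouse"], "short": ["wireless", "mouse"], "brand": "acme"}

kept = [r for r in records + [bad] if keep(r)]
train, val, test = split_80_10_10(kept)
print(len(train), len(val), len(test))
```

Seeding the shuffle keeps the split reproducible across runs, which matters when comparing several baselines on the same test set.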
Baseline methods include simple truncation, TextRank, deletion‑based seq2seq models, vanilla seq2seq, Pointer‑Generator, and models that concatenate the knowledge and title inputs. Evaluation metrics comprise BLEU, ROUGE‑F1, and METEOR, as well as human judgments of accuracy, completeness, readability, and informativeness.
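As a rough illustration of how the simplest baseline is scored, the sketch below implements truncation and a unigram‑overlap F1. The F1 here is a simplified stand‑in for ROUGE‑1 F1, not the official ROUGE implementation, and the example titles are invented.

```python
from collections import Counter


def truncate(tokens, max_tokens):
    """Truncation baseline: keep the first max_tokens tokens of the long title."""
    return tokens[:max_tokens]


def unigram_f1(candidate, reference):
    """Unigram-overlap F1 between a candidate and a reference short title."""
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


# Invented example: promotional words come first, so truncation keeps the noise.
title = ["sale", "new", "acme", "wireless", "mouse", "black"]
reference = ["acme", "wireless", "mouse"]

baseline = truncate(title, 3)
print(baseline, unigram_f1(baseline, reference))
```

Truncation scores poorly whenever sellers front‑load promotional keywords, which is the situation the Background section describes.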
Results show that MS‑Pointer outperforms all baselines on automatic metrics and achieves the lowest brand‑retention error rate (~0.1%). Online A/B tests measuring CTR demonstrate that generated short titles increase click‑through rates, especially for electronics where brand and model information dominate.
Conclusion
The MS‑Pointer effectively generates concise, brand‑preserving product titles, improving both offline evaluation scores and online user engagement, and highlights the importance of incorporating domain‑specific knowledge in neural summarization models.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
