How Smart Copy Generation Boosts 1688 B2B Sales: From Seq2Seq to Coverage Attention
This article analyzes the challenges of generating product copy for the 1688 B2B platform and proposes enhancements to attention‑based Seq2Seq and Pointer‑Generator models: TextCNN classification, convolutional inputs, coverage attention, and beam‑search constraints. Experiments show higher accuracy, greater diversity, and less repetition than the baseline.
Application Scenarios
We plan to apply intelligent copy generation on the 1688 platform to three main scenarios: (1) multi‑product smart list generation, already deployed for daily business opportunity recommendations; (2) personalized subtitle generation for individualized user experiences; (3) single‑product recommendation copy generation, which is the primary focus of this work.
Related Work
Natural Language Generation (NLG) has been widely used in dialogue systems, summarization, and image captioning. Since the introduction of Seq2Seq models, many improvements such as attention mechanisms, copy mechanisms, and coverage have been proposed. Our baseline is an attention‑based Seq2Seq model.
The Pointer‑Generator network combines an attention‑based Seq2Seq model with a pointer network, allowing the model to copy words directly from the source to handle out‑of‑vocabulary terms, which is crucial for brand names and trending keywords.
The encoder uses a single‑layer bidirectional LSTM; the decoder uses a single‑layer unidirectional LSTM. Attention distributes probability over the source tokens, guiding the decoder on which words are most relevant at each step.
The context vector is obtained by weighting encoder hidden states with the attention distribution.
The final vocabulary distribution combines the Seq2Seq output and the pointer network output.
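The three steps above (attention distribution, context vector, final distribution) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: it assumes the standard Pointer‑Generator mixture, in which a generation probability p_gen weights the vocabulary distribution and 1 - p_gen weights the attention mass copied from the source. The article does not spell out its formulas, so the function names, toy scores, and the p_gen value of 0.7 are all hypothetical.

```python
import math


def softmax(scores):
    """Turn raw attention scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vector(attention, encoder_states):
    """Weighted sum of encoder hidden states under the attention distribution."""
    dim = len(encoder_states[0])
    return [
        sum(a * state[d] for a, state in zip(attention, encoder_states))
        for d in range(dim)
    ]


def final_distribution(p_gen, vocab_dist, attention, source_ids, extended_size):
    """Mix the generator's vocabulary distribution with the copy distribution.

    source_ids maps each source position to an id in the extended vocabulary,
    so out-of-vocabulary source words receive ids >= len(vocab_dist).
    """
    final = [0.0] * extended_size
    for word_id, prob in enumerate(vocab_dist):
        final[word_id] = p_gen * prob
    for a, word_id in zip(attention, source_ids):
        final[word_id] += (1.0 - p_gen) * a
    return final


# Toy example: 3 source tokens, a fixed vocabulary of 4 words,
# and one out-of-vocabulary source word that gets the extended id 4.
attention = softmax([2.0, 1.0, 0.1])
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = context_vector(attention, encoder_states)

vocab_dist = [0.1, 0.2, 0.3, 0.4]
source_ids = [1, 4, 1]
final = final_distribution(0.7, vocab_dist, attention, source_ids, extended_size=5)
```

In this toy setup the word with id 4 has zero probability under the fixed vocabulary, yet it receives copy probability from the attention on its source position. That is the property that lets the model reproduce brand names and trending keywords.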
Model Issues and Improvements
1. Category Mismatch – High attention on words that appear in multiple categories (e.g., “tightening” in both cosmetics and clothing) leads to copy written for the wrong category. We add a TextCNN classifier after the decoder: the most probable token sequence is fed to the classifier to predict the product's leaf category, and the resulting cross‑entropy loss is applied during fine‑tuning.
2. Lack of Diversity – Identical attention on similar keywords produces the same copy for different titles. We enrich input features with n‑gram convolutional encoders (different kernel sizes) and merge their states with the original Bi‑LSTM encoder before the decoder.
3. Repetition – We adopt coverage attention and coverage loss (as in Tu et al.) to penalize repeatedly attending to the same source positions, but modify the coverage vector to use the mean of previous attentions to avoid unbounded growth. Additionally, we introduce a beam‑search constraint that limits repeated word generation.
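The coverage change described in item 3 can be illustrated with a short sketch. In the usual formulation the coverage vector is the running sum of earlier attention distributions, so its entries can grow with the number of decoding steps. The version below uses the mean instead, as the article describes, which keeps every entry in [0, 1]. The coverage loss shown, the sum over source positions of min(attention, coverage), is the common Pointer‑Generator form and is an assumption here, since the article does not state its exact loss. Function names and attention values are hypothetical.

```python
def mean_coverage(previous_attentions, source_len):
    """Coverage as the mean of all earlier attention distributions.

    Returns a zero vector at the first decoding step, when nothing has
    been attended to yet.
    """
    if not previous_attentions:
        return [0.0] * source_len
    steps = len(previous_attentions)
    return [
        sum(attn[i] for attn in previous_attentions) / steps
        for i in range(source_len)
    ]


def coverage_loss(attention, coverage):
    """Penalize attention that lands on already-covered source positions."""
    return sum(min(a, c) for a, c in zip(attention, coverage))


# Toy decoding run over a 3-token source: the second step re-attends to
# position 0 (high penalty), the third step moves on to position 2 (low penalty).
attention_steps = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
]

history = []
losses = []
for attn in attention_steps:
    cov = mean_coverage(history, len(attn))
    losses.append(coverage_loss(attn, cov))
    history.append(attn)
```

The per‑step losses come out at roughly 0.0, 0.9, and 0.3: revisiting the same source position is penalized heavily, while shifting attention to an uncovered position is not.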
Experimental Results
Accuracy – BLEU and Word Accuracy improve from the baseline Attention‑Seq2Seq (BLEU 3.09, WA 2.53) to Pointer‑Generator (BLEU 3.83, WA 3.04) and further with TextCNN (BLEU 4.52, WA 3.23) or Conv‑Inputs (BLEU 3.95, WA 3.58).
| Model | BLEU | Word Accuracy |
| --- | --- | --- |
| Attention Seq2Seq | 3.09 | 2.53 |
| Pointer-Generator | 3.83 | 3.04 |
| Pointer-Generator + TextCNN | 4.52 | 3.23 |
| Pointer-Generator + Conv_Inputs | 3.95 | 3.58 |
Examples show that category confusion is largely resolved.
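Title: “Premium domestic brand, anti‑wrinkle … eye cream” (精品国货 抗皱 … 眼霜)
Old model: “If you want long legs, eye cream is a must” (想要拥有大长腿,眼霜少不了)
New model: “This lifting, firming, anti‑wrinkle all‑purpose eye cream effectively improves aging around the eyes, sagging skin, wrinkles, and similar problems.” (这款提拉紧致抗皱通用眼霜,可有效改善眼周老化,皮肤松弛,皱纹等问题。)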
Diversity – We measure the proportion of duplicate lines in 3,000 test samples. The Line_rep_rate drops from 0.068 (Attention Seq2Seq) to 0.031 (Pointer‑Generator) and further to 0.021 (Pointer‑Generator+ Conv_Inputs).
| Model | Line_rep_rate |
| --- | --- |
| Attention Seq2Seq | 0.068 |
| Pointer-Generator | 0.031 |
| Pointer-Generator + Conv_Inputs | 0.021 |
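One plausible way to compute a duplicate‑line metric like Line_rep_rate is sketched below. The article only says it measures "the proportion of duplicate lines", so the exact definition used here (one minus the share of distinct generated lines) is an assumption, and the function name is hypothetical.

```python
def line_rep_rate(generated_lines):
    """Fraction of generated lines that repeat an earlier line.

    Assumed definition: 1 - (number of distinct lines / total lines).
    Lines are compared after stripping surrounding whitespace.
    """
    if not generated_lines:
        return 0.0
    normalized = [line.strip() for line in generated_lines]
    return 1.0 - len(set(normalized)) / len(normalized)


# Four generated lines, one of which repeats an earlier one.
sample = ["copy A", "copy B", "copy A", "copy C"]
rate = line_rep_rate(sample)
```

Under this definition, a rate of 0.021 on 3,000 samples would correspond to about 63 lines that duplicate some earlier line.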
A case study shows that, after the improvement, the model produces distinct copy for two similar titles.
Repetition – Coverage attention yields a slight improvement, while the beam‑search constraint eliminates most repetition. Convolutional inputs also help.
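A beam‑search constraint of this kind can be sketched as a mask applied to each hypothesis before it is expanded. The article does not give its exact rule, so the version below is an assumption: a token is blocked once it has already appeared max_count times in the hypothesis. The names apply_repeat_constraint, beam_search, toy_step, and max_count are hypothetical, and the toy decoder stands in for a real model.

```python
import math
from collections import Counter


def apply_repeat_constraint(prefix, log_probs, max_count=1):
    """Block tokens that already appear max_count times in the hypothesis."""
    counts = Counter(prefix)
    return [
        -math.inf if counts[token] >= max_count else lp
        for token, lp in enumerate(log_probs)
    ]


def beam_search(step_fn, beam_size, max_len, max_count=1):
    """Plain beam search with the repetition mask applied at every step."""
    beams = [([], 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            log_probs = apply_repeat_constraint(prefix, step_fn(prefix), max_count)
            for token, lp in enumerate(log_probs):
                if lp > -math.inf:
                    candidates.append((prefix + [token], score + lp))
        if not candidates:
            break
        candidates.sort(key=lambda cand: cand[1], reverse=True)
        beams = candidates[:beam_size]
    return beams


def toy_step(prefix):
    """Degenerate decoder that always prefers token 0, whatever came before."""
    return [math.log(0.6), math.log(0.3), math.log(0.1)]


# With the constraint, no token may repeat; with max_count=3 the mask never
# fires within three steps, so the decoder is effectively unconstrained.
constrained_tokens, _ = beam_search(toy_step, beam_size=4, max_len=3)[0]
unconstrained_tokens, _ = beam_search(toy_step, beam_size=4, max_len=3, max_count=3)[0]
```

Without the mask, the toy decoder emits token 0 three times in a row. With it, every token in the best hypothesis is distinct. A hard mask like this removes repetition at decoding time without retraining, which may explain why the article finds it more effective than coverage attention alone.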
Conclusion and Future Work
We addressed three major pain points of intelligent copy generation on 1688 (category mismatch, lack of diversity, and repetition) by extending the baseline Seq2Seq model with a Pointer‑Generator network, TextCNN classification, convolutional inputs, coverage attention, and beam‑search constraints, achieving higher accuracy and diversity.
Future directions include improving inference speed (currently ~1 s per item with beam size 4), exploring distributed decoding, refining B‑class style generation to better serve business‑user preferences, and extending the technology to advertising copy, holiday greetings, and other scenarios.
Acknowledgments – Thanks to the CBU New Retail Algorithm team, and especially to our mentors for their guidance.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
