Controllable Text Generation: Models, Techniques, and Real-World Applications

This comprehensive article surveys controllable text generation, covering core NLP concepts, model architectures, evaluation metrics, four main control strategies, recent research trends, and a practical e‑commerce query‑generation case study.

Alibaba Cloud Developer

Sep 26, 2021

Text Generation Technology

Text generation is a core NLP task that produces natural-language sequences from various inputs, such as structured data (data-to-text), images (image captioning), video (video summarization), and audio (speech recognition). This article focuses on text-to-text tasks, including neural machine translation, question answering, and abstractive summarization.

With the rise of deep learning, text generation has adopted attention and copy mechanisms along with RNN, CNN, GNN, and Transformer architectures. Large pre-trained language models (PLMs) trained on massive corpora are now widely used for text generation.

Text Generation Model Structures

Typical model families can be grouped as:

Encoder‑Decoder Framework

Auto‑regressive Language Model

Hierarchical Encoder‑Decoder

Knowledge‑Enriched Model

Write‑then‑Edit Framework

Figure 1 illustrates these structures.

Various text generation model structures

Evaluation Metrics for Text Generation

Evaluation combines human-centric judgments (fluency, coherence, factuality, etc.) with automatic metrics. Unsupervised automatic metrics, which need no training, include ROUGE-N, BLEU-N, and Distinct-N. Machine-learned metrics rely on pretrained models, such as BERTScore, GeDi-based toxicity classifiers, or textual entailment models.

Performance trade‑offs of evaluation metrics
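The n-gram metrics named above are simple enough to sketch directly. The following is a minimal illustration, assuming whitespace tokenization and computing only the recall form of ROUGE-N; the function names are hypothetical, and established packages apply additional preprocessing such as stemming.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) found in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_n(tokens, n):
    """Distinct-N: unique n-grams divided by total n-grams (0.0 if there are none)."""
    grams = ngrams(tokens, n)
    return len(set(grams)) / len(grams) if grams else 0.0


def rouge_n_recall(candidate, reference, n):
    """ROUGE-N recall: clipped n-gram overlap divided by the reference n-gram count."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0


generated = "the cat sat on the mat".split()
reference = "the cat lay on the mat".split()

d1 = distinct_n(generated, 1)                # 5 unique unigrams out of 6
r2 = rouge_n_recall(generated, reference, 2)  # 3 of 5 reference bigrams matched
```

Distinct-N measures diversity within the output alone, while ROUGE-N and BLEU-N measure overlap with a reference, so the two kinds of score answer different questions.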

Controllable Text Generation

The goal is to steer a model to generate text with specific attributes (style, topic, sentiment, length, etc.). Four main solution families are presented:

1. Prompt Design

Prompts recast a downstream task in the form of the pre-training objective, for example a masked-language-modeling cloze for classification or a prefix-style prompt for generation. Examples include BERT entity typing, T5 task prefixes, and GPT-2 task tokens.
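As a rough illustration of the two prompt styles, the sketch below builds a cloze-style prompt for classification and a T5-style task prefix for generation. The template wording, the verbalizer mapping, and the function names are assumptions made for this example and are not taken from any specific model.

```python
def cloze_prompt(text, mask_token="[MASK]"):
    """Recast sentiment classification as masked-token prediction."""
    return f"{text} Overall, it was {mask_token}."


def prefix_prompt(task, text):
    """Prepend a plain-text task prefix, in the style of T5."""
    return f"{task}: {text}"


# Hypothetical verbalizer: maps words the model may fill in to class labels.
VERBALIZER = {"great": "positive", "terrible": "negative"}


def label_from_fill(filled_word):
    """Translate the word predicted for the mask into a class label."""
    return VERBALIZER.get(filled_word.lower(), "unknown")


print(cloze_prompt("The battery lasts two days."))
print(prefix_prompt("summarize", "Controllable generation steers a model toward target attributes."))
```

In both cases the model itself is unchanged; only the input text is shaped so that the task looks like something the model saw during pre-training.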

2. Control Codes

Control codes are special tokens or text segments that condition the model. Examples: GSum’s sentence/keyword/entity‑triple signals, CTRLsum’s length‑bucket keywords.
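A minimal sketch of how control codes might be attached to a model input follows. The bucket tokens, bucket boundaries, and separator are hypothetical choices for illustration; they are not the formats used by GSum or CTRLsum.

```python
# Hypothetical length-bucket tokens; a real system would add them to the vocabulary.
LENGTH_BUCKETS = ["<len_short>", "<len_medium>", "<len_long>"]


def length_bucket(num_tokens, boundaries=(20, 60)):
    """Map a target length to a bucket token (boundaries are assumed values)."""
    for i, bound in enumerate(boundaries):
        if num_tokens <= bound:
            return LENGTH_BUCKETS[i]
    return LENGTH_BUCKETS[-1]


def build_input(source, keywords=(), target_len=None, sep=" => "):
    """Prefix the source text with a length bucket and/or keyword control codes."""
    parts = []
    if target_len is not None:
        parts.append(length_bucket(target_len))
    if keywords:
        parts.append(" | ".join(keywords))
    control = " ".join(parts)
    return f"{control}{sep}{source}" if control else source


example = build_input("Full product review text ...",
                      keywords=["price", "battery"], target_len=15)
```

During training the codes would typically be derived from the reference output (for instance, its actual length), so that the model learns to associate them with the target; at inference time the user sets them freely.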

3. Decoding Strategies

Modifying the decoding phase (Beam Search, temperature scaling, top‑k, nucleus sampling, length penalty) can influence output length, diversity, and attribute compliance.
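The sampling-based strategies listed above all act on the next-token distribution. The sketch below, written against a toy list of logits rather than a real model, shows temperature scaling, top-k filtering, and nucleus (top-p) filtering; the function names are illustrative.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; temperature < 1 sharpens, > 1 flattens."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def renormalize(probs, keep):
    """Zero out all indices not in `keep` and rescale the rest to sum to 1."""
    keep = set(keep)
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


def top_k_filter(probs, k):
    """Keep only the k most probable tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return renormalize(probs, order[:k])


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return renormalize(probs, keep)


def sample(probs, rng=random):
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.5, -1.0]
sharp = softmax(logits, temperature=0.5)
flat = softmax(logits, temperature=2.0)
nucleus = top_p_filter(softmax(logits), p=0.8)
token = sample(nucleus)
```

Lower temperatures and smaller k or p make the output more predictable, while higher values trade some fluency for diversity.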

4. Write‑then‑Edit

Approaches such as PPLM, GeDi, and CoCon start from what a pretrained language model would write and then steer or revise it, using attribute discriminators (PPLM, GeDi) or additional training losses (CoCon's self-reconstruction, null-content, and cycle-reconstruction losses).
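One way to picture discriminator-based steering is as a reweighting of the language model's next-token distribution by how strongly each token signals the desired attribute. The sketch below is a simplified illustration in the spirit of GeDi, not the published implementation; the probabilities are invented toy numbers and `omega` is an assumed strength parameter.

```python
def reweight(lm_probs, attr_probs, omega=1.0):
    """Multiply p_LM(token) by p(attribute | token) ** omega, then renormalize.

    Tokens missing from attr_probs are treated as neutral (0.5).
    """
    scores = {tok: p * (attr_probs.get(tok, 0.5) ** omega)
              for tok, p in lm_probs.items()}
    total = sum(scores.values())
    return {tok: s / total for tok, s in scores.items()}


# Toy numbers, invented for illustration only.
lm_probs = {"great": 0.3, "awful": 0.5, "fine": 0.2}
positive_given_token = {"great": 0.9, "awful": 0.1, "fine": 0.6}

steered = reweight(lm_probs, positive_given_token, omega=2.0)
best = max(steered, key=steered.get)
```

With `omega=0` the language model's distribution is returned unchanged; larger values push harder toward the attribute at some cost to fluency.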

Technical Summary

Controllable generation methods aim to (1) select appropriate control signals, (2) inject them effectively into the model, and (3) ensure the signals are correlated with the target output during training. Trends move toward low‑data, low‑compute solutions that preserve PLM knowledge while adding controllability.

Case Study: Controllable Query Generation for ICBU

A practical system uses a BART-based conditional language model to generate e-commerce search queries conditioned on entity-type control codes. Beam search with a length penalty keeps the generated queries short, and an XLNet-based value discriminator selects high-conversion queries. The approach outperforms extraction baselines in recall and CTR.
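The ranking stage of such a pipeline can be sketched as follows. This is an assumption-laden illustration, not the team's implementation: the additive combination of scores, the `alpha` and `weight` parameters, and the toy value function standing in for the discriminator are all hypothetical.

```python
def rank_queries(candidates, value_fn, alpha=0.0, weight=1.0):
    """Rank candidate queries, best first.

    candidates: list of (tokens, sum_logprob) pairs from beam search.
    alpha: length-penalty exponent; the summed log-probability is divided by
           len(tokens) ** alpha, so smaller alpha favors shorter queries.
    value_fn: stand-in for a learned value discriminator (higher is better).
    """
    scored = []
    for tokens, sum_logprob in candidates:
        lm_score = sum_logprob / (len(tokens) ** alpha)
        scored.append((lm_score + weight * value_fn(tokens), " ".join(tokens)))
    return [query for _, query in sorted(scored, reverse=True)]


def toy_value(tokens):
    """Toy stand-in for the discriminator: rewards queries that name the product."""
    return 0.5 if "lamp" in tokens else 0.0


# Invented beam-search outputs for illustration.
candidates = [
    (["led", "desk", "lamp"], -2.0),
    (["led", "desk", "lamp", "with", "usb", "port"], -2.5),
]

short_first = rank_queries(candidates, toy_value, alpha=0.0)
long_first = rank_queries(candidates, toy_value, alpha=1.0)
```

With `alpha=0.0` the raw summed log-probability is used and the shorter query ranks first; with `alpha=1.0` scores are averaged per token and the longer query wins, which shows how the length penalty controls brevity.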

Datasets for Controllable Generation

StylePTB – fine‑grained style transfer

SongNet – format‑controlled Chinese poetry

GPT‑2 Output – large corpus for synthetic data

Inverse Prompting – open‑domain poetry and QA

GYAFC – formality transfer corpus

Recruitment

The ICBU algorithm team works on search, recommendation, knowledge graph, video understanding, growth, risk control, and advertising. Interested candidates in NLP, CV, ML/DL, or combinatorial optimization can email [email protected].
