Mastering LLM Text Generation: Decoding Methods Explained

This review of the recent MindSpore NLP public class walks through the fundamentals of large language model text generation: deterministic decoding such as greedy and beam search, stochastic sampling techniques like temperature, top‑k, and top‑p, and advanced methods including constrained beam search, contrastive search, and assisted search, with illustrative examples.


Course Review

Following the earlier session on GLM large models, this week's public class turned to the core capability of large language models: text generation. It introduced a range of decoding methods, each demonstrated with MindSpore NLP.

1. Autoregressive Language Model Principle

Text sequence probability can be factorized as the product of conditional probabilities of each token given the preceding context.

[Figure: Probability factorization]
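In symbols, the standard autoregressive factorization is

$$
P(w_1, \dots, w_T) = \prod_{t=1}^{T} P\left(w_t \mid w_1, \dots, w_{t-1}\right)
$$

Every decoding method below differs only in how the next token w_t is chosen from this conditional distribution at each step.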

2. Deterministic Decoding (Search Methods)

Greedy Search

Select the token with the highest probability at each step.

Advantage: fastest.

Disadvantage: may miss high‑probability words hidden behind low‑probability tokens.

[Figure: Greedy search example]
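As a minimal sketch of how greedy decoding is typically invoked, shown with the Hugging Face transformers generate() API, which MindSpore NLP mirrors; the "gpt2" checkpoint and prompt are illustrative assumptions, not taken from the class demo:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative checkpoint
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The weather today is", return_tensors="pt")

# With no sampling flags set, generate() defaults to greedy decoding:
# at every step it emits the single highest-probability token.
output = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```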

Beam Search

Keeps the top‑num_beams candidates at each step and selects the highest‑probability sequence, reducing the risk of losing potential high‑probability paths.

Advantage: retains optimal paths to some extent.

Disadvantage: cannot solve repetition issues; performs poorly in open‑domain generation.

[Figure: Beam search example]
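A beam search sketch under the same assumptions as the greedy example above; the no_repeat_ngram_size setting is a common mitigation, not a cure, for the repetition issue noted above:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("The weather today is", return_tensors="pt")

output = model.generate(
    **inputs,
    max_new_tokens=30,
    num_beams=5,             # keep the 5 highest-scoring partial sequences
    no_repeat_ngram_size=2,  # forbid repeating any 2-gram (mitigates repetition)
    early_stopping=True,     # stop once all beams have finished
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```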

3. Stochastic Decoding (Sampling Methods)

Randomly choose the next token according to the current conditional probability distribution.

Temperature

Rescales the logits before the softmax: values below 1 sharpen the distribution, increasing the likelihood of high‑probability words and suppressing low‑probability ones, while larger values flatten it and yield more randomness.

[Figure: Temperature sampling example]
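A temperature sampling sketch under the same assumptions as above; top_k=0 disables the library's default top‑k cutoff so the temperature acts on the full distribution:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("The weather today is", return_tensors="pt")

output = model.generate(
    **inputs,
    max_new_tokens=30,
    do_sample=True,    # sample from the distribution instead of taking the argmax
    temperature=0.7,   # <1 sharpens the distribution; >1 flattens it
    top_k=0,           # disable the default top-k cutoff so temperature acts alone
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```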

Top‑K

Select the K most probable tokens, renormalize their probabilities, and sample from this reduced set.

Drawbacks: can produce gibberish when the distribution is sharp; limits creativity when the distribution is flat.

[Figure: Top‑K sampling example]
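A top‑k sampling sketch, same assumptions as above:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("The weather today is", return_tensors="pt")

output = model.generate(
    **inputs,
    max_new_tokens=30,
    do_sample=True,
    top_k=50,  # sample only from the 50 most probable tokens, renormalized
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```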

Top‑P (Nucleus Sampling)

Sample from the smallest set of tokens whose cumulative probability exceeds p, then renormalize.

Advantage: the sampling pool dynamically adjusts to the probability distribution.

[Figure: Top‑P sampling example]
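A nucleus sampling sketch, same assumptions as above; setting top_k=0 ensures only the nucleus cutoff applies:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("The weather today is", return_tensors="pt")

output = model.generate(
    **inputs,
    max_new_tokens=30,
    do_sample=True,
    top_p=0.92,  # smallest token set whose cumulative probability exceeds 0.92
    top_k=0,     # disable top-k so only the nucleus cutoff applies
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```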

4. Other Decoding Methods

Constrained Beam Search: injects required words or phrases into the beam candidates, guaranteeing that the output contains them.

Contrastive Search: adds a penalty term based on each candidate token's similarity to the previous context, discouraging degenerate repetition.

Assisted Search (speculative decoding): a small assistant model drafts several tokens ahead, and the large model verifies them in a single forward pass, reducing the number of expensive LLM calls. All three are sketched after the figure below.

[Figure: Advanced decoding methods illustration]
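A combined sketch of the three methods, again assuming the Hugging Face-style generate() API that MindSpore NLP mirrors; the checkpoints and the forced word are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2-large")  # illustrative checkpoints
model = AutoModelForCausalLM.from_pretrained("gpt2-large")
inputs = tokenizer("The weather today is", return_tensors="pt")

# Constrained beam search: force the word "sunny" to appear in the output.
force_words_ids = tokenizer(["sunny"], add_special_tokens=False).input_ids
constrained = model.generate(
    **inputs, num_beams=5, force_words_ids=force_words_ids, max_new_tokens=30
)

# Contrastive search: penalty_alpha penalizes tokens whose representations are
# too similar to the previous context; top_k limits the candidates scored.
contrastive = model.generate(
    **inputs, penalty_alpha=0.6, top_k=4, max_new_tokens=30
)

# Assisted decoding: a smaller model sharing the same tokenizer drafts tokens,
# and the large model accepts or rejects them, cutting expensive forward passes.
assistant = AutoModelForCausalLM.from_pretrained("gpt2")
assisted = model.generate(
    **inputs, assistant_model=assistant, max_new_tokens=30
)

for name, out in [("constrained", constrained),
                  ("contrastive", contrastive),
                  ("assisted", assisted)]:
    print(name, "->", tokenizer.decode(out[0], skip_special_tokens=True))
```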
Tags: LLM, Sampling, Beam Search, text generation, decoding methods, Greedy Search, MindSpore
Written by

Huawei Cloud Developer Alliance

The Huawei Cloud Developer Alliance creates a tech sharing platform for developers and partners, gathering Huawei Cloud product knowledge, event updates, expert talks, and more. Together we continuously innovate to build the cloud foundation of an intelligent world.
