How Temperature Shapes Output in Large Language Models

The article explains the Temperature hyper‑parameter in large language models, shows how it modifies the softmax distribution, provides a Python visualisation script, and demonstrates through experiments that higher values increase creativity while lower values make outputs more deterministic.


Definition of Temperature

Temperature is a hyper‑parameter used in large language models such as ChatGPT, GPT‑3, GPT‑3.5, GPT‑4, and LLaMA to adjust the model’s confidence in its most likely responses.

Principle Explanation

When a model predicts the next token, it first produces a raw score z_i (a logit) for each candidate token i. These scores are turned into probabilities with the softmax function. Introducing a Temperature variable θ modifies the softmax as follows:

p_i = exp(z_i / θ) / Σ_j exp(z_j / θ)

Dividing each logit by θ means a higher Temperature (θ > 1) lifts low‑probability tokens, flattening the distribution, while a lower Temperature (θ < 1) suppresses them, making the distribution sharper.
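As a minimal sketch of the scaled softmax described above (the three example logits are arbitrary, chosen only to illustrate the effect):

```python
import math

def softmax(logits, temperature=1.0):
    # Divide each logit by the temperature before exponentiating.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
print(softmax(logits, temperature=0.5))  # sharper: the top token dominates
print(softmax(logits, temperature=2.0))  # flatter: probabilities move closer together
```

Subtracting the maximum scaled logit before exponentiating leaves the result unchanged but avoids overflow when logits are large or the temperature is small.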

Experimental Results

To visualise this effect, the following Python code plots the adjusted probabilities for a set of example token scores.

import math
import matplotlib.pyplot as plt

def plot_with_temperature(name_list, value_list, temperature):
    # Apply the temperature-scaled softmax to the raw scores.
    exp_list = [math.exp(x / temperature) for x in value_list]
    sum_value = sum(exp_list)
    out_list = [x / sum_value for x in exp_list]
    # Plot the resulting probability for each token.
    plt.bar(name_list, out_list)
    plt.title(f"temperature = {temperature}")
    plt.show()

if __name__ == "__main__":
    name_list = ["cat","cheese","pizza","cookie","fondue","banana","baguette","cake"]
    value_list = [3, 70, 40, 65, 55, 10, 15, 12]
    plot_with_temperature(name_list, value_list, temperature=1)
    plot_with_temperature(name_list, value_list, temperature=10)
    plot_with_temperature(name_list, value_list, temperature=50)
    plot_with_temperature(name_list, value_list, temperature=100)
    plot_with_temperature(name_list, value_list, temperature=1000)

The script is run with temperatures 1, 10, 50, 100 and 1000. The resulting bar charts are shown below.
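For readers without matplotlib, the same flattening effect can be checked numerically: the maximum probability falls and the entropy of the distribution rises as Temperature increases. This sketch reuses the same scores as the plotting script above:

```python
import math

def softmax(values, temperature):
    # Temperature-scaled softmax over the raw scores.
    exps = [math.exp(v / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    # Shannon entropy in nats; higher means a flatter distribution.
    return -sum(p * math.log(p) for p in probs if p > 0)

value_list = [3, 70, 40, 65, 55, 10, 15, 12]
for t in [1, 10, 50, 100, 1000]:
    probs = softmax(value_list, t)
    print(t, round(max(probs), 4), round(entropy(probs), 4))
```

At temperature 1 almost all probability mass sits on the top token; at temperature 1000 the distribution is close to uniform over the eight tokens.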

Conclusion

Observing the charts, higher Temperature values produce flatter distributions, allowing the model to generate more diverse and creative text—useful for prose generation. Lower Temperature values concentrate probability on the top token, yielding more deterministic outputs—ideal for question‑answering scenarios.
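The practical consequence is in sampling: the model draws the next token from the temperature-adjusted distribution. A minimal sketch (the token names and logits are illustrative, not from any real model) shows that a low Temperature behaves almost greedily, while a high one produces varied picks:

```python
import math
import random

def sample_with_temperature(tokens, logits, temperature, rng=random):
    # Temperature-scaled softmax weights (max-subtraction avoids overflow).
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    # random.choices normalises the weights internally.
    return rng.choices(tokens, weights=weights, k=1)[0]

tokens = ["cat", "cheese", "pizza", "cookie", "fondue"]
logits = [3, 70, 40, 65, 55]
print(sample_with_temperature(tokens, logits, temperature=0.1))   # almost always "cheese"
print(sample_with_temperature(tokens, logits, temperature=100.0)) # varied across runs
```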

Written by AI Algorithm Path

A public account focused on deep learning, computer vision, and autonomous driving perception algorithms, covering visual CV, neural networks, pattern recognition, related hardware and software configurations, and open-source projects.
