
How to Run GPT‑2 Locally: Complete Setup and Code Adjustments

This guide explains the GPT‑2 background, required software, environment configuration, code modifications for TensorFlow 2.x, data download, execution commands, and sample test results, providing a full step‑by‑step process for local deployment of the model.

WeiLi Technology Team

Background

GPT‑2 is a pre‑trained language model released by OpenAI in 2019. It was trained on over 40 GB of web text, a corpus roughly ten times larger than the one used for the original GPT.

Related documents:

https://openai.com/blog/better-language-models/

http://jalammar.github.io/illustrated-gpt2/

https://github.com/openai/gpt-2

Local Startup Process

1. Download Code and Software

Code repository: https://github.com/openai/gpt-2

PyCharm (Community edition): https://www.jetbrains.com/zh-cn/pycharm/download/#section=windows

Miniconda (64‑bit): https://docs.conda.io/en/latest/miniconda.html

Protoc 3.19.0: https://github.com/protocolbuffers/protobuf/releases/download/v3.19.0/protoc-3.19.0-win64.zip

2. Version Adjustments

GPT‑2 originally targets TensorFlow 1.12.0, which is difficult to install today; upgrade to TensorFlow 2.x instead (this guide was tested with Python 3.8 and TensorFlow 2.6.0).

3. Code Adjustments

Modify the source files to be compatible with TensorFlow 2.x:

<code># Compatibility import (public alias for the TF1 API)
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Replacement for the removed tf.contrib.training.HParams
from easydict import EasyDict as edict
</code>

Either change every TensorFlow call from the tf. prefix to tf.compat.v1., or simply swap the import so the existing tf. calls resolve to the v1 API:

<code># Before
import tensorflow as tf
# After
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
</code>

Update the default hyper‑parameters in model.py:

<code># Before
def default_hparams():
    return HParams(
        n_vocab=0,
        n_ctx=1024,
        n_embd=768,
        n_head=12,
        n_layer=12,
    )
# After
def default_hparams():
    return edict(
        n_vocab=50257,
        n_ctx=1024,
        n_embd=768,
        n_head=12,
        n_layer=12,
    )
# Remove or comment out JSON loading code
with open(os.path.join(models_dir, model_name, 'hparams.json')) as f:
    hparams.override_from_dict(json.load(f))
</code>
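If you would rather keep loading hparams.json than hard‑code values for each model size, the edict can be updated from the file directly. A minimal sketch, using a small attribute‑dict stand‑in so it runs without easydict installed (easydict's EasyDict behaves the same way for this purpose); load_hparams is a hypothetical helper name:

```python
import json
import os


class EDict(dict):
    """Minimal stand-in for easydict.EasyDict: a dict with attribute access."""
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__


def default_hparams():
    # 124M defaults, as in the modified model.py.
    return EDict(n_vocab=50257, n_ctx=1024, n_embd=768, n_head=12, n_layer=12)


def load_hparams(models_dir, model_name):
    """Replaces HParams.override_from_dict: update the edict from hparams.json."""
    hparams = default_hparams()
    with open(os.path.join(models_dir, model_name, 'hparams.json')) as f:
        hparams.update(json.load(f))
    return hparams
```

With a helper like this, the JSON‑loading block in the sampling scripts becomes a single call, and no hyper‑parameters need to be hand‑edited per model size.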

4. Environment Setup

Set Python version to 3.8 (e.g., using Conda).

Install required packages:

<code>pip3 install tensorflow==2.6.0
pip3 install "fire>=0.1.3"
pip3 install regex==2022.3.15
pip3 install requests==2.21.0
pip3 install tqdm==4.31.1
pip3 install numpy==1.19.5
</code>
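A quick sanity check that the installed versions match the pins above can be done with importlib.metadata (standard library in Python 3.8+); the check_pins helper and the PINS mapping below are illustrative, mirroring the pip commands:

```python
from importlib import metadata

# Version pins, mirroring the pip install commands above.
PINS = {
    "tensorflow": "2.6.0",
    "regex": "2022.3.15",
    "requests": "2.21.0",
    "tqdm": "4.31.1",
    "numpy": "1.19.5",
}


def check_pins(pins):
    """Return {package: (installed_or_None, expected)} for any mismatched pins."""
    mismatches = {}
    for pkg, expected in pins.items():
        try:
            installed = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            installed = None  # package not installed at all
        if installed != expected:
            mismatches[pkg] = (installed, expected)
    return mismatches
```

Run check_pins(PINS) after installation; an empty dict means the environment matches the tested versions.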

5. Download Model Data

<code>python download_model.py 124M
python download_model.py 355M
python download_model.py 774M
python download_model.py 1558M
</code>
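Each command fetches the checkpoint and tokenizer files for one model size. As a rough reference, download_model.py builds URLs of the following shape; the endpoint and file list here are assumptions based on the repository, so verify them against the copy of the script you downloaded:

```python
# Files fetched per model size (assumed from the repository's download_model.py;
# check your copy of the script).
FILES = [
    "checkpoint", "encoder.json", "hparams.json",
    "model.ckpt.data-00000-of-00001", "model.ckpt.index",
    "model.ckpt.meta", "vocab.bpe",
]

BASE = "https://openaipublic.blob.core.windows.net/gpt-2/models"  # assumed endpoint


def model_urls(model_name):
    """Return the download URLs for one model size, e.g. '124M'."""
    return [f"{BASE}/{model_name}/{fname}" for fname in FILES]
```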

Adjust generate_unconditional_samples.py or interactive_conditional_samples.py to point to the desired model size, and modify the model.py hyper‑parameters to match the downloaded hparams.json.
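For reference, the per‑size hyper‑parameters below are reproduced from the models' published hparams.json files; double‑check them against your own download before editing model.py:

```python
# Hyper-parameters per GPT-2 model size, as published in each hparams.json.
GPT2_HPARAMS = {
    "124M":  dict(n_vocab=50257, n_ctx=1024, n_embd=768,  n_head=12, n_layer=12),
    "355M":  dict(n_vocab=50257, n_ctx=1024, n_embd=1024, n_head=16, n_layer=24),
    "774M":  dict(n_vocab=50257, n_ctx=1024, n_embd=1280, n_head=20, n_layer=36),
    "1558M": dict(n_vocab=50257, n_ctx=1024, n_embd=1600, n_head=25, n_layer=48),
}
```

Only n_embd, n_head, and n_layer vary between sizes; the vocabulary and context length are the same for all four models.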

6. Running the Model

<code>python src/generate_unconditional_samples.py | tee /tmp/samples
# or
python src/interactive_conditional_samples.py --top_k 40
</code>

Generated samples stream to the terminal (and, with tee, to /tmp/samples).

Local Tests

Example prompts used for testing:

Test 1 – "who are you?"

Test 2 – "1+1="

Test 3 – "请用中文写个小故事"

The local deployment does not always produce satisfactory results; even the 1558M model is limited by the quality and quantity of its training data.

Conclusion

GPT‑2 is a powerful NLP model built on large‑scale data and deep learning, capable of generating fluent and diverse dialogues. However, it requires substantial computational resources and massive datasets to achieve high‑quality generation.

Tags: Python · AI · Deep Learning · TensorFlow · Local Deployment · GPT-2