How to Run GPT‑2 Locally: Complete Setup and Code Adjustments
This guide explains the GPT‑2 background, required software, environment configuration, code modifications for TensorFlow 2.x, data download, execution commands, and sample test results, providing a full step‑by‑step process for local deployment of the model.
Background
GPT‑2 is OpenAI's pre‑trained language model released in 2019. Its WebText training corpus expands the original GPT's to over 40 GB of web text, roughly ten times larger than its predecessor's.
Related documents:
https://openai.com/blog/better-language-models/
http://jalammar.github.io/illustrated-gpt2/
https://github.com/openai/gpt-2
Local Startup Process
1. Download Code and Software
Code repository: https://github.com/openai/gpt-2
PyCharm (Community edition): https://www.jetbrains.com/zh-cn/pycharm/download/#section=windows
Miniconda (64‑bit): https://docs.conda.io/en/latest/miniconda.html
Protoc 3.19.0: https://github.com/protocolbuffers/protobuf/releases/download/v3.19.0/protoc-3.19.0-win64.zip
2. Version Adjustments
GPT‑2 originally uses TensorFlow 1.12.0, which is no longer available; upgrade to TensorFlow 2.x (tested with Python 3.8 and TensorFlow 2.6.0).
3. Code Adjustments
Modify the source files to be compatible with TensorFlow 2.x:
<code># Compatibility import: run TF1-style code on TensorFlow 2.x
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
# Replace the removed tf.contrib.training.HParams
from easydict import EasyDict as edict
</code>
Change the TensorFlow import so that existing tf. calls resolve through tf.compat.v1:
<code># Before
import tensorflow as tf
# After
import tensorflow.compat.v1 as tf  # public alias of the _api.v2.compat.v1 path
tf.disable_v2_behavior()
</code>
Update the default hyper‑parameters in model.py:
<code># Before
def default_hparams():
    return HParams(
        n_vocab=0,
        n_ctx=1024,
        n_embd=768,
        n_head=12,
        n_layer=12,
    )

# After (values taken from the downloaded model's hparams.json)
def default_hparams():
    return edict(
        n_vocab=50257,
        n_ctx=1024,
        n_embd=768,
        n_head=12,
        n_layer=12,
    )

# Remove or comment out the JSON loading code in the sample scripts,
# since edict has no override_from_dict method:
# with open(os.path.join(models_dir, model_name, 'hparams.json')) as f:
#     hparams.override_from_dict(json.load(f))
</code>
4. Environment Setup
Set Python version to 3.8 (e.g., using Conda).
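A quick interpreter check before installing anything (pure standard library; the 3.8 pin matches the TensorFlow 2.6.0 build used in this guide):

```python
import sys

# This guide was tested with Python 3.8; warn if the interpreter differs.
major, minor = sys.version_info[:2]
print(f"Running Python {major}.{minor}")
if (major, minor) != (3, 8):
    print("Warning: this guide was tested against Python 3.8")
```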
Install required packages:
<code>pip3 install tensorflow==2.6.0
pip3 install "fire>=0.1.3"  # quoted so the shell does not treat >= as a redirection
pip3 install regex==2022.3.15
pip3 install requests==2.21.0
pip3 install tqdm==4.31.1
pip3 install numpy==1.19.5
</code>
5. Download Model Data
<code>python download_model.py 124M
python download_model.py 355M
python download_model.py 774M
python download_model.py 1558M
</code>
Adjust generate_unconditional_samples.py or interactive_conditional_samples.py to point to the desired model size (both scripts expose a model_name argument through fire), and update the model.py hyper‑parameters to match the downloaded model's hparams.json.
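With HParams gone, the override_from_dict call also disappears; a plain dict update achieves the same effect. A minimal sketch, using a hypothetical EDict stand-in class so the snippet runs even without easydict installed (easydict.EasyDict behaves the same way here):

```python
import json

class EDict(dict):
    """Minimal stand-in for easydict.EasyDict: a dict with attribute access."""
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__

def default_hparams():
    return EDict(n_vocab=50257, n_ctx=1024, n_embd=768, n_head=12, n_layer=12)

# Values as they appear in the 124M model's hparams.json:
hparams_json = '{"n_vocab": 50257, "n_ctx": 1024, "n_embd": 768, "n_head": 12, "n_layer": 12}'

hparams = default_hparams()
hparams.update(json.loads(hparams_json))  # replaces HParams.override_from_dict
print(hparams.n_embd)  # 768
```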
6. Running the Model
<code>python src/generate_unconditional_samples.py | tee /tmp/samples
# or
python src/interactive_conditional_samples.py --top_k 40
</code>
Sample outputs appear in the Local Tests section below.
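The --top_k 40 flag restricts sampling to the 40 highest‑probability tokens at each step. A minimal sketch of the underlying idea in plain Python (the real implementation in the repo's sample.py operates on TensorFlow tensors):

```python
import math

def top_k_filter(logits, k):
    """Keep the k largest logits; push the rest to -inf so that
    softmax sampling can never select the corresponding tokens."""
    kth = sorted(logits, reverse=True)[k - 1]
    return [x if x >= kth else -math.inf for x in logits]

filtered = top_k_filter([2.0, 1.0, 3.0, 0.5], k=2)
print(filtered)  # [2.0, -inf, 3.0, -inf]
```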
Local Tests
Examples of generated responses:
Test 1 – "who are you?"
Test 2 – "1+1="
Test 3 – "请用中文写个小故事" ("write a short story in Chinese")
Local deployment does not always produce satisfactory results: even the largest 1558M model is limited by the quality and quantity of its training data.
Conclusion
GPT‑2 is a powerful NLP model built on large‑scale data and deep learning, capable of generating fluent and diverse dialogues. However, it requires substantial computational resources and massive datasets to achieve high‑quality generation.
WeiLi Technology Team
Practicing data-driven principles and believing technology can change the world.