Fine‑Tune a Language Model for Band Trivia with Hugging Face PEFT
This tutorial walks through installing Python dependencies, preparing a JSON‑based QA dataset, and using Hugging Face's PEFT library to fine‑tune a small FLAN‑T5 model so it can answer questions about AC/DC and other bands without passing knowledge at inference time.
Prerequisites
You need Python 3 (≥3.9) and a virtual environment. Install the required packages:
mkdir my-project
cd my-project
python3 -m venv venv
source venv/bin/activate
pip install transformers
pip install torch
pip install datasets
pip install 'transformers[torch]'
pip install peft
An optional requirements.txt file with the exact versions is provided in the appendix.
Starting point
Create run-questions.py and paste the following code to query a model with a knowledge prompt:
from transformers import pipeline
qa = pipeline("text2text-generation", model="google/flan-t5-small")
question = "When was ACDC formed?"
knowledge = """
ACDC is the name of a band that was formed in Sydney in 1973.
The members of the band include Malcolm as the rhythm guitarist and Angus as the lead guitarist.
"""
result = qa("Context: " + knowledge + " Question: " + question)
print(result)
Running this script prints the correct answer "1973" because the knowledge is supplied as context.
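For reference, the text2text-generation pipeline returns a list of generation dicts; with this prompt the output looks roughly like the following (the exact wording can vary between model and transformers versions):
[{'generated_text': '1973'}]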
Fine‑tuning
Instead of passing knowledge at inference, we embed it into the model by fine‑tuning with PEFT.
Training data
Prepare a JSONL file acdc_qa.json where each line is a {"question": ..., "answer": ...} object, e.g.:
{"question": "When was ACDC formed?", "answer": "1973"}
{"question": "What year was ACDC formed?", "answer": "1973"}
{"question": "What is the name of the band that was formed in Sydney in 1973?", "answer": "ACDC"}
{"question": "Where was ACDC formed?", "answer": "Sydney"}
{"question": "Who are the members of ACDC?", "answer": "Malcolm Young and Angus Young"}
{"question": "What role does Malcolm play in ACDC?", "answer": "rhythm guitarist"}
{"question": "What role does Angus play in ACDC?", "answer": "lead guitarist"}Generate additional questions (e.g., with GitHub Copilot, ChatGPT, or Claude) for other bands such as Cold Chisel and Foo Fighters to reach about 135 QA pairs.
Training script
Create run-trainer.py with the following code, which loads the QA file, tokenizes questions and answers, wraps the model in a LoRA adapter, and runs training:
from datasets import load_dataset
from transformers import AutoTokenizer, Trainer, AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments
from peft import LoraConfig, TaskType, get_peft_model
model_name = "google/flan-t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
dataset = load_dataset("json", data_files="acdc_qa.json")
def preprocess(example):
    # Tokenize the question (input) and answer (target); pad and truncate both to 128 tokens.
    inputs = tokenizer(example["question"], max_length=128, truncation=True, padding="max_length")
    targets = tokenizer(example["answer"], max_length=128, truncation=True, padding="max_length")
    inputs["labels"] = targets["input_ids"]
    return inputs
tokenized_dataset = dataset.map(preprocess)
tokenized_dataset.set_format("torch", columns=["input_ids", "attention_mask", "labels"])
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
peft_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
)
model = get_peft_model(model, peft_config)
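# Optional: confirm that only the LoRA adapter weights are trainable;
# PEFT's print_trainable_parameters() reports trainable vs. total parameter counts.
model.print_trainable_parameters()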
training_args = Seq2SeqTrainingArguments(
    output_dir="./acdc-finetuned-model",
    per_device_train_batch_size=8,
    num_train_epochs=100,
    logging_steps=1,
    push_to_hub=False,
    learning_rate=1e-3,
    eval_strategy="epoch",
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset['train'],
    eval_dataset=tokenized_dataset['train'].select(range(20)),
)
trainer.train()
model.save_pretrained("./acdc-finetuned-model")
tokenizer.save_pretrained("./acdc-finetuned-model")Adjust num_train_epochs as needed; at least 100 epochs were required for the model to start answering AC/DC questions reliably.
Using the fine‑tuned model
Replace the pipeline in run-questions.py with the locally saved model:
from transformers import pipeline
qa_pipeline = pipeline("text2text-generation", model="./acdc-finetuned-model", tokenizer="./acdc-finetuned-model")
questions = [
    "When was ACDC formed?",
    "Where was ACDC formed?",
    "List the members of Cold Chisel.",
    "List the members of ACDC.",
]
for question in questions:
    answer = qa_pipeline(question)
    print(f"{question} Answer: {answer[0]['generated_text']}")
The fine‑tuned model now answers band‑related queries without needing an explicit knowledge context.
Summary
This guide demonstrates how to fine‑tune a small language model on a specific domain (band trivia) using Hugging Face's PEFT library, thereby reducing the token count needed at inference and avoiding repeated context prompts. While the example uses a tiny knowledge base, the same workflow can be applied to much larger document collections, making it useful for building cost‑effective, domain‑specific QA systems.
Appendix – requirements.txt
accelerate==1.9.0
aiohappyeyeballs==2.6.1
aiohttp==3.12.15
aiosignal==1.4.0
attrs==25.3.0
certifi==2025.7.14
charset-normalizer==3.4.2
datasets==4.0.0
dill==0.3.8
filelock==3.18.0
frozenlist==1.7.0
fsspec==2025.3.0
hf-xet==1.1.5
huggingface-hub==0.34.3
idna==3.10
Jinja2==3.1.6
MarkupSafe==3.0.2
mpmath==1.3.0
multidict==6.6.3
multiprocess==0.70.16
networkx==3.5
numpy==2.3.2
packaging==25.0
pandas==2.3.1
peft==0.16.0
propcache==0.3.2
psutil==7.0.0
pyarrow==21.0.0
python-dateutil==2.9.0.post0
pytz==2025.2
PyYAML==6.0.2
regex==2025.7.34
requests==2.32.4
safetensors==0.5.3
setuptools==80.9.0
six==1.17.0
sympy==1.14.0
tokenizers==0.21.4
torch==2.7.1
tqdm==4.67.1
transformers==4.55.0
typing_extensions==4.14.1
tzdata==2025.2
urllib3==2.5.0
xxhash==3.5.0
yarl==1.20.1