Fine‑Tune Your Own Large Model in 5 Minutes Without Writing Code (Using LLaMA‑Factory on Qwen)
This guide walks you through fine‑tuning a large language model without any coding by using LLaMA‑Factory, covering LoRA fundamentals, environment setup, dataset creation, parameter configuration, training, loss monitoring, model export, and a quick evaluation on the Qwen2.5‑0.5B model.
Large language models such as DeepSeek and Qwen have strong general capabilities but often underperform on specialized tasks. Fine‑tuning with domain‑specific data adapts a generalist model into a domain expert.
Fine‑tuning principle
Full‑parameter fine‑tuning updates every weight and requires large compute. Efficient methods such as LoRA add a small set of trainable matrices (rank‑reduced up‑ and down‑projection) while keeping the original weights frozen, reducing memory and compute.
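The mechanism can be sketched numerically. The dimensions and the use of numpy below are illustrative only (not LLaMA‑Factory internals): a frozen base weight W is augmented with a trainable rank‑r pair of matrices, and only that pair is updated.

```python
import numpy as np

d = 768            # size of one frozen base weight matrix (illustrative)
r, alpha = 8, 16   # LoRA rank and scaling factor, matching the config used later

W = np.random.randn(d, d)          # frozen pretrained weight
A = np.random.randn(r, d) * 0.01   # trainable down-projection (r x d)
B = np.zeros((d, r))               # trainable up-projection, zero-initialized

# The forward pass uses W plus the scaled low-rank update; W itself never changes.
W_eff = W + (alpha / r) * (B @ A)

full_params = W.size           # what full fine-tuning would train
lora_params = A.size + B.size  # what LoRA actually trains
print(f"trainable fraction: {lora_params / full_params:.2%}")
```

Because B starts at zero, the effective weight equals W before training, so the model's behavior only drifts as A and B are updated; for this matrix, LoRA trains roughly 2% of the parameters that full fine‑tuning would.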
LLaMA‑Factory overview
LLaMA‑Factory (https://github.com/hiyouga/LLaMA-Factory) is an open‑source low‑code framework that supports over 100 models, provides a web UI, multiple dataset formats, and fine‑tuning algorithms including LoRA and DPO.
Environment preparation
Install the CUDA driver, a GPU‑enabled PyTorch build, and other dependencies. Anaconda is recommended to avoid package conflicts.
Verify GPU availability:
nvidia-smi
nvcc -V
If the commands do not show the expected output, download the appropriate CUDA installer (e.g., https://developer.nvidia.com/cuda-12-2-0-download-archive/?target_os=Windows&target_arch=x86_64&target_version=11&target_type=exe_local) and install it.
Create a Python 3.11 virtual environment named llama_factory:
conda create -n llama_factory python=3.11
conda activate llama_factory
Clone and install LLaMA‑Factory:
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"
Confirm the installation:
llamafactory-cli version
Check that PyTorch detects the GPU:
python -c "import torch; print(torch.__version__); print(torch.version.cuda); print(torch.cuda.is_available())"
If torch.cuda.is_available() returns False, reinstall PyTorch with the matching CUDA wheel:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
Install bitsandbytes for Windows:
pip install bitsandbytes==0.43.1 --extra-index-url https://download.pytorch.org/whl/cu124
Model download
The demo uses the Qwen2.5‑0.5B‑Instruct model. Install the modelscope tool and download the model:
pip install modelscope
modelscope download --model Qwen/Qwen2.5-0.5B-Instruct
The model files are stored in the current directory; note the absolute path (e.g., E:\LLamaFactory\Qwen2.5-0.5B-Instruct).
Dataset construction
Place a JSON file in llama-factory/data. Each entry must contain three keys:
instruction: the user's question
input: optional additional input
output: the expected model answer
Example file alpaca_zh_demo.json is a list of such objects. After creating the file, register the dataset by adding its name to dataset_info.json under the data folder so the UI can list it.
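Both steps can be sketched in a few lines. The file name my_dataset.json and dataset name "my_dataset" below are placeholders, and the registration shape follows the alpaca examples shipped with LLaMA‑Factory:

```python
import json

# A minimal alpaca-format dataset: each record carries instruction/input/output.
records = [
    {
        "instruction": "What is LoRA fine-tuning?",
        "input": "",
        "output": "LoRA freezes the base weights and trains small low-rank matrices.",
    },
]
with open("my_dataset.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)

# This entry would be added to data/dataset_info.json, alongside the existing
# entries, so the web UI can list the dataset by name.
registration = {"my_dataset": {"file_name": "my_dataset.json"}}
print(json.dumps(registration))
```

Once the file is in place and the entry is added to dataset_info.json, "my_dataset" appears in the web UI's dataset dropdown.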
Parameter configuration (Web UI)
Start the UI:
llamafactory-cli webui
The default address is http://0.0.0.0:7860. Set the following fields:
Language: zh
Model name: Qwen2.5-0.5B-Instruct
Model path: absolute path to the downloaded model
Fine‑tuning method: lora
LoRA rank: 8
LoRA scaling factor: 16
Learning rate: 5e-5
Epochs: 2
Max gradient norm: 1.0
Max samples: 10000
Batch size: 2
Gradient accumulation steps: 8 (effective batch size 16)
Training type: Supervised Fine‑Tuning
Dataset: alpaca_zh_demo
Quantization: none
Accelerator: auto
Training
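The batch settings determine how many optimizer steps the run will take. A back‑of‑envelope estimate, assuming all 10,000 samples are used and incomplete batches are dropped:

```python
# Rough training-length arithmetic for the configuration above.
max_samples = 10_000
batch_size = 2
grad_accum = 8
epochs = 2

effective_batch = batch_size * grad_accum         # samples per optimizer step
steps_per_epoch = max_samples // effective_batch  # 625
total_steps = steps_per_epoch * epochs            # 1250
print(effective_batch, steps_per_epoch, total_steps)
```

Gradient accumulation trades wall-clock time for memory: each optimizer step sees 16 samples, but only 2 reside on the GPU at once.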
Press “Start” in the UI. The interface displays loss curves and logs the current step and loss value; the same information is printed to the console.
Model export and evaluation
After training, export the checkpoint to a new directory (e.g., E:\LLamaFactory\Qwen2.5-0.5B-SFT) via the “Export” tab.
Switch the UI to the “Chat” mode, load the exported model by setting the model directory to the export path, and submit a test query such as:
Identify and explain the two scientific theories in the list: cell theory and heliocentrism.
The fine‑tuned model returns a domain‑specific answer that differs markedly from the original model's, confirming successful adaptation. Rigorous evaluation would require a larger test set, semantic similarity metrics, or expert review.
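For a quick automated check without an embedding model, one crude stand‑in for semantic similarity is token overlap between a reference answer and the model's output. This is illustrative only, and the reference/prediction strings below are made up:

```python
# Jaccard overlap of whitespace tokens: a rough, order-insensitive score in [0, 1].
def jaccard(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

reference = "Cell theory states that all living things are made of cells."
prediction = "Cell theory says all living organisms are made of cells."
score = jaccard(reference, prediction)
print(f"{score:.2f}")
```

A score near 1 means near-verbatim agreement; for real evaluation, embedding cosine similarity or expert review remains preferable, since token overlap misses paraphrases entirely.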
Summary
The tutorial demonstrates that LLaMA‑Factory enables code‑free fine‑tuning of large models using LoRA, but further work is needed on dataset scaling, hyper‑parameter justification, and deployment.
Fun with Large Models
Master's graduate from Beijing Institute of Technology, published four top‑journal papers, previously worked as a developer at ByteDance and Alibaba. Currently researching large models at a major state‑owned enterprise. Committed to sharing concise, practical AI large‑model development experience, believing that AI large models will become as essential as PCs in the future. Let's start experimenting now!