Fine‑Tune Your Own Large Model in 5 Minutes Without Writing Code (Using LLaMA‑Factory on Qwen)
This guide walks you through fine‑tuning a large language model without any coding by using LLaMA‑Factory, covering LoRA fundamentals, environment setup, dataset creation, parameter configuration, training, loss monitoring, model export, and a quick evaluation on the Qwen2.5‑0.5B model.
Large language models such as DeepSeek and Qwen have strong general capabilities but often underperform on specialized tasks. Fine‑tuning with domain‑specific data adapts a generalist model into a domain expert.
Fine‑tuning principle
Full‑parameter fine‑tuning updates every weight and requires large compute. Efficient methods such as LoRA add a small set of trainable matrices (rank‑reduced up‑ and down‑projection) while keeping the original weights frozen, reducing memory and compute.
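The mechanism can be sketched numerically. The dimensions and the use of numpy below are illustrative only (not LLaMA‑Factory internals): a frozen base weight W is augmented with a trainable rank‑r pair of matrices, and only that pair is updated.

```python
import numpy as np

d = 768            # size of one frozen base weight matrix (illustrative)
r, alpha = 8, 16   # LoRA rank and scaling factor, matching the config used later

W = np.random.randn(d, d)          # frozen pretrained weight
A = np.random.randn(r, d) * 0.01   # trainable down-projection (r x d)
B = np.zeros((d, r))               # trainable up-projection, zero-initialized

# The forward pass uses W plus the scaled low-rank update; W itself never changes.
W_eff = W + (alpha / r) * (B @ A)

full_params = W.size           # what full fine-tuning would train
lora_params = A.size + B.size  # what LoRA actually trains
print(f"trainable fraction: {lora_params / full_params:.2%}")
```

Because B starts at zero, the effective weight equals W before training, so the model's behavior only drifts as A and B are updated; for this matrix, LoRA trains roughly 2% of the parameters that full fine‑tuning would.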
LLaMA‑Factory overview
LLaMA‑Factory (https://github.com/hiyouga/LLaMA-Factory) is an open‑source low‑code framework that supports over 100 models, provides a web UI, multiple dataset formats, and fine‑tuning algorithms including LoRA and DPO.
Environment preparation
Install the CUDA driver, a GPU‑enabled PyTorch build, and other dependencies. Anaconda is recommended to avoid package conflicts.
Verify GPU availability:
nvidia-smi
nvcc -V
If the commands do not show the expected output, download the appropriate CUDA installer (e.g., https://developer.nvidia.com/cuda-12-2-0-download-archive/?target_os=Windows&target_arch=x86_64&target_version=11&target_type=exe_local) and install it.
Create a Python 3.11 virtual environment named llama_factory:
conda create -n llama_factory python=3.11
conda activate llama_factory
Clone and install LLaMA‑Factory:
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"
Confirm the installation:
llamafactory-cli version
Check that PyTorch detects the GPU:
python -c "import torch; print(torch.__version__); print(torch.version.cuda); print(torch.cuda.is_available())"
If torch.cuda.is_available() returns False, reinstall PyTorch with the matching CUDA wheel:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
Install bitsandbytes for Windows:
pip install bitsandbytes==0.43.1 --extra-index-url https://download.pytorch.org/whl/cu124
Model download
The demo uses the Qwen2.5‑0.5B‑Instruct model. Install the modelscope tool and download the model:
pip install modelscope
modelscope download --model Qwen/Qwen2.5-0.5B-Instruct
The model files are stored in the current directory; note the absolute path (e.g., E:\LLamaFactory\Qwen2.5-0.5B-Instruct).
Dataset construction
Place a JSON file in llama-factory/data. Each entry must contain three keys:
instruction: the user's question
input: optional additional input
output: the expected model answer
Example file alpaca_zh_demo.json is a list of such objects. After creating the file, register the dataset by adding its name to dataset_info.json under the data folder so the UI can list it.
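Both steps can be sketched in a few lines. The file name my_dataset.json and dataset name "my_dataset" below are placeholders, and the registration shape follows the alpaca examples shipped with LLaMA‑Factory:

```python
import json

# A minimal alpaca-format dataset: each record carries instruction/input/output.
records = [
    {
        "instruction": "What is LoRA fine-tuning?",
        "input": "",
        "output": "LoRA freezes the base weights and trains small low-rank matrices.",
    },
]
with open("my_dataset.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)

# This entry would be added to data/dataset_info.json, alongside the existing
# entries, so the web UI can list the dataset by name.
registration = {"my_dataset": {"file_name": "my_dataset.json"}}
print(json.dumps(registration))
```

Once the file is in place and the entry is added to dataset_info.json, "my_dataset" appears in the web UI's dataset dropdown.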
Parameter configuration (Web UI)
Start the UI:
llamafactory-cli webui
The default address is http://0.0.0.0:7860. Set the following fields:
Language: zh
Model name: Qwen2.5-0.5B-Instruct
Model path: absolute path to the downloaded model
Fine‑tuning method: lora
LoRA rank: 8
LoRA scaling factor: 16
Learning rate: 5e-5
Epochs: 2
Max gradient norm: 1.0
Max samples: 10000
Batch size: 2
Gradient accumulation steps: 8 (effective batch size 16)
Training type: Supervised Fine‑Tuning
Dataset: alpaca_zh_demo
Quantization: none
Accelerator: auto
Training
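The batch settings determine how many optimizer steps the run will take. A back‑of‑envelope estimate, assuming all 10,000 samples are used and incomplete batches are dropped:

```python
# Rough training-length arithmetic for the configuration above.
max_samples = 10_000
batch_size = 2
grad_accum = 8
epochs = 2

effective_batch = batch_size * grad_accum         # samples per optimizer step
steps_per_epoch = max_samples // effective_batch  # 625
total_steps = steps_per_epoch * epochs            # 1250
print(effective_batch, steps_per_epoch, total_steps)
```

Gradient accumulation trades wall-clock time for memory: each optimizer step sees 16 samples, but only 2 reside on the GPU at once.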
Press “Start” in the UI. The interface displays loss curves and logs the current step and loss value; the same information is printed to the console.
Model export and evaluation
After training, export the checkpoint to a new directory (e.g., E:\LLamaFactory\Qwen2.5-0.5B-SFT) via the “Export” tab.
Switch the UI to the “Chat” mode, load the exported model by setting the model directory to the export path, and submit a test query such as:
Identify and explain the two scientific theories in the list: cell theory and heliocentrism.
The fine‑tuned model returns a domain‑specific answer that differs markedly from the original model's, confirming successful adaptation. Rigorous evaluation would require a larger test set, semantic similarity metrics, or expert review.
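For a quick automated check without an embedding model, one crude stand‑in for semantic similarity is token overlap between a reference answer and the model's output. This is illustrative only, and the reference/prediction strings below are made up:

```python
# Jaccard overlap of whitespace tokens: a rough, order-insensitive score in [0, 1].
def jaccard(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

reference = "Cell theory states that all living things are made of cells."
prediction = "Cell theory says all living organisms are made of cells."
score = jaccard(reference, prediction)
print(f"{score:.2f}")
```

A score near 1 means near-verbatim agreement; for real evaluation, embedding cosine similarity or expert review remains preferable, since token overlap misses paraphrases entirely.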
Summary
The tutorial demonstrates that LLaMA‑Factory enables code‑free fine‑tuning of large models using LoRA, but further work is needed on dataset scaling, hyper‑parameter justification, and deployment.
Fun with Large Models
Master's graduate from Beijing Institute of Technology, published four top‑journal papers, previously worked as a developer at ByteDance and Alibaba. Currently researching large models at a major state‑owned enterprise. Committed to sharing concise, practical AI large‑model development experience, believing that AI large models will become as essential as PCs in the future. Let's start experimenting now!