SWIFT: A Scalable Light‑Weight Training and Inference Framework for Efficient Model Fine‑Tuning
SWIFT is an open‑source, PyTorch‑based framework that integrates multiple efficient fine‑tuning methods such as LoRA, QLoRA, Adapter, and the proprietary ResTuning, enabling developers to fine‑tune large language and multimodal models on consumer‑grade GPUs with significantly reduced memory and compute requirements.
With the rapid development of big data and powerful distributed parallel computing, the pre‑training + fine‑tuning paradigm has become mainstream in deep learning, leading to a surge of large models (GPT‑4, Llama, ChatGLM, Baichuan, RWKV, Stable‑Diffusion, etc.) that demand tens to hundreds of gigabytes of GPU memory, hardware that most labs and individual developers cannot afford.
To address this, the industry has introduced efficient fine‑tuning techniques such as Adapter‑Tuning, Prompt‑Tuning, LoRA, and QLoRA. Building on these, the ModelScope community released SWIFT, a complete lightweight training‑inference toolkit that lets AI enthusiasts use their consumer‑grade GPUs to experiment with large models and AIGC.
SWIFT (Scalable Light‑weight Infrastructure for Fine‑tuning) is a PyTorch‑based, out‑of‑the‑box framework that bundles open‑source tuners (LoRA, QLoRA, Adapter, etc.) and ModelScope’s own ResTuning. It supports all Transformer architectures as well as other deep‑learning models, allowing developers to create fine‑tunable models with a single line of code while achieving parameter‑, memory‑, and time‑efficient training.
SWIFT integrates seamlessly with the ModelScope ecosystem, covering dataset loading, model download, training, inference, and model upload. It is fully compatible with PEFT, so users familiar with PEFT can directly apply SWIFT’s capabilities to ModelScope models.
The proprietary ResTuning tuner, validated on CV and multimodal tasks, can save 30%‑60% of GPU memory compared with other methods while delivering comparable performance.
SWIFT provides ready‑to‑use fine‑tuning and inference scripts that run on consumer GPUs. Installation is straightforward:
pip install ms-swift -U
or via source:
git clone https://github.com/modelscope/swift.git
cd swift
pip install -r requirements/framework.txt
pip install -e .
Docker images are also available:
docker pull registry.cn-hangzhou.aliyuncs.com/modelscope-repo/modelscope:ubuntu20.04-cuda11.8.0-py38-torch2.0.1-tf2.13.0-1.9.1
Typical usage involves preparing a model, configuring a tuner, and calling Swift.prepare_model:
from swift import Swift, LoRAConfig
config = LoRAConfig(...)
model = Swift.prepare_model(model, config, extra_state_keys=['...'])
# proceed with training
Multiple tuners can be combined in a single model, and different tuners can be activated per thread during inference, enabling efficient multi‑user scenarios without significant memory overhead.
SWIFT supports a wide range of models (Qwen, Baichuan, ChatGLM, Llama, InternLM, Stable Diffusion, etc.), datasets (Alpaca, Code‑Alpaca, medical, legal, math, SQL, classification, multimodal, etc.), and fine‑tuning methods (LoRA, QLoRA, ResTuning, Side‑Tuning, Prompt‑Tuning, Adapter, LongLoRA, NEFTune, ROME, full‑parameter). It also leverages techniques such as model quantization, model parallelism, gradient checkpointing, gradient accumulation, FlashAttention, XFormers, and DDP to further reduce memory usage and accelerate training.
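Of the techniques listed above, gradient accumulation is easy to show in plain PyTorch. The sketch below uses an illustrative toy model (not SWIFT's internal implementation): the batch is split into micro‑batches, each micro‑batch loss is scaled and back‑propagated, and the optimizer steps once at the end, so only a fraction of the batch occupies memory at any time:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(8, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

data = torch.randn(16, 8)
target = torch.randn(16, 1)

accum_steps = 4
micro = 16 // accum_steps  # micro-batch size of 4

optimizer.zero_grad()
for i in range(accum_steps):
    xb = data[i * micro:(i + 1) * micro]
    yb = target[i * micro:(i + 1) * micro]
    loss = nn.functional.mse_loss(model(xb), yb)
    # Scale so the accumulated gradient matches the full-batch mean.
    (loss / accum_steps).backward()

# One optimizer step after all micro-batches: the same update as a
# full 16-sample batch, but only 4 samples in memory at a time.
optimizer.step()
```

The same trade‑off of extra compute for lower memory underlies gradient checkpointing, which recomputes activations during the backward pass instead of storing them.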
Future plans for SWIFT include adding evaluation capabilities, deployment pipelines for LLMs and AIGC models, more SOTA tuners, and broader one‑click support for additional models, ensuring it remains a versatile tool for the evolving large‑model era.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.