How to Fine‑Tune a Text Classification Model with ModelScope’s PyTorch Trainer
This guide explains how to use ModelScope’s trainer components to fine‑tune a pretrained backbone for text classification, covering dataset loading, configuration modification, trainer construction, training, evaluation, prediction, and checkpoint management with concrete code examples.
ModelScope Training Overview
ModelScope provides a collection of pretrained models that can be used directly for inference or fine‑tuned on user data. Training consists of a train phase that updates model parameters on a training dataset and an evaluate phase that measures performance on a validation dataset.
PyTorch Training Workflow
The core training component is the EpochBasedTrainer (or its subclasses), which instantiates the model, pre‑processor, optimizer, and metrics from a configuration file. The main steps are:
Load a dataset with MsDataset.
Write a cfg_modify_fn to adjust configuration items as needed.
Construct a trainer and start training.
After training, evaluate the model.
Use the trained model for inference.
Key Trainer Constructor Parameters
model: model ID, local path, or model instance (required)
cfg_file: optional extra config file
cfg_modify_fn: optional callback to modify the config after it is read
train_dataset: training dataset (required for training)
eval_dataset: evaluation dataset (required for evaluation)
optimizers: optional custom (optimizer, lr_scheduler) pair
seed: random seed
launcher: supports pytorch/mpi/slurm for distributed training
device: cpu, gpu, gpu:0, cuda:0, etc. (default: gpu)
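As a sketch of how several of these optional arguments combine (the values below are placeholders; train_dataset and eval_dataset are loaded in the example that follows):
from modelscope.trainers import build_trainer
# Sketch only: placeholder values for the optional trainer arguments.
trainer = build_trainer(default_args=dict(
    model='damo/nlp_structbert_backbone_base_std',
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    seed=42,          # fix the random seed for reproducibility
    device='gpu:0',   # train on the first GPU (the default is 'gpu')
    launcher=None     # set to 'pytorch' when launching distributed training
))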
Example: Fine‑Tuning a Text‑Classification Model
The following example shows how to fine‑tune the damo/nlp_structbert_backbone_base_std backbone for a binary text‑similarity task using only a few dozen lines of code.
1. Load the Dataset
from modelscope.msdatasets import MsDataset
train_dataset = MsDataset.load('clue', subset_name='afqmc', split='train')
eval_dataset = MsDataset.load('clue', subset_name='afqmc', split='validation')
# Or load a custom local file
# train_dataset = MsDataset.load('/path/to/my_train_file.txt')
# eval_dataset = MsDataset.load('/path/to/my_eval_file.txt')
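Optionally, inspect one sample to confirm the column names the preprocessor will read (a sketch assuming MsDataset supports indexing like a Hugging Face dataset; AFQMC rows contain sentence1, sentence2, and label):
# Sketch: look at the first training example.
print(train_dataset[0])  # expected keys: 'sentence1', 'sentence2', 'label'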
2. Inspect and Modify the Configuration
from modelscope.utils.hub import read_config
model_id = 'damo/nlp_structbert_backbone_base_std'
config = read_config(model_id)
print(config.pretty_text)
Typical modifications include:
Pre‑processor parameters: set tokenizer type, input keys, and label mapping.
Model parameters: define num_labels.
Task parameters: set task='text-classification' and pipeline type.
Training parameters: epochs, batch size, learning rate, LR scheduler total iterations, and evaluation metrics.
def cfg_modify_fn(cfg):
    # Preprocessor: sentence-similarity tokenizer reading two sentence columns
    cfg.preprocessor.type = 'sen-sim-tokenizer'
    cfg.preprocessor.first_sequence = 'sentence1'
    cfg.preprocessor.second_sequence = 'sentence2'
    cfg.preprocessor.label = 'label'
    cfg.preprocessor.label2id = {'0': 0, '1': 1}
    # Model: binary classification head
    cfg.model.num_labels = 2
    # Task and inference pipeline type
    cfg.task = 'text-classification'
    cfg.pipeline = {'type': 'text-classification'}
    # Training schedule
    cfg.train.max_epochs = 5
    cfg.train.work_dir = '/tmp'
    cfg.train.dataloader.batch_size_per_gpu = 32
    cfg.evaluation.dataloader.batch_size_per_gpu = 32
    cfg.train.optimizer.lr = 2e-5
    # Total iterations = steps per epoch * number of epochs
    cfg.train.lr_scheduler.total_iters = int(
        len(train_dataset) / cfg.train.dataloader.batch_size_per_gpu) * cfg.train.max_epochs
    cfg.evaluation.metrics = 'seq-cls-metric'
    return cfg
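To sanity-check the callback before training, one option is to apply it to a freshly read configuration (a sketch reusing read_config and model_id from step 2):
# Sketch: preview the modified configuration without training.
preview = cfg_modify_fn(read_config(model_id))
print(preview.train.lr_scheduler.total_iters)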
3. Build the Trainer and Start Training
from modelscope.trainers import build_trainer
kwargs = dict(
    model=model_id,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    cfg_modify_fn=cfg_modify_fn
)
trainer = build_trainer(default_args=kwargs)
trainer.train()
After training finishes, the model can be evaluated or used for inference.
4. Model Evaluation
from modelscope.trainers import build_trainer
kwargs = dict(
    model='/tmp/output',
    eval_dataset=eval_dataset
)
trainer = build_trainer(default_args=kwargs)
metrics = trainer.evaluate()
print(metrics)
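trainer.evaluate() returns a dict of the metrics configured under cfg.evaluation.metrics; for seq-cls-metric this typically includes accuracy and F1, though the exact keys may vary across ModelScope versions.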
5. Prediction and Saving Results
import numpy as np
def cfg_modify_fn(cfg):
    # Keep the raw text columns so they are available in saving_fn,
    # and drop the label since prediction data has no ground truth.
    cfg.preprocessor.val.keep_original_columns = ['sentence1', 'sentence2']
    cfg.preprocessor.val.label = None
    return cfg
kwargs = dict(
    model='damo/nlp_structbert_sentence-similarity_chinese-tiny',
    work_dir='/tmp',
    cfg_modify_fn=cfg_modify_fn,
    remove_unused_data=True
)
trainer = build_trainer(default_args=kwargs)
def saving_fn(inputs, outputs):
    # Convert logits to predicted class ids and append one line per example.
    predictions = np.argmax(outputs['logits'].cpu().numpy(), axis=1)
    with open('/tmp/predicts.txt', 'a') as f:
        for s1, s2, pred in zip(inputs.sentence1, inputs.sentence2, predictions):
            f.write(f'{s1}, {s2}, {pred}\n')
trainer.predict(predict_datasets=eval_dataset, saving_fn=saving_fn)
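With this saving_fn, /tmp/predicts.txt accumulates one comma-separated row per example in the form sentence1, sentence2, predicted_label; delete the file before re-running, since it is opened in append mode.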
6. Inference with the Trained Model
from modelscope.pipelines import pipeline
pipeline_ins = pipeline('text-classification', model='/tmp/output')
result = pipeline_ins(('这个功能可用吗', '这个功能现在可用吗'))
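The input pair asks "Is this feature usable?" / "Is this feature usable now?". As a sketch of inspecting the output (ModelScope text-classification pipelines generally return label names with scores):
print(result)  # e.g. a dict with 'labels' and 'scores'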
Checkpoint Files and Continued Training
ModelScope stores checkpoint files in the training work directory:
{work_dir}/output: latest epoch/iteration checkpoint (requires CheckpointHook).
{work_dir}/output_best: best checkpoint according to the chosen metric (requires BestCkptSaverHook).
epoch_*.pth: model state_dict saved at every configured interval.
epoch_*_trainer_state.pth: trainer state_dict saved alongside the model.
To continue training, pass the path of an epoch_*.pth file to trainer.train(checkpoint_path=...). The same checkpoint can also be used for evaluation or inference by providing it to the corresponding trainer or pipeline.
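A minimal sketch of resuming, assuming the trainer from step 3 and a checkpoint written by CheckpointHook (the file name epoch_3.pth is hypothetical):
# Sketch: resume training from a saved checkpoint (hypothetical file name).
trainer.train(checkpoint_path='/tmp/epoch_3.pth')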