How to Fine‑Tune a Text Classification Model with ModelScope’s PyTorch Trainer
This guide explains how to use ModelScope’s trainer components to fine‑tune a pretrained backbone for text classification, covering dataset loading, configuration modification, trainer construction, training, evaluation, prediction, and checkpoint management with concrete code examples.
ModelScope Training Overview
ModelScope provides a collection of pretrained models that can be used directly for inference or fine‑tuned on user data. Training consists of a train phase that updates model parameters on a training dataset and an evaluate phase that measures performance on a validation dataset.
PyTorch Training Workflow
The core training component is the EpochBasedTrainer (or its subclasses), which instantiates the model, pre‑processor, optimizer, and metrics from a configuration file. The main steps are:
Load a dataset with MsDataset.
Write a cfg_modify_fn to adjust configuration items as needed.
Construct a trainer and start training.
After training, evaluate the model.
Use the trained model for inference.
Key Trainer Constructor Parameters
model: model ID, local path, or model instance (required)
cfg_file: optional extra config file
cfg_modify_fn: optional callback to modify the config after it is read
train_dataset: training dataset (required for training)
eval_dataset: evaluation dataset (required for evaluation)
optimizers: optional custom (optimizer, lr_scheduler) pair
seed: random seed
launcher: supports pytorch/mpi/slurm for distributed training
device: cpu, gpu, gpu:0, cuda:0, etc. (default: gpu)
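As a sketch of how several of these optional arguments combine (the values below are placeholders; train_dataset and eval_dataset are loaded in the example that follows):
from modelscope.trainers import build_trainer
# Sketch only: placeholder values for the optional trainer arguments.
trainer = build_trainer(default_args=dict(
    model='damo/nlp_structbert_backbone_base_std',
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    seed=42,          # fix the random seed for reproducibility
    device='gpu:0',   # train on the first GPU (the default is 'gpu')
    launcher=None     # set to 'pytorch' when launching distributed training
))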
Example: Fine‑Tuning a Text‑Classification Model
The following example shows how to fine‑tune the damo/nlp_structbert_backbone_base_std backbone for a binary text‑similarity task using only a few dozen lines of code.
1. Load the Dataset
from modelscope.msdatasets import MsDataset
train_dataset = MsDataset.load('clue', subset_name='afqmc', split='train')
eval_dataset = MsDataset.load('clue', subset_name='afqmc', split='validation')
# Or load a custom local file
# train_dataset = MsDataset.load('/path/to/my_train_file.txt')
# eval_dataset = MsDataset.load('/path/to/my_eval_file.txt')
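Optionally, inspect one sample to confirm the column names the preprocessor will read (a sketch assuming MsDataset supports indexing like a Hugging Face dataset; AFQMC rows contain sentence1, sentence2, and label):
# Sketch: look at the first training example.
print(train_dataset[0])  # expected keys: 'sentence1', 'sentence2', 'label'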
2. Inspect and Modify the Configuration
from modelscope.utils.hub import read_config
model_id = 'damo/nlp_structbert_backbone_base_std'
config = read_config(model_id)
print(config.pretty_text)
Typical modifications include:
Pre‑processor parameters: set tokenizer type, input keys, and label mapping.
Model parameters: define num_labels.
Task parameters: set task='text-classification' and pipeline type.
Training parameters: epochs, batch size, learning rate, LR scheduler total iterations, and evaluation metrics.
def cfg_modify_fn(cfg):
    # Preprocessor: sentence-similarity tokenizer reading two sentence columns
    cfg.preprocessor.type = 'sen-sim-tokenizer'
    cfg.preprocessor.first_sequence = 'sentence1'
    cfg.preprocessor.second_sequence = 'sentence2'
    cfg.preprocessor.label = 'label'
    cfg.preprocessor.label2id = {'0': 0, '1': 1}
    # Model: binary classification head
    cfg.model.num_labels = 2
    # Task and inference pipeline type
    cfg.task = 'text-classification'
    cfg.pipeline = {'type': 'text-classification'}
    # Training schedule
    cfg.train.max_epochs = 5
    cfg.train.work_dir = '/tmp'
    cfg.train.dataloader.batch_size_per_gpu = 32
    cfg.evaluation.dataloader.batch_size_per_gpu = 32
    cfg.train.optimizer.lr = 2e-5
    # Total iterations = steps per epoch * number of epochs
    cfg.train.lr_scheduler.total_iters = int(
        len(train_dataset) / cfg.train.dataloader.batch_size_per_gpu) * cfg.train.max_epochs
    cfg.evaluation.metrics = 'seq-cls-metric'
    return cfg
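To sanity-check the callback before training, one option is to apply it to a freshly read configuration (a sketch reusing read_config and model_id from step 2):
# Sketch: preview the modified configuration without training.
preview = cfg_modify_fn(read_config(model_id))
print(preview.train.lr_scheduler.total_iters)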
3. Build the Trainer and Start Training
from modelscope.trainers import build_trainer
kwargs = dict(
    model=model_id,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    cfg_modify_fn=cfg_modify_fn
)
trainer = build_trainer(default_args=kwargs)
trainer.train()
After training finishes, the model can be evaluated or used for inference.
4. Model Evaluation
from modelscope.trainers import build_trainer
kwargs = dict(
    model='/tmp/output',
    eval_dataset=eval_dataset
)
trainer = build_trainer(default_args=kwargs)
metrics = trainer.evaluate()
print(metrics)
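trainer.evaluate() returns a dict of the metrics configured under cfg.evaluation.metrics; for seq-cls-metric this typically includes accuracy and F1, though the exact keys may vary across ModelScope versions.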
5. Prediction and Saving Results
import numpy as np
def cfg_modify_fn(cfg):
    # Keep the raw text columns so they are available in saving_fn,
    # and drop the label since prediction data has no ground truth.
    cfg.preprocessor.val.keep_original_columns = ['sentence1', 'sentence2']
    cfg.preprocessor.val.label = None
    return cfg
kwargs = dict(
    model='damo/nlp_structbert_sentence-similarity_chinese-tiny',
    work_dir='/tmp',
    cfg_modify_fn=cfg_modify_fn,
    remove_unused_data=True
)
trainer = build_trainer(default_args=kwargs)
def saving_fn(inputs, outputs):
    # Convert logits to predicted class ids and append one line per example.
    predictions = np.argmax(outputs['logits'].cpu().numpy(), axis=1)
    with open('/tmp/predicts.txt', 'a') as f:
        for s1, s2, pred in zip(inputs.sentence1, inputs.sentence2, predictions):
            f.write(f'{s1}, {s2}, {pred}\n')
trainer.predict(predict_datasets=eval_dataset, saving_fn=saving_fn)
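With this saving_fn, /tmp/predicts.txt accumulates one comma-separated row per example in the form sentence1, sentence2, predicted_label; delete the file before re-running, since it is opened in append mode.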
6. Inference with the Trained Model
from modelscope.pipelines import pipeline
pipeline_ins = pipeline('text-classification', model='/tmp/output')
result = pipeline_ins(('这个功能可用吗', '这个功能现在可用吗'))
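The input pair asks "Is this feature usable?" / "Is this feature usable now?". As a sketch of inspecting the output (ModelScope text-classification pipelines generally return label names with scores):
print(result)  # e.g. a dict with 'labels' and 'scores'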
Checkpoint Files and Continued Training
ModelScope stores checkpoint files in the training work directory:
{work_dir}/output: latest epoch/iteration checkpoint (requires CheckpointHook).
{work_dir}/output_best: best checkpoint according to the chosen metric (requires BestCkptSaverHook).
epoch_*.pth: model state_dict saved at every configured interval.
epoch_*_trainer_state.pth: trainer state_dict saved alongside the model.
To continue training, pass the path of an epoch_*.pth file to trainer.train(checkpoint_path=...). The same checkpoint can also be used for evaluation or inference by providing it to the corresponding trainer or pipeline.
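A minimal sketch of resuming, assuming the trainer from step 3 and a checkpoint written by CheckpointHook (the file name epoch_3.pth is hypothetical):
# Sketch: resume training from a saved checkpoint (hypothetical file name).
trainer.train(checkpoint_path='/tmp/epoch_3.pth')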