Artificial Intelligence 14 min read

Integrating YOLOv5 and MMDetection with Label Studio via a Custom ML Backend

This guide explains how to build a custom Label Studio ML backend by extending LabelStudioMLBase to wrap a YOLOv5 or MMDetection model: define the prediction logic, launch the service, and connect it in the Label Studio frontend for automated object-detection annotation. It closes with deployment details and a recruitment notice.


After covering how to create a labeling project and train a YOLOv5 model, the article shows how to build a backend prediction class by inheriting from label_studio_ml.model.LabelStudioMLBase and customizing it for MMDetection.

1. Constructing the prediction model class – The example opens mmdetection.py from the label-studio-ml-backend/label_studio_ml/examples/mmdetection directory and defines a class MMDetection with an __init__ method that loads the MMDetection config and checkpoint, sets up label mappings, and prepares image handling. Helper methods _get_image_url and predict retrieve images (including presigned S3 URLs), run inference, filter results by a score threshold, convert bounding‑box coordinates to percentages, and return a list of rectangle label results together with an average confidence score.

import os
import logging
import boto3
import io
import json

from mmdet.apis import init_detector, inference_detector
from label_studio_ml.model import LabelStudioMLBase
from label_studio_ml.utils import get_image_size, get_single_tag_keys, DATA_UNDEFINED_NAME
from label_studio_tools.core.utils.io import get_data_dir
from botocore.exceptions import ClientError
from urllib.parse import urlparse

logger = logging.getLogger(__name__)

class MMDetection(LabelStudioMLBase):
    """Object detector based on https://github.com/open-mmlab/mmdetection"""
    def __init__(self, config_file="../mmdetection/config_file/",
                 checkpoint_file="../mmdetection/checkpoint_file/",
                 image_dir=None, labels_file=None,
                 score_threshold=0.3, device='cpu', **kwargs):
        """Load MMDetection model from config and checkpoint into memory.
        ...
        """
        super(MMDetection, self).__init__(**kwargs)
        config_file = config_file or os.environ['config_file']
        checkpoint_file = checkpoint_file or os.environ['checkpoint_file']
        self.config_file = config_file
        self.checkpoint_file = checkpoint_file
        self.labels_file = labels_file
        upload_dir = os.path.join(get_data_dir(), 'media', 'upload')
        self.image_dir = image_dir or upload_dir
        logger.debug(f'{self.__class__.__name__} reads images from {self.image_dir}')
        if self.labels_file and os.path.exists(self.labels_file):
            self.label_map = json_load(self.labels_file)
        else:
            self.label_map = {}
        self.from_name, self.to_name, self.value, self.labels_in_config = get_single_tag_keys(
            self.parsed_label_config, 'RectangleLabels', 'Image')
        schema = list(self.parsed_label_config.values())[0]
        self.labels_in_config = set(self.labels_in_config)
        self.labels_attrs = schema.get('labels_attrs')
        if self.labels_attrs:
            for label_name, label_attrs in self.labels_attrs.items():
                for predicted_value in label_attrs.get('predicted_values', '').split(','):
                    self.label_map[predicted_value] = label_name
        print('Load new model from: ', config_file, checkpoint_file)
        self.model = init_detector(config_file, checkpoint_file, device=device)
        self.score_thresh = score_threshold
    def _get_image_url(self, task):
        image_url = task['data'].get(self.value) or task['data'].get(DATA_UNDEFINED_NAME)
        if image_url.startswith('s3://'):
            r = urlparse(image_url, allow_fragments=False)
            bucket_name = r.netloc
            key = r.path.lstrip('/')
            client = boto3.client('s3')
            try:
                image_url = client.generate_presigned_url(
                    ClientMethod='get_object',
                    Params={'Bucket': bucket_name, 'Key': key}
                )
            except ClientError as exc:
                logger.warning(f"Can't generate presigned URL for {image_url}. Reason: {exc}")
        return image_url
    def predict(self, tasks, **kwargs):
        assert len(tasks) == 1
        task = tasks[0]
        image_url = self._get_image_url(task)
        image_path = self.get_local_path(image_url)
        model_results = inference_detector(self.model, image_path)
        results = []
        all_scores = []
        img_width, img_height = get_image_size(image_path)
        for bboxes, label in zip(model_results, self.model.CLASSES):
            output_label = self.label_map.get(label, label)
            if output_label not in self.labels_in_config:
                print(output_label + ' label not found in project config.')
                continue
            for bbox in bboxes:
                bbox = list(bbox)
                if not bbox:
                    continue
                score = float(bbox[-1])
                if score < self.score_thresh:
                    continue
                x, y, xmax, ymax = bbox[:4]
                results.append({
                    'from_name': self.from_name,
                    'to_name': self.to_name,
                    'type': 'rectanglelabels',
                    'value': {
                        'rectanglelabels': [output_label],
                        'x': x / img_width * 100,
                        'y': y / img_height * 100,
                        'width': (xmax - x) / img_width * 100,
                        'height': (ymax - y) / img_height * 100,
                    },
                    'score': score
                })
                all_scores.append(score)
        avg_score = sum(all_scores) / max(len(all_scores), 1)
        return [{'result': results, 'score': avg_score}]

def json_load(file, int_keys=False):
    with io.open(file, encoding='utf8') as f:
        data = json.load(f)
        if int_keys:
            return {int(k): v for k, v in data.items()}
        else:
            return data
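
The coordinate conversion inside predict can be isolated into a small helper to make the math easier to see. This is a sketch (the helper name to_ls_rect is our own, not part of the example code) of how a pixel-space (xmin, ymin, xmax, ymax) box maps to the percentage-based value dict Label Studio's RectangleLabels tag expects:

```python
def to_ls_rect(bbox, img_width, img_height):
    """Convert a pixel-space (xmin, ymin, xmax, ymax) box to the
    percentage-based dict Label Studio's RectangleLabels tag expects."""
    xmin, ymin, xmax, ymax = bbox
    return {
        'x': xmin / img_width * 100,            # left edge, % of image width
        'y': ymin / img_height * 100,           # top edge, % of image height
        'width': (xmax - xmin) / img_width * 100,
        'height': (ymax - ymin) / img_height * 100,
    }

# A 100x50 box whose top-left corner is (200, 100) in an 800x400 image:
print(to_ls_rect((200, 100, 300, 150), 800, 400))
# → {'x': 25.0, 'y': 25.0, 'width': 12.5, 'height': 12.5}
```

Note that all four fields are percentages of the image dimensions, which is why both the MMDetection and YOLOv5 variants need the image size from get_image_size before building results.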

The article then presents a second custom model MyModel that wraps a YOLOv5 engine, showing how to adapt the predict method to handle YOLOv5’s box format (left‑top and right‑bottom coordinates) and convert them to the percentage format required by Label Studio.

class MyModel(LabelStudioMLBase):
    def __init__(self, image_dir=None, labels_file=None, device='cpu', **kwargs):
        '''loading models to global objects'''
        super(MyModel, self).__init__(**kwargs)
        self.detector = Yolo5Engine('_source_/bans8')  # Yolo5Engine: the author's custom YOLOv5 inference wrapper
        upload_dir = os.path.join(get_data_dir(), 'media', 'upload')
        self.image_dir = image_dir or upload_dir
        self.from_name, self.to_name, self.value, self.labels_in_config = get_single_tag_keys(
            self.parsed_label_config, 'RectangleLabels', 'Image')
        schema = list(self.parsed_label_config.values())[0]
        self.labels_in_config = set(self.labels_in_config)
    def _get_image_url(self, task):
        image_url = task['data'].get(self.value) or task['data'].get(DATA_UNDEFINED_NAME)
        if image_url.startswith('s3://'):
            r = urlparse(image_url, allow_fragments=False)
            bucket_name = r.netloc
            key = r.path.lstrip('/')
            client = boto3.client('s3')
            try:
                image_url = client.generate_presigned_url(
                    ClientMethod='get_object',
                    Params={'Bucket': bucket_name, 'Key': key}
                )
            except ClientError as exc:
                logger.warning(f"Can't generate presigned URL for {image_url}. Reason: {exc}")
        return image_url
    def predict(self, tasks, **kwargs):
        task = tasks[0]
        image_url = self._get_image_url(task)
        image_path = self.get_local_path(image_url)
        self.boxes = []
        self.labels = []
        self.confs = []
        # Run inference once and cache the results instead of re-running
        # the detector for every field on every iteration.
        detections = self.detector.detection(image_path)
        for det in detections:
            self.boxes.append(det["box"])
            self.labels.append(det["key"])
            self.confs.append(det["score"])
        results = []
        img_width, img_height = get_image_size(image_path)
        for id, bbox in enumerate(self.boxes):
            label = self.labels[id]
            conf = self.confs[id]
            x, y, x2, y2 = bbox  # left-top (x, y) and right-bottom (x2, y2)
            w = x2 - x
            h = y2 - y
            if label not in self.labels_in_config:
                print(label + ' label not found in project config.')
                continue
            results.append({
                'from_name': self.from_name,
                'to_name': self.to_name,
                'type': 'rectanglelabels',
                'value': {
                    'rectanglelabels': [label],
                    # Label Studio expects float percentages and a float
                    # score; int() truncation loses precision (and would
                    # round the averaged confidence down to 0).
                    'x': x / img_width * 100,
                    'y': y / img_height * 100,
                    'width': w / img_width * 100,
                    'height': h / img_height * 100
                },
                'score': conf
            })
        avg_score = sum(self.confs) / max(len(self.confs), 1)
        return [{'result': results, 'score': avg_score}]
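
MyModel assumes that Yolo5Engine.detection() returns a list of dicts with "box", "key", and "score" fields. The engine itself is the author's own wrapper and its internals are not shown, but a stand-in with the same interface (our own sketch, useful for exercising the backend without GPU weights) might look like:

```python
class FakeYolo5Engine:
    """Stand-in for the author's Yolo5Engine wrapper. It mimics the
    interface MyModel.predict consumes: detection() returns a list of
    dicts with 'box' (xmin, ymin, xmax, ymax in pixels), 'key' (label
    name), and 'score' (confidence in [0, 1])."""

    def __init__(self, canned_results=None):
        self._results = canned_results or [
            {'box': (40, 20, 120, 90), 'key': 'person', 'score': 0.91},
        ]

    def detection(self, image_path):
        # A real engine would run YOLOv5 inference on image_path here.
        return self._results

engine = FakeYolo5Engine()
dets = engine.detection('any.jpg')
print(dets[0]['key'], dets[0]['score'])  # → person 0.91
```

Swapping such a stub in for the real engine lets you verify the percent conversion and the Label Studio result format end to end before wiring up actual weights.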

2. Starting the ML backend – The commands to initialize and start the backend are:

label-studio-ml init my_ml_backend --script label_studio_ml/examples/simple_text_classifier/simple_text_classifier.py
label-studio-ml start my_ml_backend

The service listens on port 9090 by default; the port can be changed in the generated _wsgi.py script.
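The generated _wsgi.py parses a --port argument before handing the model class to the built-in server. The exact file varies by label-studio-ml version, so the following is only a rough sketch of the port-handling portion (parse_args is our own wrapper for illustration):

```python
import argparse

def parse_args(argv=None):
    # Mirrors the port handling in the generated _wsgi.py
    # (details vary by label-studio-ml version).
    parser = argparse.ArgumentParser(description='Label Studio ML backend')
    parser.add_argument('-p', '--port', type=int, default=9090,
                        help='server port (9090 by default)')
    parser.add_argument('--host', default='0.0.0.0', help='server host')
    return parser.parse_args(argv)

args = parse_args(['--port', '9091'])
print(args.port)  # → 9091
```

Editing the default in _wsgi.py (or passing the flag when launching it directly) is how you move the backend off 9090 if that port is taken.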

3. Configuring Label Studio – After the backend is running, users must add a new model in the Label Studio UI (Machine Learning → Add Model), specify the model name and URL, and verify the "Connected" status. Screenshots illustrate each step.

4. Summary – The guide demonstrates end‑to‑end integration of YOLOv5/MMDetection with Label Studio for automated object‑detection labeling, and notes that Label Studio also supports audio and text pre‑labeling via other examples.

Recruitment Notice – The article closes with a hiring announcement for the Zero technology team in Hangzhou, describing the team's focus areas (cloud-native, blockchain, AI, big data, etc.) and inviting interested candidates to contact [email protected].

Python · YOLOv5 · Label Studio · ML Backend · MMDetection · Object Detection
Written by

ZCY Technology Team

ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.
