Integrating YOLOv5 and MMDetection with Label Studio via a Custom ML Backend
This guide explains how to build a custom Label Studio ML backend that wraps a YOLOv5 or MMDetection model: extend LabelStudioMLBase, customize its prediction logic, launch the service, and connect it in the Label Studio frontend for automated object-detection annotation. The article closes with a brief recruitment notice.
After introducing how to create a labeling project and train a YOLOv5 model, the article shows how to construct a backend prediction model class by inheriting label_studio_ml.model.LabelStudioMLBase and customizing it for MMDetection.
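Before walking through the full MMDetection example, the backend contract can be sketched in a few lines. The stand-in base class below only mimics label_studio_ml.model.LabelStudioMLBase so the snippet runs standalone, and DummyDetector is a hypothetical name; a real backend inherits from the actual LabelStudioMLBase and fills predict with model inference:

```python
# Stand-in for label_studio_ml.model.LabelStudioMLBase, so this sketch runs
# without label_studio_ml installed. In a real backend, import and inherit
# from the actual class instead.
class LabelStudioMLBase:
    def __init__(self, **kwargs):
        self.parsed_label_config = kwargs.get('label_config', {})


class DummyDetector(LabelStudioMLBase):
    """Hypothetical minimal backend: one prediction dict per task."""

    def predict(self, tasks, **kwargs):
        predictions = []
        for task in tasks:
            # 'result' holds annotations in Label Studio JSON format;
            # 'score' is an overall prediction confidence.
            predictions.append({'result': [], 'score': 0.0})
        return predictions


preds = DummyDetector().predict([{'data': {'image': 's3://bucket/img.jpg'}}])
print(preds)  # → [{'result': [], 'score': 0.0}]
```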
1. Constructing the prediction model class – The example opens mmdetection.py from the label-studio-ml-backend/label_studio_ml/examples/mmdetection directory. It defines an MMDetection class whose __init__ method loads the MMDetection config and checkpoint, sets up label mappings, and prepares image handling. The helper method _get_image_url retrieves images (including presigned S3 URLs), and predict runs inference, filters results by a score threshold, converts bounding-box coordinates to percentages, and returns a list of rectangle label results together with an average confidence score.
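The get_single_tag_keys call in the code below assumes a labeling config with exactly one RectangleLabels control bound to one Image tag. A minimal sketch of such a config, parsed here with the standard library to show what the helper extracts (the tag names "image" and "label" are illustrative):

```python
import xml.etree.ElementTree as ET

# Illustrative labeling config: one RectangleLabels control ("label")
# bound to one Image tag ("image"), with two example labels.
LABEL_CONFIG = """
<View>
  <Image name="image" value="$image"/>
  <RectangleLabels name="label" toName="image">
    <Label value="car"/>
    <Label value="person"/>
  </RectangleLabels>
</View>
"""

root = ET.fromstring(LABEL_CONFIG)
rect = root.find('RectangleLabels')
labels = [l.get('value') for l in rect.findall('Label')]
# These correspond to from_name, to_name, and labels_in_config in the backend
print(rect.get('name'), rect.get('toName'), labels)  # → label image ['car', 'person']
```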
import os
import logging
import boto3
import io
import json
from mmdet.apis import init_detector, inference_detector
from label_studio_ml.model import LabelStudioMLBase
from label_studio_ml.utils import get_image_size, get_single_tag_keys, DATA_UNDEFINED_NAME
from label_studio_tools.core.utils.io import get_data_dir
from botocore.exceptions import ClientError
from urllib.parse import urlparse
logger = logging.getLogger(__name__)
class MMDetection(LabelStudioMLBase):
    """Object detector based on https://github.com/open-mmlab/mmdetection"""

    def __init__(self, config_file="../mmdetection/config_file/", checkpoint_file="../mmdetection/checkpoint_file/",
                 image_dir=None, labels_file=None, score_threshold=0.3, device='cpu', **kwargs):
        """Load MMDetection model from config and checkpoint into memory.
        ...
        """
        super(MMDetection, self).__init__(**kwargs)
        config_file = config_file or os.environ.get('config_file')
        checkpoint_file = checkpoint_file or os.environ.get('checkpoint_file')
        self.config_file = config_file
        self.checkpoint_file = checkpoint_file
        self.labels_file = labels_file
        upload_dir = os.path.join(get_data_dir(), 'media', 'upload')
        self.image_dir = image_dir or upload_dir
        logger.debug(f'{self.__class__.__name__} reads images from {self.image_dir}')
        if self.labels_file and os.path.exists(self.labels_file):
            self.label_map = json_load(self.labels_file)
        else:
            self.label_map = {}
        self.from_name, self.to_name, self.value, self.labels_in_config = get_single_tag_keys(
            self.parsed_label_config, 'RectangleLabels', 'Image')
        schema = list(self.parsed_label_config.values())[0]
        self.labels_in_config = set(self.labels_in_config)
        # Allow the project config to map model class names onto project labels
        # via the predicted_values attribute on each Label tag
        self.labels_attrs = schema.get('labels_attrs')
        if self.labels_attrs:
            for label_name, label_attrs in self.labels_attrs.items():
                for predicted_value in label_attrs.get('predicted_values', '').split(','):
                    self.label_map[predicted_value] = label_name
        print('Load new model from: ', config_file, checkpoint_file)
        self.model = init_detector(config_file, checkpoint_file, device=device)
        self.score_thresh = score_threshold
    def _get_image_url(self, task):
        image_url = task['data'].get(self.value) or task['data'].get(DATA_UNDEFINED_NAME)
        if image_url.startswith('s3://'):
            r = urlparse(image_url, allow_fragments=False)
            bucket_name = r.netloc
            key = r.path.lstrip('/')
            client = boto3.client('s3')
            try:
                image_url = client.generate_presigned_url(
                    ClientMethod='get_object',
                    Params={'Bucket': bucket_name, 'Key': key}
                )
            except ClientError as exc:
                logger.warning(f"Can't generate presigned URL for {image_url}. Reason: {exc}")
        return image_url
    def predict(self, tasks, **kwargs):
        assert len(tasks) == 1
        task = tasks[0]
        image_url = self._get_image_url(task)
        image_path = self.get_local_path(image_url)
        model_results = inference_detector(self.model, image_path)
        results = []
        all_scores = []
        img_width, img_height = get_image_size(image_path)
        # mmdet 2.x returns one array of (x1, y1, x2, y2, score) rows per class
        for bboxes, label in zip(model_results, self.model.CLASSES):
            output_label = self.label_map.get(label, label)
            if output_label not in self.labels_in_config:
                print(output_label + ' label not found in project config.')
                continue
            for bbox in bboxes:
                bbox = list(bbox)
                if not bbox:
                    continue
                score = float(bbox[-1])
                if score < self.score_thresh:
                    continue
                x, y, xmax, ymax = bbox[:4]
                # Label Studio expects x/y/width/height as percentages of the image size
                results.append({
                    'from_name': self.from_name,
                    'to_name': self.to_name,
                    'type': 'rectanglelabels',
                    'value': {
                        'rectanglelabels': [output_label],
                        'x': x / img_width * 100,
                        'y': y / img_height * 100,
                        'width': (xmax - x) / img_width * 100,
                        'height': (ymax - y) / img_height * 100,
                    },
                    'score': score
                })
                all_scores.append(score)
        avg_score = sum(all_scores) / max(len(all_scores), 1)
        return [{'result': results, 'score': avg_score}]
def json_load(file, int_keys=False):
    with io.open(file, encoding='utf8') as f:
        data = json.load(f)
        if int_keys:
            return {int(k): v for k, v in data.items()}
        else:
            return data

The article then presents a second custom model, MyModel, that wraps a YOLOv5 engine, showing how to adapt the predict method to handle YOLOv5's box format (left-top and right-bottom corner coordinates) and convert it to the percentage format required by Label Studio.
class MyModel(LabelStudioMLBase):
    def __init__(self, image_dir=None, labels_file=None, device='cpu', **kwargs):
        """Load the model into a global object once at startup."""
        super(MyModel, self).__init__(**kwargs)
        # Yolo5Engine is the author's own wrapper around a trained YOLOv5 model
        self.detector = Yolo5Engine('_source_/bans8')
        upload_dir = os.path.join(get_data_dir(), 'media', 'upload')
        self.image_dir = image_dir or upload_dir
        self.from_name, self.to_name, self.value, self.labels_in_config = get_single_tag_keys(
            self.parsed_label_config, 'RectangleLabels', 'Image')
        schema = list(self.parsed_label_config.values())[0]
        self.labels_in_config = set(self.labels_in_config)
    def _get_image_url(self, task):
        image_url = task['data'].get(self.value) or task['data'].get(DATA_UNDEFINED_NAME)
        if image_url.startswith('s3://'):
            r = urlparse(image_url, allow_fragments=False)
            bucket_name = r.netloc
            key = r.path.lstrip('/')
            client = boto3.client('s3')
            try:
                image_url = client.generate_presigned_url(
                    ClientMethod='get_object',
                    Params={'Bucket': bucket_name, 'Key': key}
                )
            except ClientError as exc:
                logger.warning(f"Can't generate presigned URL for {image_url}. Reason: {exc}")
        return image_url
    def predict(self, tasks, **kwargs):
        task = tasks[0]
        image_url = self._get_image_url(task)
        image_path = self.get_local_path(image_url)
        # Run inference once and unpack the detections
        detections = self.detector.detection(image_path)
        self.boxes = [d["box"] for d in detections]
        self.labels = [d["key"] for d in detections]
        self.confs = [d["score"] for d in detections]
        results = []
        img_width, img_height = get_image_size(image_path)
        for idx, bbox in enumerate(self.boxes):
            label = self.labels[idx]
            conf = self.confs[idx]
            # YOLOv5 boxes are (left, top, right, bottom) in pixels
            x, y, x2, y2 = bbox
            w = x2 - x
            h = y2 - y
            if label not in self.labels_in_config:
                print(label + ' label not found in project config.')
                continue
            # Convert to the percentage coordinates Label Studio expects
            results.append({
                'from_name': self.from_name,
                'to_name': self.to_name,
                'type': 'rectanglelabels',
                'value': {
                    'rectanglelabels': [label],
                    'x': x / img_width * 100,
                    'y': y / img_height * 100,
                    'width': w / img_width * 100,
                    'height': h / img_height * 100
                },
                'score': conf
            })
        avg_score = sum(self.confs) / max(len(self.confs), 1)
        return [{'result': results, 'score': avg_score}]

2. Starting the ML backend – The commands to initialize and start the backend are:

label-studio-ml init my_ml_backend --script label_studio_ml/examples/simple_text_classifier/simple_text_classifier.py
label-studio-ml start my_ml_backend

The service listens on port 9090 by default, which can be changed in the _wsgi.py script.
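The generated _wsgi.py typically exposes the port as a command-line option; a simplified sketch of that pattern follows (the exact flags and defaults in your generated file may differ, so treat this as illustrative):

```python
import argparse

# Simplified version of the argument handling in a generated _wsgi.py:
# the default port 9090 can be edited here or overridden at launch time.
parser = argparse.ArgumentParser(description='Label Studio ML backend server')
parser.add_argument('-p', '--port', dest='port', type=int, default=9090,
                    help='Server port')

args = parser.parse_args(['--port', '9091'])
print(args.port)  # → 9091
```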
3. Configuring Label Studio – After the backend is running, users must add a new model in the Label Studio UI (Machine Learning → Add Model), specify the model name and URL, and verify the "Connected" status. Screenshots illustrate each step.
4. Summary – The guide demonstrates end‑to‑end integration of YOLOv5/MMDetection with Label Studio for automated object‑detection labeling, and notes that Label Studio also supports audio and text pre‑labeling via other examples.
Recruitment Notice – At the end of the article, a hiring announcement for the Zero technology team in Hangzhou is included, describing the team's focus areas (cloud-native, blockchain, AI, big data, etc.) and inviting interested candidates to contact [email protected].
ZCY Technology
ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.