Master YOLOv12: A Step‑by‑Step Guide to Build, Train, and Deploy Custom Models
This tutorial covers the fundamentals of YOLOv12: model variants, dataset preparation with Roboflow, optional FlashAttention acceleration, installation, model selection, and training commands, plus post-training tasks such as tracking, validation, inference, exporting to ONNX, and benchmarking, all with concrete code snippets and practical tips.
What is YOLOv12?
YOLOv12 (You Only Look Once v12) is a state‑of‑the‑art computer‑vision model that supports six tasks: classification, object detection, instance segmentation, multi‑object tracking, human pose estimation, and oriented bounding boxes (OBB). The model family provides five size variants—Nano (N), Small (S), Medium (M), Large (L) and eXtra‑large (X)—so users can balance accuracy against memory and latency requirements.
Classify : Assign a single class label to an entire image.
Detect : Locate objects with axis‑aligned bounding boxes.
Segment : Produce pixel‑level masks for each object.
Track : Extend detection to video streams, assigning consistent IDs across frames.
Pose : Estimate human skeletal keypoints.
OBB : Predict rotated bounding boxes for arbitrarily oriented objects.
Creating a Dataset
Collect all images in a single directory and upload the directory to Roboflow. Choose the project type that matches the intended task:
Object Detection → "Object Detection"
Classification → "Classification"
Segmentation → "Instance Segmentation"
Use Roboflow’s annotation tools (bounding‑box selector, polygon drawer, or AI‑assisted helpers) to label the images. After annotation, run the health‑check, create a version, and export the dataset in YOLO format as a zip file. Unzip the archive before proceeding.
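After unzipping, it is worth confirming the export has the layout the trainer expects. The directory names below reflect the usual Roboflow YOLO export (`my_dataset` is a hypothetical unzip location, and some exports omit the test split), so adjust them to match your archive:

```python
from pathlib import Path

# Typical contents of an unzipped Roboflow YOLO export; "my_dataset"
# is a hypothetical unzip location.
EXPECTED = ["data.yaml", "train/images", "train/labels",
            "valid/images", "valid/labels"]

def check_yolo_export(root):
    """Map each expected entry to whether it exists under root."""
    root = Path(root)
    return {name: (root / name).exists() for name in EXPECTED}

report = check_yolo_export("my_dataset")
missing = [name for name, present in report.items() if not present]
if missing:
    print("Missing from export:", ", ".join(missing))
```

If anything is reported missing, re-export from Roboflow in YOLO format rather than patching the directory by hand.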
Optional: FlashAttention on NVIDIA GPUs
If an NVIDIA GPU with compute capability ≥ 7.0 is available, compiling FlashAttention can reduce inference latency and accelerate training. The source code and build instructions are provided in the YOLOv12 repository (see URL below). Users without compatible hardware may skip this step.
Repository: https://github.com/sunsmarterjie/yolov12
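Before spending time on the build, it can help to confirm the GPU actually meets the compute-capability floor quoted above. A minimal check, assuming PyTorch is installed (it degrades to False on CPU-only machines):

```python
# Check whether a CUDA GPU meets the compute-capability floor (>= 7.0)
# mentioned above. Assumes PyTorch; falls back to False without a GPU.
try:
    import torch
    if torch.cuda.is_available():
        major, minor = torch.cuda.get_device_capability(0)
        flash_attention_worthwhile = (major, minor) >= (7, 0)
    else:
        flash_attention_worthwhile = False
except ImportError:
    flash_attention_worthwhile = False

print("Build FlashAttention:", flash_attention_worthwhile)
```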
Exporting and Training
Install the YOLOv12 Python package from source and remove any conflicting ultralytics installation:
pip uninstall ultralytics
pip install git+https://github.com/sunsmarterjie/yolov12.git

Verify the installation by running the CLI command:

yolo

Select a model size and, for non-detection tasks, append the appropriate suffix:
Segmentation → -seg
Classification → -cls
Detection / Tracking → no suffix
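The suffix convention can be captured in a small helper. The function name and task keys here are illustrative, not part of the ultralytics API:

```python
# Illustrative helper (not part of ultralytics): build a YOLOv12
# checkpoint filename from a size letter and a task name.
SUFFIXES = {"detect": "", "track": "", "segment": "-seg", "classify": "-cls"}

def checkpoint_name(size: str, task: str) -> str:
    """size is one of n/s/m/l/x; task selects the filename suffix."""
    return f"yolov12{size}{SUFFIXES[task]}.pt"

print(checkpoint_name("m", "classify"))  # yolov12m-cls.pt
```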
The resulting model file follows the pattern yolov12<size><suffix>.yaml/.pt. Example filenames:
yolov12x-seg.yaml
yolov12m-cls.pt

Training with Python
from ultralytics import YOLO
# Replace INSERT_MODEL_NAME with the desired pretrained checkpoint, e.g. "yolov12m.pt"
model = YOLO('INSERT_MODEL_NAME')
# PATH_TO_DATASET is the path to the dataset's YOLO-format data.yaml file
results = model.train(data='PATH_TO_DATASET', epochs=50, imgsz=640)

Training progress, checkpoints, and final weights are saved under the runs directory, inside a sub-folder named after the task (e.g., detect/train) followed by a numeric run identifier.
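A small helper (hypothetical, not part of ultralytics) can locate the newest best.pt under that runs layout, which is handy when several training runs have accumulated:

```python
from pathlib import Path

def latest_best(runs_dir: str = "runs/detect"):
    """Return the most recently written best.pt under runs_dir, or None."""
    candidates = sorted(
        Path(runs_dir).glob("train*/weights/best.pt"),
        key=lambda p: p.stat().st_mtime,
    )
    return candidates[-1] if candidates else None

weights = latest_best()  # e.g. runs/detect/train3/weights/best.pt, or None
```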
Post‑Training Options
Tracking
from ultralytics import YOLO
model = YOLO('PATH_TO_MODEL')
results = model.track('YOUTUBE_VIDEO_URL', show=True)

Model Validation
from ultralytics import YOLO
model = YOLO('PATH_TO_MODEL')
metrics = model.val()
# mAP metrics
print('mAP50-95:', metrics.box.map)
print('mAP50:', metrics.box.map50)

Inference on New Images
from ultralytics import YOLO
model = YOLO('PATH_TO_MODEL')
results = model(["im1.jpg", "im2.jpg"])
for r in results:
    boxes = r.boxes          # bounding boxes
    masks = r.masks          # segmentation masks (if available)
    keypoints = r.keypoints  # pose keypoints (if available)
    probs = r.probs          # classification probabilities (if available)
    obb = r.obb              # oriented bounding boxes (if available)
    r.show()                 # display result
    r.save(filename="result.jpg")

Export to Other Formats (e.g., ONNX)
from ultralytics import YOLO
model = YOLO('PATH_TO_MODEL')
model.export(format="onnx")

Benchmarking
from ultralytics.utils.benchmarks import benchmark
benchmark(model='PATH_TO_MODEL', data='PATH_TO_DATASET', imgsz=640, half=False)

Reference Repository
All code and model definitions are hosted at https://github.com/sunsmarterjie/yolov12.
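As a framework-agnostic complement to the built-in benchmark utility, per-call latency can be measured with only the standard library; the workload below is a stand-in for a real model call:

```python
import statistics
import time

def measure_latency(fn, warmup=3, runs=20):
    """Warm up, then time fn per call; return mean and p95 in milliseconds."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "mean_ms": statistics.mean(samples),
        "p95_ms": samples[max(0, int(0.95 * len(samples)) - 1)],
    }

# Stand-in workload; swap in e.g. a call into your exported ONNX model.
stats = measure_latency(lambda: sum(i * i for i in range(10_000)))
print(stats)
```

Warming up first matters: the initial calls often pay one-time costs (caching, JIT, allocator growth) that would otherwise skew the mean.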
Code Mala Tang