Unlock Vision AI: How EasyCV Streamlines Datasets and Model Training

This article introduces EasyCV, an open‑source all‑in‑one visual algorithm platform that abstracts diverse data sources, provides SOTA self‑supervised models, and offers ready‑to‑download datasets for image classification, object detection, segmentation, and pose estimation, complete with configuration examples.

Alibaba Cloud Big Data AI Platform

Oct 12, 2022

Deep learning now powers visual applications across many industries, and data is the essential raw material for training those models.

EasyCV is an open-source, PyTorch-based visual algorithm toolkit from Alibaba Cloud. It offers a comprehensive self-supervised algorithm suite, provides SOTA Vision Transformer pre-training models, and supports tasks such as image classification, metric learning, object detection, instance segmentation, semantic segmentation, and keypoint detection.

EasyCV abstracts various data sources behind a common interface, so popular open datasets such as CIFAR, ImageNet, and COCO can be loaded directly, as can Alibaba PAI Itag annotations and TFRecord files. DALI acceleration and caching are optional and speed up reads.
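The idea of a uniform data source with optional caching can be sketched in plain Python. This is only an illustration: the class names `ListFileSource` and `CachedSource` and the sample keys are hypothetical and are not EasyCV's API.

```python
from typing import Callable, Dict, List, Tuple


class ListFileSource:
    """Hypothetical data source built from (path, label) pairs."""

    def __init__(self, entries: List[Tuple[str, int]],
                 loader: Callable[[str], bytes]):
        self.entries = entries
        self.loader = loader  # function that reads the raw image bytes

    def __len__(self) -> int:
        return len(self.entries)

    def __getitem__(self, idx: int) -> Dict[str, object]:
        path, label = self.entries[idx]
        return {"filename": path, "img": self.loader(path), "label": label}


class CachedSource:
    """Hypothetical wrapper that keeps already-read samples in memory."""

    def __init__(self, source):
        self.source = source
        self._cache: Dict[int, Dict[str, object]] = {}

    def __len__(self) -> int:
        return len(self.source)

    def __getitem__(self, idx: int) -> Dict[str, object]:
        if idx not in self._cache:
            self._cache[idx] = self.source[idx]
        return self._cache[idx]


# Fake loader that records every access to the underlying storage.
reads = []


def fake_loader(path: str) -> bytes:
    reads.append(path)
    return b"raw-bytes-of-" + path.encode()


source = CachedSource(ListFileSource([("a.jpg", 0), ("b.jpg", 1)], fake_loader))
first = source[0]
again = source[0]  # served from the cache, so the loader is not called again
```

Because both classes expose the same `__len__` and `__getitem__`, a task-specific dataset can consume either one without knowing where the bytes come from.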

Main Dataset Overview

The platform lists commonly used datasets for several vision tasks and provides download links for them, including Baidu Netdisk mirrors that are convenient for users in mainland China.

ImageNet

Official site: https://image-net.org/download.php

Download links: ImageNet1k https://pan.baidu.com/s/13pKw0bJbr-jbymQMd_YXzA (code: 0zas); ImageNet1k TFRecord https://pan.baidu.com/s/153SY2dp02vEY9K6-O5U1UA (code: 5zdc); ImageNet21k https://pan.baidu.com/s/1eJVPCfS814cDCt3-lVHgmA (code: kaeg).

ImageNet contains over 14 million manually annotated images organized by the WordNet hierarchy.

COCO2017

Official site: https://cocodataset.org/#home

Download link: https://pan.baidu.com/s/14rO11v1VAgdswRDqPVJjMA (code: bcmm).

COCO is a large‑scale dataset for object detection, segmentation, keypoint detection, and captioning, featuring 330 k images, 1.5 M object instances, 80 object categories, and 250 k people annotated with keypoints.
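To make the figures above concrete, here is a minimal sketch of how instance counts per category can be derived from a COCO-style annotation dictionary. The tiny in-memory sample is made up for illustration; only the top-level keys (`images`, `categories`, `annotations`) and the `[x, y, width, height]` box layout follow the COCO format.

```python
from collections import Counter

# Made-up stand-in for a COCO instances file, small enough to read at a glance.
coco = {
    "images": [{"id": 1, "file_name": "000001.jpg"},
               {"id": 2, "file_name": "000002.jpg"}],
    "categories": [{"id": 1, "name": "person"}, {"id": 18, "name": "dog"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10, 20, 50, 80]},
        {"id": 11, "image_id": 1, "category_id": 18, "bbox": [60, 40, 30, 30]},
        {"id": 12, "image_id": 2, "category_id": 1, "bbox": [5, 5, 40, 90]},
    ],
}


def instances_per_category(dataset: dict) -> dict:
    """Count annotated object instances for each category name."""
    id_to_name = {c["id"]: c["name"] for c in dataset["categories"]}
    counts = Counter(id_to_name[a["category_id"]]
                     for a in dataset["annotations"])
    return dict(counts)


counts = instances_per_category(coco)
```

For a real download, the same function could be applied to the dictionary returned by `json.load` on an instances annotation file.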

LVIS

Official site: https://www.lvisdataset.org/dataset

Download link: https://pan.baidu.com/s/1UntujlgDMuVBIjhoAc_lSA (code: 8ief).

LVIS provides 164 k images with over 1 k object categories and ~2 M high‑quality instance masks, reflecting natural long‑tail distributions.

Objects365

Official site: https://www.objects365.org/overview.html

The dataset contains 630 k images covering 365 categories with 10 M bounding boxes, which its authors describe as larger in scale and higher in quality than Pascal VOC or COCO.

Cityscapes

Official site: https://www.cityscapes-dataset.com/

Cityscapes captures street scenes from 50 cities, most of them in Germany, and provides training, validation, and test splits. Of its 30 annotated classes, 19 are used for semantic segmentation evaluation.

ADE20K

Official site: http://groups.csail.mit.edu/vision/datasets/ADE20K/

Download link: https://pan.baidu.com/s/1ZuAuZheHHSDNRRdaI4wQrQ (code: dqim).

ADE20K contains 25 k complex everyday scenes with dense annotations of objects, parts, and scenes, averaging 19.5 instances and 10.5 object classes per image.

MPII Pose

Official site: http://human-pose.mpi-inf.mpg.de/

Download link: https://pan.baidu.com/s/1uscGGPlUBirulSSgb10Pfw (code: w6af).

The MPII Human Pose dataset provides around 25 k images containing over 40 k people with annotated body joints, covering 410 human activities.

EasyCV Dataset Interface Example

EasyCV uses a data_source abstraction to encapsulate each dataset format and return image information in a common form. The dataset_type setting then selects a task‑specific dataset object such as ClsDataset, DetDataset, or SegDataset. Example configuration for ImageNet:

# 1. Configure ImageNet dataset
dataset_type = 'ClsDataset'
data_train_list = 'data/imagenet_raw/meta/train_labeled.txt'
data_train_root = 'data/imagenet_raw/train/'
data_test_list = 'data/imagenet_raw/meta/val_labeled.txt'
data_test_root = 'data/imagenet_raw/val/'
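The dataset-building code reads `cfg.data.imgs_per_gpu`, `cfg.data.workers_per_gpu`, and `cfg.data.train`, so the config file also needs a `data` dictionary that ties the paths above together. The following is a sketch under assumptions: the data source name `ClsSourceImageList`, the `list_file`/`root` keys, the batch size, the worker count, and the empty pipelines are illustrative and should be checked against the configuration files in the EasyCV repository.

```python
# Values repeated from the configuration above so the sketch is self-contained.
dataset_type = 'ClsDataset'
data_train_list = 'data/imagenet_raw/meta/train_labeled.txt'
data_train_root = 'data/imagenet_raw/train/'
data_test_list = 'data/imagenet_raw/meta/val_labeled.txt'
data_test_root = 'data/imagenet_raw/val/'

# Placeholders: a real config lists its transform steps here.
train_pipeline = []
test_pipeline = []

data = dict(
    imgs_per_gpu=32,     # assumed batch size per GPU
    workers_per_gpu=4,   # assumed number of loader workers per GPU
    train=dict(
        type=dataset_type,
        data_source=dict(
            type='ClsSourceImageList',  # assumed data source name
            list_file=data_train_list,
            root=data_train_root),
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_source=dict(
            type='ClsSourceImageList',  # assumed data source name
            list_file=data_test_list,
            root=data_test_root),
        pipeline=test_pipeline))
```

Keeping the paths in top-level variables means that switching to another dataset in the same format only requires editing those lines.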

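# 2. Build the dataset from the configuration
import torch
from easycv.datasets import build_dataset
from easycv.utils.config_tools import mmcv_config_fromfile

# args.config is the path to the config file, e.g. parsed from the command line
cfg = mmcv_config_fromfile(args.config)

# True only when CUDA is available and torch.distributed has been initialized
distributed = torch.cuda.is_available() and torch.distributed.is_initialized()

default_args = dict(
    batch_size=cfg.data.imgs_per_gpu,
    workers_per_gpu=cfg.data.workers_per_gpu,
    distributed=distributed)

dataset = build_dataset(cfg.data.train, default_args)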

Other datasets with similar formats can be configured by adjusting the list and root paths; detailed configuration files are available in the EasyCV repository.
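For a custom dataset stored as one folder per class, a list file can be generated with a few lines of Python. This sketch assumes each line has the form `<relative image path> <label index>`; that layout and the helper name `write_list_file` are assumptions, so the exact format expected by a given data source should be confirmed in the EasyCV documentation.

```python
from pathlib import Path
import tempfile


def write_list_file(root: Path, list_path: Path) -> list:
    """Write '<relative path> <label index>' lines for a class-per-folder tree."""
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    lines = []
    for label, name in enumerate(class_names):
        for img in sorted((root / name).glob('*.jpg')):
            lines.append(f"{img.relative_to(root).as_posix()} {label}")
    list_path.write_text("\n".join(lines) + "\n")
    return class_names


# Demo on a throwaway directory filled with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / 'train'
    for name, files in {'cat': ['1.jpg'], 'dog': ['2.jpg', '3.jpg']}.items():
        (root / name).mkdir(parents=True)
        for f in files:
            (root / name / f).touch()
    list_file = Path(tmp) / 'train_labeled.txt'
    classes = write_list_file(root, list_file)
    content = list_file.read_text()
```

The resulting file path and image root would then take the place of data_train_list and data_train_root in the configuration.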

Project source code: https://github.com/alibaba/EasyCV
