How I annotated 300 images with instance segmentation masks

This article describes how I annotated a hand dataset quickly.
First I'll explain the annotation data formats and which one I chose, because the format largely determines the choice of annotation tool.

Annotation data formats

There are several segmentation tasks, so which format should we choose? The panoptic segmentation format of the COCO dataset seems the most flexible, because a panoptic dataset can be used to train instance segmentation, semantic segmentation, and panoptic segmentation models.
Briefly, a panoptic annotation consists of a PNG image and a JSON file. Each pixel of the image encodes a segment id, and the JSON file holds the category id and other metadata for each segment. The details of the data format are there.
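To make the PNG plus JSON layout more concrete, here is a minimal sketch in plain Python. It assumes the COCO panoptic convention that a segment id is packed into the pixel colour as R + 256*G + 256^2*B and that id 0 means "unlabeled". The toy pixels and the category numbers (HAND, TABLE) are made up for illustration; real code would load the PNG with an image library.

```python
# Sketch: decode COCO-panoptic-style pixels into segment ids, then derive
# a semantic view (class per pixel) and an instance view (segment per pixel).
# Assumption: ids are packed as R + 256*G + 256**2*B, with 0 meaning "unlabeled".

def rgb_to_segment_id(pixel):
    """Unpack one (R, G, B) pixel into its integer segment id."""
    r, g, b = pixel
    return r + 256 * g + 256 ** 2 * b


def decode_panoptic(rgb_rows, segments_info):
    """Return (semantic_rows, instance_rows) for a grid of RGB pixels."""
    category_of = {seg["id"]: seg["category_id"] for seg in segments_info}
    semantic, instance = [], []
    for row in rgb_rows:
        ids = [rgb_to_segment_id(p) for p in row]
        instance.append(ids)
        # Unknown ids (including 0) fall back to class 0 in this sketch.
        semantic.append([category_of.get(i, 0) for i in ids])
    return semantic, instance


# Toy 2x3 "image": two hands with different segment ids, plus a table.
HAND, TABLE = 1, 2  # hypothetical category ids
segments_info = [
    {"id": 1, "category_id": HAND},
    {"id": 2, "category_id": HAND},
    {"id": 300, "category_id": TABLE},
]
rgb_rows = [
    [(1, 0, 0), (2, 0, 0), (44, 1, 0)],  # 44 + 256 * 1 = 300
    [(1, 0, 0), (2, 0, 0), (0, 0, 0)],
]
semantic, instance = decode_panoptic(rgb_rows, segments_info)
print(semantic)   # both hands share one class id
print(instance)   # each hand keeps its own segment id
```

The same decoded ids give both a class map and a per-object map, which is why one panoptic dataset can feed all three kinds of training.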
Unfortunately, no annotation tool supports the panoptic annotation format, so we need other options.
One option is the VOC format, which is supported by many tools. Each pixel of the PNG image holds a class id, and there is no distinction between different instances of the same class.
Another option is an instance segmentation format. The COCO format can store segment information for each object, and we can make use of that.
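As a rough illustration of the per-object information in COCO instance annotations, the sketch below builds a tiny annotation dictionary by hand. The top-level keys (images, annotations, categories) and the annotation fields follow the COCO format as I understand it. The file name, sizes, and coordinates are invented for the example.

```python
from collections import defaultdict

# Sketch of a COCO-style instance segmentation file with two hands in one image.
# The file name and all coordinates are hypothetical.
coco = {
    "images": [
        {"id": 1, "file_name": "frame_0001.jpg", "width": 640, "height": 480},
    ],
    "categories": [{"id": 1, "name": "hand"}],
    "annotations": [
        {
            "id": 1, "image_id": 1, "category_id": 1, "iscrowd": 0,
            # Polygon as a flat list: x1, y1, x2, y2, ...
            "segmentation": [[100, 100, 200, 100, 200, 200, 100, 200]],
            "bbox": [100, 100, 100, 100],  # x, y, width, height
            "area": 10000,
        },
        {
            "id": 2, "image_id": 1, "category_id": 1, "iscrowd": 0,
            "segmentation": [[300, 150, 380, 150, 380, 250, 300, 250]],
            "bbox": [300, 150, 80, 100],
            "area": 8000,
        },
    ],
}


def instances_per_image(coco_dict):
    """Count annotated objects per (image_id, category_id) pair."""
    counts = defaultdict(int)
    for ann in coco_dict["annotations"]:
        counts[(ann["image_id"], ann["category_id"])] += 1
    return dict(counts)


print(instances_per_image(coco))
```

Each annotation entry is one object, so two hands in the same image stay separate, unlike a VOC-style class mask where they would share one class id.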
I think the best format depends on the task to solve. If the target scenes contain only stuff, we can choose a semantic segmentation format; if they contain only things, we can choose an instance segmentation format.
You can find an informal explanation of things/stuff here.

Tools

There are several tools that can be used for segmentation tasks.
Unfortunately, as far as I know, no tool can output the COCO panoptic format. So for now we need to choose an annotation tool that supports both the COCO instance segmentation format and the VOC segmentation format.
Labelme and CVAT each support both formats.
Below is a brief description of these tools.

Labelme

Labelme is a really simple tool that uses folder-based data and task management. It supports a COCO-like format for instance segmentation and a VOC-like format for semantic segmentation.
It is written entirely in Python, so it can easily be modified by a software engineer working in machine learning.

CVAT

CVAT is an annotation platform that also manages data and annotation tasks.
This is really nice for larger teams.
This tool is

Its front end is written in TypeScript (the server side is Python), so modifying the UI requires someone who knows TypeScript.

It has automatic and semi-automatic annotation tools that can be helpful, and they are extensible.

As for datasets, it can export both a VOC-like format and the COCO format.

Example: Hand Annotation Task

For a hand annotation task, exporting in COCO format is suitable, because human bodies are things (not stuff) and it is better to treat them as an instance segmentation problem.
This is natural because when imagining image editing application, we want to distinguish
In my case, I wanted to use an automatic or semi-automatic annotation tool, so I chose CVAT.
I added my own auto-annotation feature because the models CVAT provides were not precise enough.

How to create a new auto annotation feature in CVAT

The steps to add a semi-automatic annotation feature are as follows.

Create a serverless function description file.

Create a handler that processes the request from CVAT and returns a response.

Create a model handler that calls the model.
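The handler and model handler steps above can be sketched roughly as follows. This is a minimal sketch, not the code from my branch. It assumes the Nuclio Python entry points init_context and handler, and a request body with the fields image (base64), pos_points, and neg_points, as in the CVAT interactor examples I know of; check these against your CVAT version. The ModelHandler class here is a hypothetical placeholder that returns a box around the positive clicks instead of running a real model.

```python
import base64
import json


class ModelHandler:
    """Hypothetical stand-in for the class that wraps the real model."""

    def handle(self, image_bytes, pos_points, neg_points):
        # A real implementation would decode image_bytes and run the
        # interactive segmentation model with the clicks as guidance.
        # This placeholder just returns a box around the positive clicks.
        if not pos_points:
            return []
        xs = [p[0] for p in pos_points]
        ys = [p[1] for p in pos_points]
        x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
        return [[x0, y0], [x1, y0], [x1, y1], [x0, y1]]


def init_context(context):
    # Called once when the function starts: load the model a single time.
    context.user_data.model = ModelHandler()


def handler(context, event):
    # Called per request: unpack the payload, run the model, return a polygon.
    data = event.body
    image_bytes = base64.b64decode(data["image"])
    pos_points = data["pos_points"]
    neg_points = data.get("neg_points", [])  # assumed optional in this sketch
    polygon = context.user_data.model.handle(image_bytes, pos_points, neg_points)
    return context.Response(
        body=json.dumps(polygon),
        headers={},
        content_type="application/json",
        status_code=200,
    )
```

Keeping the model call in its own class means the handler only deals with the request and response, so swapping in a different model touches one file.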

There's an example branch I made that adds a recent interactive segmentation model from Samsung.
You can deploy this function to Nuclio by switching to my branch and running the command below.

nuctl deploy --project-name cvat --path pytorch/ritm_interactive_segmentation --file pytorch/ritm_interactive_segmentation/function-gpu.yaml

I expect you can create a fully automatic labeling feature by adding similar files.

Data collection

It depends on the application, but I collected data from cooking videos, because they contain scenes of hands holding things and so make a good training dataset.

Annotation

It took around 2 hours to annotate 300 images, which is about 24 s/image on average. That is not bad for segmentation annotation.
The data will be published later. I cannot make it public yet because I couldn't find a suitable storage service for it.

Conclusion

Interactive segmentation models are really useful and already practical. CVAT is easily extensible, and I recommend it.

Reference

Original article: How I annotated 300 images with instance segmentation mask, https://zenn.dev/xiongjie/articles/f6ffdbca901158
