Convert class and instance binary masks to COCO instances format #858

Closed
maritum opened this issue Mar 15, 2023 · 4 comments
Labels: user experience (Questions about our products or things to improve user experience)

Comments


maritum commented Mar 15, 2023

Hi everyone!

I am wondering if there is a way to convert a dataset to the COCO instances format using class and instance masks.

The common_semantic_segmentation format accepts only a single mask, which makes it difficult to distinguish instances of the same class in an image. Is there a way to provide an instance mask with a dictionary {instance_id: class_id}, or two binary masks, for dataset conversion?

Thank you!

vinnamkim self-assigned this Mar 16, 2023
Contributor

vinnamkim commented Mar 16, 2023

Hi @maritum,

Thanks for your interest in our project!

Sorry for the inconvenience, but there is no high-level API for this functionality. Instead, I created a simple Jupyter notebook example that solves your problem using Datumaro. Please refer to it here: https://github.com/vinnamkim/datumaro/blob/cs/make-coco-instance-mask/notebooks/08_assign_label.ipynb

P.S. The visualization part of the notebook will work after #860 is merged.
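
For reference, the core idea can be sketched without Datumaro. The following is a minimal NumPy sketch, not the code from the notebook. It assumes an instance mask whose pixel values are instance IDs (with 0 as background, which is an assumption) and an {instance_id: class_id} dictionary like the one described in the question. The function name `split_instances` and the sample values are hypothetical.

```python
import numpy as np


def split_instances(instance_mask, instance_to_class):
    """Yield (instance_id, class_id, binary_mask) for each mapped instance.

    Assumption: pixel values of `instance_mask` are instance IDs and every
    ID of interest is a key in `instance_to_class`.
    """
    for instance_id, class_id in instance_to_class.items():
        binary_mask = instance_mask == instance_id
        # Skip IDs that do not occur in this image.
        if binary_mask.any():
            yield instance_id, class_id, binary_mask.astype(np.uint8)


# Hypothetical 3x3 image: instances 1 and 2 share class 0, instance 3 is class 1.
instance_mask = np.array(
    [
        [0, 1, 1],
        [0, 2, 2],
        [3, 3, 0],
    ]
)
instance_to_class = {1: 0, 2: 0, 3: 1}

records = list(split_instances(instance_mask, instance_to_class))
```

Each resulting binary mask could then presumably be wrapped in a Datumaro mask annotation carrying `class_id` as its label and `instance_id` as its id, so that two instances of the same class stay separate in the exported COCO instances annotations.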

vinnamkim added the user experience label Mar 16, 2023
Author

maritum commented Mar 24, 2023

Hi @vinnamkim! Thank you again for your help. It works great!

I am curious if you have any recommendations for memory optimization when using dm.Dataset.from_iterable() for large datasets. I am currently processing the data in chunks and saving each chunk as a temporary dataset, which I then merge. Is there built-in functionality for this in Datumaro?
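
To make the chunked workflow concrete, here is a rough, library-free sketch of the chunking step only. The helper `iter_chunks` and the generator `fake_items` are hypothetical stand-ins; the saving and merging of temporary datasets is deliberately left out.

```python
from itertools import islice


def iter_chunks(items, chunk_size):
    """Yield lists of at most `chunk_size` items without materializing `items`."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def fake_items(n):
    # Hypothetical stand-in for the code that builds dataset items lazily.
    for i in range(n):
        yield f"item_{i}"


# Only one chunk of items is held in memory at a time.
chunk_sizes = [len(chunk) for chunk in iter_chunks(fake_items(7), 3)]
```

In this setup, each chunk would be passed to dm.Dataset.from_iterable() and exported before the next chunk is built.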

Thank you in advance!

Contributor

vinnamkim commented Mar 27, 2023

Hi @maritum,

Sorry, I don't quite understand your point about memory optimization. Could you give me more details? I guess that if you build your dataset with dm.Dataset.from_iterable(), the main memory bottleneck would be Image or Mask. If you create an Image from raw image data (e.g. Image(data=np.array(...))), it is recommended to create it from an image file path instead, e.g. Image(path=<path/to/image>), to save memory. Mask, on the other hand, can be compressed with run-length encoding (RLE) as follows.

import numpy as np
import pycocotools.mask as mask_utils
import datumaro as dm

# pycocotools expects a Fortran-ordered (column-major) uint8 array
binary_mask = np.asfortranarray(
    [
        [1, 1, 1],
        [1, 1, 0],
        [1, 0, 0],
    ],
    dtype=np.uint8,
)

# Encode the binary mask as a COCO-style RLE dictionary
rle_binary_mask = mask_utils.encode(binary_mask)

# You can use dm.RleMask as the compressed version of dm.Mask
mask = dm.RleMask(
    rle=rle_binary_mask,  # the encoded mask is passed as `rle`
    id=0,
    group=0,
    label=0,
)

This example shows how to use RleMask rather than using Mask.
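
To illustrate why RLE saves memory, here is a small pure-Python sketch of the idea. It is not how pycocotools is implemented; it only mirrors the uncompressed COCO convention, where pixels are read in column-major order and the counts alternate between runs of 0s and runs of 1s, starting with 0s. The function names are hypothetical.

```python
from itertools import groupby


def rle_counts(mask):
    """Return run lengths of a binary mask read column by column.

    The counts alternate 0-run, 1-run, 0-run, ... so a mask that starts
    with a 1 gets a leading 0-run of length zero.
    """
    rows, cols = len(mask), len(mask[0])
    flat = [mask[r][c] for c in range(cols) for r in range(rows)]
    counts = [0] if flat[0] == 1 else []
    counts.extend(len(list(run)) for _, run in groupby(flat))
    return counts


def rle_decode(counts, rows, cols):
    """Rebuild the binary mask (list of rows) from the run lengths."""
    flat, value = [], 0
    for count in counts:
        flat.extend([value] * count)
        value = 1 - value
    return [[flat[c * rows + r] for c in range(cols)] for r in range(rows)]


binary_mask = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
]
counts = rle_counts(binary_mask)      # [0, 5, 1, 1, 2]
restored = rle_decode(counts, 3, 3)   # equal to binary_mask
```

For real segmentation masks, which mostly consist of long runs of identical pixels, the list of counts is far shorter than the dense array, which is where the memory saving comes from.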

vinnamkim reopened this Mar 27, 2023
Contributor

vinnamkim commented Apr 4, 2023

Because there has been no response for a long time, I'll close this ticket.
