Can metaseg input a video and output the class label? #91

Open
CR400AF-A opened this issue Aug 2, 2023 · 5 comments

Comments

@CR400AF-A

Thanks for your great work!

I have a specific requirement for my project and I'm wondering whether metaseg can handle it. I need to input an image of shape H×W×3 (height × width × 3 channels) and obtain an output "image" of class labels of shape H×W×1 (height × width × 1 channel). The single channel only needs to indicate which pixels belong to different classes; it does not need to carry exact semantic labels.
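
To make the requested H×W×3 to H×W×1 mapping concrete, here is a minimal sketch of how a set of per-object masks could be merged into one integer label map. It assumes the masks are already available as boolean NumPy arrays of shape (H, W), for example the `segmentation` entries that SAM's automatic mask generator returns; the function name `masks_to_label_map` is hypothetical and not part of metaseg.

```python
import numpy as np

def masks_to_label_map(masks, height, width):
    """Merge boolean (H, W) masks into one (H, W, 1) integer label map.

    Hypothetical helper, not a metaseg API. Label 0 means "no mask";
    labels 1..N are arbitrary class-agnostic IDs, not semantic classes.
    """
    labels = np.zeros((height, width), dtype=np.int32)
    # Paint larger masks first so smaller masks stay visible where they overlap.
    order = sorted(range(len(masks)), key=lambda i: int(masks[i].sum()), reverse=True)
    for new_id, i in enumerate(order, start=1):
        labels[masks[i]] = new_id
    # Add the trailing channel axis to get shape (H, W, 1).
    return labels[..., np.newaxis]
```

The IDs produced this way only separate regions within one image; they carry no meaning across images or frames.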

Before I proceed, I'd like to confirm if metaseg has the capability to handle such a task. Your response would be highly valuable to me. Thank you for your time, and I'm looking forward to hearing from you.

@CR400AF-A CR400AF-A changed the title Can metaseg output the class label? Can metaseg input a video and output the class label? Aug 2, 2023
@CR400AF-A
Author

CR400AF-A commented Aug 2, 2023

I found the solution, but a new problem has emerged.

What I want to do is segment a video and label each class. My first idea was to map each mask_image color to a class label (see what I did below). However, I noticed that the mask colors in the output video change from frame to frame, which makes it difficult to track the labels (such as cookie, person, and so on). I checked your code and found that each video frame is processed the same way as a single image, so this result is not surprising.

Therefore, I wonder if you could share some of your ideas regarding this. Thanks!

What I did (in sam_predictor.py, line 139):
```python
# Write only the colored mask to the output video instead of the overlay.
# Original line: combined_mask = cv2.add(frame, mask_image)
combined_mask = mask_image
out.write(combined_mask)
```
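
The idea of reading class labels off the mask colors could be sketched as follows. This is only an illustration, assuming `mask_image` is an (H, W, 3) uint8 array in which every mask is painted with one solid color; the function name `color_mask_to_labels` is hypothetical.

```python
import numpy as np

def color_mask_to_labels(mask_image):
    """Map each distinct color in an (H, W, 3) mask image to an integer label.

    Hypothetical helper. Returns an (H, W) label array and the (K, 3) color
    table, where labels[y, x] indexes into the color table.
    """
    h, w, c = mask_image.shape
    flat = mask_image.reshape(-1, c)
    # np.unique with axis=0 returns the sorted distinct colors and, for each
    # pixel, the index of its color in that sorted table.
    colors, inverse = np.unique(flat, axis=0, return_inverse=True)
    labels = inverse.reshape(h, w).astype(np.int32)
    return labels, colors
```

This only works on the in-memory array before it is written out: a lossy video codec may alter pixel values, so colors read back from an encoded file are not guaranteed to match exactly. The labels are also only consistent within one frame if the colors themselves change between frames.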

@CR400AF-A
Author

CR400AF-A commented Aug 2, 2023

Maybe this video can help you understand what happened. Take the person's arm as an example: I want to give these pixels a label based on some property (here the mask color, but the color changes over time). Is there a method to fix this?
Thanks!
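
One possible direction, sketched here under assumptions and not taken from metaseg, is to carry IDs from one frame to the next by matching masks on overlap (intersection over union, IoU). The sketch assumes each frame yields a list of boolean (H, W) masks; `mask_iou`, `propagate_ids`, and the threshold of 0.5 are hypothetical choices.

```python
import numpy as np

def mask_iou(a, b):
    """Intersection over union of two boolean masks of the same shape."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union > 0 else 0.0

def propagate_ids(prev_masks, prev_ids, curr_masks, next_id, iou_threshold=0.5):
    """Greedy matching: each current mask inherits the ID of the unused
    previous mask it overlaps most, if the IoU reaches the threshold;
    otherwise it receives a fresh ID. Returns (curr_ids, next_id)."""
    curr_ids = []
    used = set()
    for cm in curr_masks:
        best_iou, best_j = 0.0, -1
        for j, pm in enumerate(prev_masks):
            if j in used:
                continue
            iou = mask_iou(cm, pm)
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j >= 0 and best_iou >= iou_threshold:
            curr_ids.append(prev_ids[best_j])
            used.add(best_j)
        else:
            curr_ids.append(next_id)
            next_id += 1
    return curr_ids, next_id
```

With stable IDs, a fixed ID-to-color table could then be used when painting each frame. Greedy matching is simple but can fail under fast motion or occlusion; a dedicated video object tracker would likely be more robust.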

output_video_mask.mp4

@CR400AF-A
Author

The video is too large (46 MB) to preview on GitHub. Here is a link:
https://cloud.tsinghua.edu.cn/d/fefe751e32d549ad8aab/

@Snnier

Snnier commented Aug 19, 2023

How did you do it, so that the "1" means the pixel belongs to a different class rather than an exact semantic label?

@CR400AF-A
Author

Hello, I couldn't get it to work with this method. You could have a look at issue #92, where I describe some approaches to this problem.
