Can metaseg input a video and output the class label? #91
Comments
I found a solution, but a new problem has emerged. I want to segment a video and label each class. My first idea was to map each mask_image color to a class label (what I did is shown below). However, the colors in the output mask video change from frame to frame, which makes it hard to track labels such as cookie or person. I checked your code and found that video frames are processed the same way as single images, so this result is not surprising. Could you share any ideas on how to handle this? Thanks! What I did (in sam_predictor.py, line 139):
Maybe this video can help you understand what happened. Take the person's arm as an example: I want to assign these pixels a label based on some property (here the mask color, but the color changes over time). Is there a way to fix this? output_video_mask.mp4
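One possible way to keep labels stable across frames, sketched here as an assumption rather than anything metaseg does, is to carry integer ids from one frame to the next by matching masks on intersection-over-union (IoU), and to derive the color from the id instead of from the mask order. The helper names `mask_iou` and `propagate_ids` and the 0.5 threshold are hypothetical; the masks are assumed to be boolean H×W NumPy arrays.

```python
import numpy as np


def mask_iou(a, b):
    """Intersection-over-union of two boolean masks with the same shape."""
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum()) / float(union)


def propagate_ids(prev_masks, curr_masks, next_id, iou_threshold=0.5):
    """Greedily carry ids from the previous frame to the current one.

    prev_masks: dict mapping id -> boolean mask from the previous frame.
    curr_masks: list of boolean masks for the current frame.
    next_id:    first unused id, handed out to masks with no good match.
    Returns (dict id -> mask for the current frame, updated next_id).
    """
    assigned = {}
    free_ids = set(prev_masks)  # each previous id may be reused at most once
    for mask in curr_masks:
        best_id, best_iou = None, iou_threshold
        for track_id in sorted(free_ids):
            iou = mask_iou(prev_masks[track_id], mask)
            if iou >= best_iou:
                best_id, best_iou = track_id, iou
        if best_id is None:
            # No previous mask overlaps enough: treat it as a new object.
            best_id = next_id
            next_id += 1
        else:
            free_ids.discard(best_id)
        assigned[best_id] = mask
    return assigned, next_id
```

Calling `propagate_ids` once per frame, with the previous frame's result as `prev_masks`, gives each region an id that survives reordering of the masks. Greedy matching can fail under fast motion or occlusion, so this is only a starting point.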
The video is too large (46 MB) to preview on GitHub. Here is a link:
How did you do it? Does "1" mean that the pixels belong to different classes, rather than carrying exact semantic labels?
Hello, I couldn't get it to work with this method. You may want to look at issue #92, where I describe some approaches to this problem.
Thanks for your great work!
I have a specific requirement for my project, and I'm wondering whether metaseg can meet it. I need to input an image of shape H×W×3 (height × width × 3 channels) and obtain an output "image" of per-pixel labels of shape H×W×1 (height × width × 1 channel). The single output channel only indicates which pixels belong to different classes; it does not carry exact semantic labels.
Before I proceed, I'd like to confirm if metaseg has the capability to handle such a task. Your response would be highly valuable to me. Thank you for your time, and I'm looking forward to hearing from you.
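For illustration, here is a minimal sketch of how a set of per-object masks could be collapsed into such an H×W×1 map. It assumes the masks are already available as boolean H×W NumPy arrays (for example, the `segmentation` arrays produced by SAM's automatic mask generator); the function name `masks_to_label_map` is hypothetical and not part of metaseg.

```python
import numpy as np


def masks_to_label_map(masks, height, width):
    """Collapse boolean H x W masks into one H x W x 1 integer label map.

    0 marks pixels covered by no mask; 1..N are arbitrary region ids,
    not semantic classes.
    """
    masks = [np.asarray(m, dtype=bool) for m in masks]
    label_map = np.zeros((height, width), dtype=np.int32)
    # Paint the largest masks first so smaller masks nested inside them
    # are written last and stay visible.
    order = sorted(range(len(masks)), key=lambda i: int(masks[i].sum()), reverse=True)
    for label, idx in enumerate(order, start=1):
        label_map[masks[idx]] = label
    return label_map[..., np.newaxis]
```

The ids depend on mask area within a single image, so they are not comparable between images or video frames without an extra matching step.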