Issues with long videos. (1530 frames). Video Quality (720p). #70

Closed
charchit7 opened this issue May 22, 2024 · 5 comments

Comments

@charchit7

Hey @hkchengrex, I was running the demo Colab script to test it on an object (a sofa, in my case), and it failed after a few frames.
Do you know why that is?
I assume the memory propagation failed because a human subject entered the frame and then left (an issue with the original XMem paper)?

Please let me know your thoughts on it. (check_frame_human is the image when the human came into the frame).
check_frame2
check_frame3
check_frame
check_frame_human

@hkchengrex
Owner

I am not sure what I am looking at here. Most of these masks look decent -- or am I parsing the scene wrong?
Secondly, you mentioned that it fails after a few frames, but these frames do not look continuous.

@charchit7
Author

charchit7 commented May 24, 2024

Yeah, I shared intermediate frames for reference. The failure case is visible in the first image above.
I'm sharing the input and result videos with you: Drive Link
Input: obj1_720p.mp4, result: results_obj1.mp4

Please let me know whenever you get time to check, @hkchengrex. Thanks :)
I have two questions about the results:

  • Why does the output video not segment only one object, the sofa? (You can see the color variations across the video frames.)
  • The segmentation becomes distorted in the later frames. I used your Colab tutorial; how can I extract the masks separately for visualisation?
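
For the second question, one possible way to visualise objects separately is sketched below. It is only a sketch under assumptions: it assumes the prediction for a frame is already available as an (H, W) integer index mask in which 0 is background and each positive value is one object ID. The helper names `split_index_mask` and `overlay_mask` are hypothetical and are not part of Cutie.

```python
import numpy as np

def split_index_mask(index_mask: np.ndarray) -> dict:
    """Split an (H, W) integer mask into one boolean mask per object ID.

    Assumption: 0 marks background, every other value is an object ID.
    """
    object_ids = [int(i) for i in np.unique(index_mask) if i != 0]
    return {obj_id: index_mask == obj_id for obj_id in object_ids}

def overlay_mask(frame_rgb, binary_mask, color=(255, 0, 0), alpha=0.5):
    """Blend a solid RGB color onto the pixels selected by binary_mask."""
    out = frame_rgb.astype(np.float32)
    tint = np.array(color, dtype=np.float32)
    out[binary_mask] = (1 - alpha) * out[binary_mask] + alpha * tint
    return out.astype(np.uint8)

# Toy data standing in for one predicted frame with two object IDs.
index_mask = np.array([[0, 1, 1],
                       [0, 2, 2]])
frame_rgb = np.zeros((2, 3, 3), dtype=np.uint8)

masks = split_index_mask(index_mask)
object_1_only = overlay_mask(frame_rgb, masks[1])  # highlight object 1 alone
```

Rendering each entry of `masks` on its own makes it easier to see whether the colour variation comes from several object IDs or from the way the visualisation was drawn.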

@hkchengrex
Owner

Thank you for providing the videos.
To preface, all models make mistakes and Cutie is no exception. Most of the frames look fairly reasonable to me. There might be a few things that can improve the performance of the model, but your mileage may vary:

  1. It seems that you are mixing up RGB and BGR orderings in the visualization. Cutie uses RGB.
  2. The result video has a lower framerate than the source video. Use high frame rate input for Cutie if possible.
  3. The only noticeable error I see comes from a finger blocking the camera. That is out of the training distribution, generally leads to low data quality, and should be avoided if possible.
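
As an illustration of point 1, here is a minimal sketch of a channel-order fix. It assumes the frames were read with a library that returns BGR arrays (OpenCV does by default); whether that is what happened in this script is an assumption. The helper name `bgr_to_rgb` is hypothetical.

```python
import numpy as np

def bgr_to_rgb(frame: np.ndarray) -> np.ndarray:
    """Reverse the last (channel) axis of an (H, W, 3) frame.

    The same operation converts BGR to RGB and RGB to BGR.
    """
    return np.ascontiguousarray(frame[..., ::-1])

# Toy frame: pure blue expressed in BGR order.
bgr_frame = np.zeros((2, 2, 3), dtype=np.uint8)
bgr_frame[..., 0] = 255

rgb_frame = bgr_to_rgb(bgr_frame)  # blue now sits in the last channel
```

The conversion would be applied once before a frame is passed to the model, and applied again before writing the visualisation if the writer expects BGR.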

@hkchengrex
Owner

Feel free to re-open if there are any updates.

@charchit7
Author

Got it. Thank you for the detailed response, @hkchengrex.
I got stuck at work, so I couldn't update you. I will report back after trying the changes. Thanks :)
