Investigate Segment Anything Model #43
Comments
I have started working on this task. I am going to piggyback on the work Igor has done for the CV pipeline to add the Segment Anything Model. I ran FastSAM on datahub; I think it was using 256 CPU cores, and it took:
FastSAM is packaged by Ultralytics (https://docs.ultralytics.com/models/fast-sam/), which makes it dead simple to use in Python. I am still planning to take advantage of Igor's work on the CV pipeline to run inference with FastSAM as well. However, given the trouble people are having with PyTorch in C++, I think an IPC library would be a good idea so that we can use multiple languages. I will make a decision later, but for inference alone, our existing infrastructure may work out fine.
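To make the IPC idea concrete, here is a minimal sketch of one language-neutral option: JSON messages with a 4-byte big-endian length prefix over a socket. This is only an illustration, not a decision on which IPC library to use. The function names (`send_msg`, `recv_msg`) and the message fields (`frame_id`, `boxes`) are hypothetical, and a C++ peer would need to implement the same framing.

```python
import json
import socket
import struct


def send_msg(sock, payload):
    """Send one JSON message preceded by a 4-byte big-endian length."""
    body = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack(">I", len(body)) + body)


def _recv_exact(sock, n):
    """Read exactly n bytes, looping because recv may return fewer."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock):
    """Receive one length-prefixed JSON message and decode it."""
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))


if __name__ == "__main__":
    # Both ends live in one process here purely for demonstration.
    # In practice one end would be the Python inference process and
    # the other the C++ pipeline.
    inference_end, pipeline_end = socket.socketpair()
    # Hypothetical message layout: a frame id plus xyxy boxes.
    send_msg(inference_end, {"frame_id": 7, "boxes": [[10, 20, 50, 80]]})
    print(recv_msg(pipeline_end))
    inference_end.close()
    pipeline_end.close()
```

The length prefix is what lets the receiver find message boundaries on a stream socket, so the same framing works regardless of which language sits on either side.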
Will be done by #104
This issue will look into doing obstacle detection using SAM, which is a more advanced model than YOLO. The main benefit is that we would not have to fine-tune the model at all for out-of-scope object detection.
I remember wondering whether we should switch from YOLO to Meta's SAM when it came out. Luckily for us, a lot of development has happened since then. There are now real-time versions, and how they were built is super cool: https://github.com/CASIA-IVA-Lab/FastSAM . They are also easy to use: https://docs.ultralytics.com/models/fast-sam/#installation . We get the best of both: YOLO's real-time speed and SAM's ability to segment out-of-scope objects.
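As a rough idea of what inference could look like, here is a minimal sketch based on the usage pattern in the Ultralytics FastSAM docs. The weights name `FastSAM-s.pt` and the `imgsz`, `conf`, and `iou` values are the docs' example values, not tuned for our use case. The helpers `box_area`, `keep_large_boxes`, and `segment_obstacles`, and the `min_area` threshold, are hypothetical names added for illustration.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes count as 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def keep_large_boxes(boxes, min_area):
    """Drop boxes smaller than min_area pixels (hypothetical post-filter)."""
    return [b for b in boxes if box_area(b) >= min_area]


def segment_obstacles(image_path, weights="FastSAM-s.pt", min_area=400.0):
    """Run FastSAM on one image and return the xyxy boxes of large segments.

    Requires `pip install ultralytics`. The import is kept inside the
    function so the helpers above stay usable without the dependency.
    """
    from ultralytics import FastSAM

    model = FastSAM(weights)
    # "Segment everything" mode; parameters follow the docs' example.
    results = model(
        image_path,
        device="cpu",
        retina_masks=True,
        imgsz=1024,
        conf=0.4,
        iou=0.9,
    )
    boxes = results[0].boxes
    if boxes is None:
        return []
    return keep_large_boxes(boxes.xyxy.tolist(), min_area)
```

Calling `segment_obstacles("frame.jpg")` on a sample frame (the file name is a placeholder) would be a quick way to check whether the segments line up with what we consider obstacles before wiring anything into the pipeline.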
I found out about this model from this paper ⭐⭐⭐, which I would HIGHLY recommend reading. It is very readable because it is an application paper, not an ML theory paper. We would need some async way to initially detect dynamic obstacles, but the approach in the paper can handle the remaining real-time tracking. A few notes: they work at low resolution (we would need to test this for our use case), and the object memory system for CV is a very smart idea I had not heard of before.
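To illustrate the general idea of an object memory, here is a minimal sketch that keeps detected boxes alive across frames by IoU matching and forgets them after several missed frames. This is a generic illustration, not the method from the paper. The names `ObjectMemory` and `Track` and the `iou_threshold` and `max_missed` values are all assumptions.

```python
from dataclasses import dataclass


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Track:
    track_id: int
    box: tuple
    missed: int = 0  # consecutive frames without a matching detection


class ObjectMemory:
    """Remembers objects between frames (hypothetical, greedy matching)."""

    def __init__(self, iou_threshold=0.3, max_missed=5):
        self.iou_threshold = iou_threshold
        self.max_missed = max_missed
        self.tracks = {}
        self._next_id = 0

    def update(self, boxes):
        """Match this frame's boxes to memory; return the live track ids."""
        unmatched = dict(self.tracks)
        for box in boxes:
            best = max(
                unmatched.values(),
                key=lambda t: iou(t.box, box),
                default=None,
            )
            if best is not None and iou(best.box, box) >= self.iou_threshold:
                # Same object as before: refresh its position.
                best.box, best.missed = tuple(box), 0
                del unmatched[best.track_id]
            else:
                # Nothing in memory overlaps enough: remember a new object.
                self.tracks[self._next_id] = Track(self._next_id, tuple(box))
                self._next_id += 1
        for track in unmatched.values():
            # Not seen this frame: keep it briefly, then forget it.
            track.missed += 1
            if track.missed > self.max_missed:
                del self.tracks[track.track_id]
        return sorted(self.tracks)
```

With something like this, a slower asynchronous detector could seed the memory while a cheaper per-frame step keeps the entries up to date, which is the split described above.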
Resources: