Investigate Segment Anything Model #43

Samir-Rashid · 2023-10-28T06:41:30Z

This issue will look into doing obstacle detection using the Segment Anything Model (SAM), which is a more advanced model than YOLO. The main benefit is that we would not have to fine-tune the model at all for out-of-scope object detection.

I remember wondering whether we should switch from YOLO to Meta's SAM when it came out. Luckily for us, a lot of development has happened since then. There are now real-time versions, and how they made them is super cool: https://github.com/CASIA-IVA-Lab/FastSAM . It is also easy to use: https://docs.ultralytics.com/models/fast-sam/#installation . We get the best of both: YOLO's real-time speed and SAM's ability to segment objects it was not trained on.
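For reference, the Ultralytics packaging linked above can be sketched roughly as follows. This is a minimal sketch, not the pipeline code: the function name `segment_everything` is hypothetical, and the checkpoint name and the `retina_masks`, `conf`, and `iou` values are taken from the Ultralytics FastSAM docs example rather than tuned for our use case. It assumes `ultralytics` is installed.

```python
def segment_everything(source, weights="FastSAM-s.pt", imgsz=1024, device="cpu"):
    """Run FastSAM in "segment everything" mode on one image.

    Returns (num_masks, speed), where speed is the per-stage timing dict
    (preprocess / inference / postprocess, in milliseconds).
    """
    # Imported lazily so this module loads even without ultralytics installed.
    from ultralytics import FastSAM

    model = FastSAM(weights)  # checkpoint name as used in the Ultralytics docs
    results = model(
        source,
        device=device,
        retina_masks=True,
        imgsz=imgsz,  # lower this to test the low-resolution setting
        conf=0.4,
        iou=0.9,
    )
    result = results[0]
    # result.masks is None when nothing was segmented.
    num_masks = 0 if result.masks is None else result.masks.data.shape[0]
    return num_masks, result.speed
```

Calling `segment_everything("frame.jpg")` (a placeholder path) would return the mask count and timings for that image. Varying `imgsz` is one way to check how much resolution we can give up for speed.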

I found out about this model from this paper ⭐⭐⭐, which I would HIGHLY recommend reading. It is very readable because it is an application paper, not an ML theory paper. We would need some async way to initially detect dynamic obstacles, but the approach in the paper can handle the remaining real-time tracking. A few notes: they run this at low resolution (we would need to test that for our use case), and the object memory system for CV is a very smart idea I hadn't heard of before.

Resources:

https://flowforward.simple.ink/ good visualization of the model

Samir-Rashid · 2024-02-22T23:45:55Z

I have started working on this task. I am going to piggyback on the work Igor has done for the CV pipeline to add the Segment Anything Model. I ran FastSAM on datahub (I think it was using 256 CPU cores) and it took:
Speed: 1188.5ms preprocess, 417401.7ms inference, 11499.6ms postprocess per image at shape (1, 3, 1024, 1024)
I will try this NVIDIA repo next, which they claim runs in real time: https://github.com/NVIDIA-AI-IOT/nanosam. I will have to verify the performance difference of using the "mobile" version of the model.
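To put the timing line above in perspective, here is a small sketch that parses it and converts it to seconds per image and frames per second. The helper names are hypothetical; the only input is the `Speed:` line quoted in this comment.

```python
import re

SPEED_LINE = (
    "Speed: 1188.5ms preprocess, 417401.7ms inference, "
    "11499.6ms postprocess per image at shape (1, 3, 1024, 1024)"
)


def parse_speed(line):
    """Extract per-stage times in milliseconds from a 'Speed:' log line."""
    return {stage: float(ms) for ms, stage in re.findall(r"([\d.]+)ms (\w+)", line)}


def summarize(stages_ms):
    """Convert per-stage milliseconds into total seconds, FPS, and inference share."""
    total_ms = sum(stages_ms.values())
    seconds = total_ms / 1000.0
    return {
        "total_s": seconds,
        "fps": 1.0 / seconds,
        "inference_share": stages_ms["inference"] / total_ms,
    }


stages = parse_speed(SPEED_LINE)
summary = summarize(stages)
print(
    f"{summary['total_s']:.1f} s per image, {summary['fps']:.4f} FPS, "
    f"{summary['inference_share']:.1%} spent in inference"
)
# prints: 430.1 s per image, 0.0023 FPS, 97.0% spent in inference
```

That is roughly seven minutes per 1024x1024 image on CPU, almost all of it in inference, which is why a faster variant or a GPU is needed before this is usable in real time.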

Samir-Rashid · 2024-02-26T02:20:40Z

FastSAM is packaged by Ultralytics, which makes it dead simple to use in Python: https://docs.ultralytics.com/models/fast-sam/. I am still planning to take advantage of Igor's work on the CV pipeline to also do inference with FastSAM. However, given the trouble people are having with PyTorch in C++, I think an IPC library would be a good idea so that we can use multiple languages. I will make a decision later, but for inference alone, our existing infrastructure may work out fine.
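As a rough illustration of the IPC idea, the sketch below runs a Python "inference worker" that answers length-prefixed JSON requests over a socket. No IPC library has been chosen in this issue, so plain standard-library sockets stand in for one here. All names (`send_msg`, `recv_msg`, `fake_segment`, `worker`) and the message fields are hypothetical, and `fake_segment` is a stub rather than a real model call. In practice the client side would live in the C++ pipeline.

```python
import json
import socket
import struct
import threading


def send_msg(sock, obj):
    """Send one JSON message prefixed with its 4-byte big-endian length."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, or return None if the peer closed the socket."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            return None
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock):
    """Receive one length-prefixed JSON message, or None on EOF."""
    header = recv_exact(sock, 4)
    if header is None:
        return None
    (length,) = struct.unpack(">I", header)
    body = recv_exact(sock, length)
    return None if body is None else json.loads(body.decode("utf-8"))


def fake_segment(request):
    """Stub standing in for model inference; the result is made up."""
    return {"image": request["image"], "num_masks": 0}


def worker(sock, handler):
    """Serve requests until the client closes its end of the socket."""
    with sock:
        while (request := recv_msg(sock)) is not None:
            send_msg(sock, handler(request))


# In-process demo: a connected socket pair replaces a real client and server.
client, server = socket.socketpair()
thread = threading.Thread(target=worker, args=(server, fake_segment))
thread.start()
send_msg(client, {"image": "frame_0001.jpg"})
reply = recv_msg(client)
client.close()
thread.join()
print(reply)
```

The length prefix keeps message framing trivial to implement in any language, which is the main point of putting an IPC boundary between the Python model code and the rest of the pipeline.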

Samir-Rashid · 2024-04-05T04:11:35Z

Will be done by #104

Samir-Rashid added the feature and research labels Oct 28, 2023

Samir-Rashid self-assigned this Oct 28, 2023

Samir-Rashid closed this as completed Apr 5, 2024
