Running SAM backbone on frontend #6019

bsekachev · 2023-04-13T12:06:46Z

Motivation and context

Resolved #5984
Resolved #6049
Resolved #6041

Compatible only with sam_vit_h_4b8939.pth weights. To support other weights, re-export the ONNX mask decoder with the custom model changes shown below, or download the pre-exported decoders listed below.
The serverless function must be redeployed because its interface has changed: ./deploy_gpu.sh pytorch/facebookresearch/sam/nuclio/

Decoders for other weights:
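sam_vit_l_0b3195.pth: https://drive.google.com/file/d/1Nb5CJKQm_6s1n3xLSZYso6VNgljjfR-6/view?usp=sharing
sam_vit_b_01ec64.pth: https://drive.google.com/file/d/17cZAXBPaOABS170c9bcj9PdQsMziiBHw/view?usp=sharing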

Changes made on the ONNX side:

git diff scripts/export_onnx_model.py
diff --git a/scripts/export_onnx_model.py b/scripts/export_onnx_model.py
index 8441258..18d5be7 100644
--- a/scripts/export_onnx_model.py
+++ b/scripts/export_onnx_model.py
@@ -138,7 +138,7 @@ def run_export(

     _ = onnx_model(**dummy_inputs)

-    output_names = ["masks", "iou_predictions", "low_res_masks"]
+    output_names = ["masks", "iou_predictions", "low_res_masks", "xtl", "ytl", "xbr", "ybr"]

     with warnings.catch_warnings():
         warnings.filterwarnings("ignore", category=torch.jit.TracerWarning)
bsekachev@DESKTOP-OTBLK26:~/sam$ git diff segment_anything/utils/onnx.py
diff --git a/segment_anything/utils/onnx.py b/segment_anything/utils/onnx.py
index 3196bdf..85729c1 100644
--- a/segment_anything/utils/onnx.py
+++ b/segment_anything/utils/onnx.py
@@ -87,7 +87,15 @@ class SamOnnxModel(nn.Module):
         orig_im_size = orig_im_size.to(torch.int64)
         h, w = orig_im_size[0], orig_im_size[1]
         masks = F.interpolate(masks, size=(h, w), mode="bilinear", align_corners=False)
-        return masks
+        masks = torch.gt(masks, 0).to(torch.uint8)
+        nonzero = torch.nonzero(masks)
+        xindices = nonzero[:, 3:4]
+        yindices = nonzero[:, 2:3]
+        ytl = torch.min(yindices).to(torch.int64)
+        ybr = torch.max(yindices).to(torch.int64)
+        xtl = torch.min(xindices).to(torch.int64)
+        xbr = torch.max(xindices).to(torch.int64)
+        return masks[:, :, ytl:ybr + 1, xtl:xbr + 1], xtl, ytl, xbr, ybr

     def select_masks(
         self, masks: torch.Tensor, iou_preds: torch.Tensor, num_points: int
@@ -132,7 +140,7 @@ class SamOnnxModel(nn.Module):
         if self.return_single_mask:
             masks, scores = self.select_masks(masks, scores, point_coords.shape[1])

-        upscaled_masks = self.mask_postprocessing(masks, orig_im_size)
+        upscaled_masks, xtl, ytl, xbr, ybr = self.mask_postprocessing(masks, orig_im_size)

         if self.return_extra_metrics:
             stability_scores = calculate_stability_score(
@@ -141,4 +149,4 @@ class SamOnnxModel(nn.Module):
             areas = (upscaled_masks > self.model.mask_threshold).sum(-1).sum(-1)
             return upscaled_masks, scores, stability_scores, areas, masks

-        return upscaled_masks, scores, masks
+        return upscaled_masks, scores, masks, xtl, ytl, xbr, ybr
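The effect of the modified mask_postprocessing can be illustrated with a small standalone sketch. It is not the code from this PR: it uses plain Python lists instead of tensors so that it runs without PyTorch, and the function names crop_to_bbox and paste_into_canvas are hypothetical. The first function thresholds the logits at 0 and crops to the inclusive bounding box (xtl, ytl, xbr, ybr), as the diff does; the second shows how a consumer could place the cropped mask back onto a full-size canvas. Raising an error on an empty mask is an assumption, since the diff does not show how that case is handled.

```python
from typing import List, Tuple

Grid = List[List[int]]


def crop_to_bbox(
    logits: List[List[float]], threshold: float = 0.0
) -> Tuple[Grid, int, int, int, int]:
    """Binarize a 2D logit map and crop it to the foreground bounding box.

    Returns (cropped_mask, xtl, ytl, xbr, ybr); all coordinates are inclusive.
    """
    binary = [[1 if value > threshold else 0 for value in row] for row in logits]
    # Row indices that contain at least one foreground pixel.
    ys = [y for y, row in enumerate(binary) if any(row)]
    # Column indices of every foreground pixel.
    xs = [x for row in binary for x, value in enumerate(row) if value]
    if not ys:
        # Assumption: an empty mask is treated as an error in this sketch.
        raise ValueError("mask has no foreground pixels")
    xtl, ytl, xbr, ybr = min(xs), min(ys), max(xs), max(ys)
    cropped = [row[xtl:xbr + 1] for row in binary[ytl:ybr + 1]]
    return cropped, xtl, ytl, xbr, ybr


def paste_into_canvas(
    cropped: Grid, xtl: int, ytl: int, height: int, width: int
) -> Grid:
    """Place a cropped binary mask onto an all-zero canvas of the image size."""
    canvas = [[0] * width for _ in range(height)]
    for dy, row in enumerate(cropped):
        canvas[ytl + dy][xtl:xtl + len(row)] = row
    return canvas


# Toy 4x5 logit map; positive values are foreground.
logits = [
    [-1.0, -1.0, -1.0, -1.0, -1.0],
    [-1.0, 2.0, 0.5, -1.0, -1.0],
    [-1.0, -3.0, 1.5, 4.0, -1.0],
    [-1.0, -1.0, -1.0, -1.0, -1.0],
]
cropped, xtl, ytl, xbr, ybr = crop_to_bbox(logits)
restored = paste_into_canvas(cropped, xtl, ytl, height=4, width=5)
```

Returning only the cropped region together with its corner coordinates keeps the decoder output small for masks that cover a small part of the image, which is presumably why the exported model returns the four extra values.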

How has this been tested?

Checklist

- [x] I submit my changes into the develop branch
- [x] I have added a description of my changes into the CHANGELOG file
- [ ] I have updated the documentation accordingly
- [ ] I have added tests to cover my changes
- [x] I have linked related issues (see GitHub docs)
- [x] I have increased versions of npm packages if it is necessary (cvat-canvas, cvat-core, cvat-data and cvat-ui)

License

I submit my code changes under the same MIT License that covers the project.
Feel free to contact the maintainers if that's a concern.

d710055071 · 2023-05-16T08:11:18Z

Does this implement point prompts, box prompts, combined point+box prompts, and mask prompts?

### Added
- Introduced a new configuration option for controlling the invocation of Nuclio functions. (<#6146>)

### Changed
- Relocated SAM masks decoder to frontend operation. (<#6019>)
- Switched `person-reidentification-retail-0300` and `faster_rcnn_inception_v2_coco` Nuclio functions with `person-reidentification-retail-0277` and `faster_rcnn_inception_resnet_v2_atrous_coco` respectively. (<#6129>)
- Upgraded OpenVINO-based Nuclio functions to utilize the OpenVINO 2022.3 runtime. (<#6129>)

### Fixed
- Resolved issues with tracking multiple objects (30 and more) using the TransT tracker. (<#6073>)
- Addressed azure.core.exceptions.ResourceExistsError: The specified blob already exists. (<#6082>)
- Corrected image scaling issues when transitioning between images of different resolutions. (<#6081>)
- Fixed inaccurate reporting of completed job counts. (<#6098>)
- Allowed OpenVINO-based Nuclio functions to be deployed to Kubernetes. (<#6129>)
- Improved skeleton size checks after drawing. (<#6156>)
- Fixed HRNet CPU serverless function. (<#6150>)
- Prevented sending of empty list of events. (<#6154>)

bsekachev added 5 commits April 13, 2023 15:05

First draft solution to run sam backbone on client

e0159f8

Added decoder, removed extra code

ce70b7f

Using prev mask, supported neg points

6aa3cb6

Optimized performance, fixed eslint issues

5d64332

Reworked as a plugin

ebac225

bsekachev mentioned this pull request Apr 25, 2023

Better Integration for Neural Embedding based workflowes (eg: SAM (Segment Anything)) #6049

Closed

2 tasks

bsekachev added 2 commits April 25, 2023 20:32

Fixed eslint issues

a2ce632

Changed contours methods, reduced code duplication

0ea70f3

This was referenced Apr 26, 2023

Gateway Timeout (504) error when running with SAM #6041

Closed

SAM model takes too long to return remotely with low internet bandwidth #6094

Closed

bsekachev added 3 commits May 9, 2023 14:20

Merge branch 'develop' into bs/sam_ui

32e9029

Added default plugin

2c13a4e

Using lru cache

d2e4b3a

bsekachev marked this pull request as ready for review May 9, 2023 14:04

bsekachev requested review from yasakova-anastasia, Marishka17 and nmanovic as code owners May 9, 2023 14:04

bsekachev changed the title ~~[WIP] Running SAM backbone on frontend~~ [Do not merge] Running SAM backbone on frontend May 9, 2023

bsekachev added 2 commits May 9, 2023 17:08

Fixed eslint errors

9ac5ca6

Enable isolated origin on production

edd3f76

bsekachev requested a review from azhavoro as a code owner May 10, 2023 08:35

bsekachev added 3 commits May 10, 2023 12:22

Updated model

26dbd57

Fixed pylint issue

bf98788

Adjusted quality and response from mask decoder

9f523bd

klakhov reviewed May 10, 2023

cvat-ui/plugins/sam_plugin/src/ts/index.tsx

cvat-ui/src/components/annotation-page/standard-workspace/controls-side-bar/tools-control.tsx

bsekachev added 5 commits May 11, 2023 10:29

Do not trigger extra request when no points clicked

19f39ce

Merged develop

2096be6

Added type for return value

fbac50a

Added failed initialization message

d55ebd5

Updated license headers

8a57209

Updated version & changelog

2ac4b9d

bsekachev requested a review from mdacoca as a code owner May 11, 2023 07:54

bsekachev added 2 commits May 11, 2023 11:30

Fixed re-exporting type

57ef058

Redesigned method of getting model scale

05d2329

bsekachev changed the title ~~[Do not merge] Running SAM backbone on frontend~~ Running SAM backbone on frontend May 11, 2023

bsekachev merged commit 0712d7d into develop May 11, 2023

bsekachev mentioned this pull request May 11, 2023

How to specify GPU id in nuclio functions? #6104

Open

bsekachev deleted the bs/sam_ui branch May 16, 2023 08:54

azhavoro mentioned this pull request May 18, 2023

Release v2.4.4 #6167

Merged

p0mad mentioned this pull request May 20, 2023

Is there any tracker segmentation for human annotaion in videos? #6174

Closed

2 tasks

lll9p mentioned this pull request May 24, 2023

[SAM]: TypeError: Cannot read properties of undefined (reading 'length') #6189

Closed

bsekachev mentioned this pull request Jun 27, 2023

Can we change the SAM weights? #6380

Closed

KTXKIKI mentioned this pull request Aug 4, 2024

Introduce Segment Anything 2 #8243

Closed

4 tasks

xx674967 mentioned this pull request Sep 30, 2024

Different SAM model not giving correct masks #8158

Closed

2 tasks

hashJoe mentioned this pull request Oct 16, 2024

SAM: How does the decoder handle output resolution? #8545

Closed
