
GPU-Support: Mask-RCNN + Minor GPU fixes #2714

Merged
merged 24 commits into from
Feb 16, 2021

Conversation

@jahaniam (Contributor) commented Jan 25, 2021

Summary:

  • Added GPU support for Mask-RCNN
  • More robust GPU execution: limited GPU memory to 33.3% of total GPU memory per worker, with one worker per function for GPU
  • Improved Mask-RCNN deployment speed by removing extra packages
  • Bumped nuclio to 1.5.16
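
The summary mentions a per-worker GPU memory cap but does not show the mechanism here. In TensorFlow 1.x this kind of cap is typically set with `per_process_gpu_memory_fraction`; the following is a minimal sketch of the idea, not necessarily this PR's exact code:

```python
import tensorflow as tf

# Cap each worker process at roughly a third of total GPU memory,
# so multiple function workers can share a single card (TF 1.x-style API).
gpu_options = tf.compat.v1.GPUOptions(per_process_gpu_memory_fraction=1 / 3)
config = tf.compat.v1.ConfigProto(gpu_options=gpu_options)
session = tf.compat.v1.Session(config=config)
```

Combined with `maxWorkers: 1` in the function trigger, each deployed function then claims at most about a third of the GPU.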

Here are quick results from my laptop with an NVIDIA GTX 1060M:

  • Mask-RCNN GPU: ~1.2 s / image
  • Mask-RCNN CPU: ~7.5 s / image

Related issues: #2635, #2489

Bumped nuclio to 1.5.16: based on nuclio/nuclio#2058, processorMountMode is deprecated and replaced with mountMode; therefore, #2578 needs to be revisited.

Checklist

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below)
```python
# Copyright (C) 2021 Intel Corporation
#
# SPDX-License-Identifier: MIT
```

@coveralls commented Jan 25, 2021

Coverage Status

Coverage increased (+0.02%) to 69.774% when pulling cef255e on jahaniam:develop into 897267c on openvinotoolkit:develop.

@azhavoro (Contributor) commented:

@jahaniam Thanks for your contribution!

@jahaniam (Contributor, Author) replied:

> @jahaniam Thanks for your contribution!

Thank you and your team.

@nmanovic (Contributor) commented:

@jahaniam, could you please help us fix the Codacy issues in the PR?

vat/apps/documentation/installation_automatic_annotation.md

  • [maximum-line-length] Line must be at most 120 characters:
    "Also you will need to add `--resource-limit nvidia.com/gpu=1 --triggers '{"myHttpTrigger": {"maxWorkers": 1}}'` to the nuclio deployment command. You can increase the maxWorker if you have enough GPU memory."
  • [maximum-line-length] Line must be at most 120 characters:
    "- Since the model is loaded during deployment, the number of GPU deployed functions will be limited to your GPU memory."
  • [list-item-content-indent] Don't use mixed indentation for children, remove 2 spaces (same line as above)
  • [list-item-indent] Incorrect list-item indent: add 2 spaces (same line as above)

serverless/tensorflow/faster_rcnn_inception_v2_coco/nuclio/model_loader.py

  • Trailing whitespace
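
The long line Codacy flags above documents the GPU deployment flags. Put together with `nuctl`, the full command might look like the following sketch; the project name and function path are illustrative assumptions, not taken from this PR:

```shell
# Deploy a serverless function with a GPU resource limit and a single worker.
# --project-name and --path are assumptions; adjust for your checkout.
nuctl deploy --project-name cvat \
  --path serverless/tensorflow/matterport/mask_rcnn/nuclio \
  --platform local \
  --resource-limit nvidia.com/gpu=1 \
  --triggers '{"myHttpTrigger": {"maxWorkers": 1}}'
```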

```diff
 build:
   image: cvat/tf.matterport.mask_rcnn
-  baseImage: tensorflow/tensorflow:2.1.0-py3
+  baseImage: tensorflow/tensorflow:1.13.1-py3
```
A reviewer (Contributor) commented:

The latest version of TensorFlow 1.x is 1.15.5, and I see the note "Note that this is the last patch release for the TensorFlow 1.x series."
What is the reason to move from 2.1 to 1.13.1?

@jahaniam (Contributor, Author) replied:

Before, you were installing `RUN pip install tensorflow==1.13.1`; a `RUN` instruction can't be overridden via nuctl for GPU. When I tried tensorflow 2.1 GPU I got errors, and besides, the Mask-RCNN code is written for TensorFlow 1.x, so it is better to use a 1.x Docker image.
There was no reason not to go with 1.15.5; we can probably switch to that version. I just needed a version that works, and I used 1.13.1 because it was already being installed on line 119.

@nmanovic (Contributor) commented:

@jahaniam, the patch looks great! Let's clarify a couple of points and merge.

@jahaniam (Contributor, Author) left a comment:

> @jahaniam, could you please help us fix the Codacy issues in the PR? […]

Do you know how I can re-run the check? I made some changes and want to make sure they are OK before committing.

@jahaniam (Contributor, Author) commented Feb 1, 2021:

All comments are addressed.

The only remaining item is Codacy complaining about two spaces for the list item (my indentation is correct). I am not able to fix that; if you can, please go ahead.

@jahaniam jahaniam requested a review from nmanovic February 1, 2021 02:26
@nmanovic (Contributor) commented:

Hi @jahaniam, I finally merged my old patch with the IOG serverless function (#2578). Could you please adjust your PR?

@jahaniam (Contributor, Author) replied:

> Hi @jahaniam, I finally merged my old patch with the IOG serverless function (#2578). Could you please adjust your PR?

Awesome. I was looking forward to that merge for a while. I'll look into it this weekend.

@jahaniam (Contributor, Author) commented:

> Hi @jahaniam, I finally merged my old patch with the IOG serverless function (#2578). Could you please adjust your PR?

Fixed the conflicts and tested; it is working as expected. Please review, @nmanovic @azhavoro.
@nmanovic (Contributor) commented:

@jahaniam, thanks for the great contribution! Really appreciate your time and efforts.

@nmanovic nmanovic merged commit 59c3b28 into cvat-ai:develop Feb 16, 2021
kenu pushed a commit to kenu/cvat that referenced this pull request Feb 23, 2021
* fixed cpu mask rcnn+preparation for gpu
* fix-limit gpu memory to 30% of total memory per worker

Co-authored-by: Nikita Manovich <nikita.manovich@intel.com>
@valavanisleonidas commented:

Hello,

@jahaniam, you reported these times:

  • Mask-RCNN GPU: ~1.2 s / image
  • Mask-RCNN CPU: ~7.5 s / image

Is this the time for the model to predict annotations for a single image?

@jahaniam (Contributor, Author) commented Mar 9, 2021:

> Is this the time for the model to predict annotations for a single image?

Yes.
