Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support : Leverage Accelerate for object detection/segmentation models #28312

Merged
merged 14 commits into from
Feb 16, 2024

Conversation

Tanmaypatil123
Copy link
Contributor

@Tanmaypatil123 Tanmaypatil123 commented Jan 2, 2024

What does this PR do?

Adding support for multi-GPU training in 6 object detection models and segmentation models.

Fixes # #28309

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@NielsRogge

@Tanmaypatil123 Tanmaypatil123 changed the title [WIP] : Leverage Accelerate for object detection/segmentation models Support : Leverage Accelerate for object detection/segmentation models Jan 2, 2024
@NielsRogge
Copy link
Contributor

Hi, thanks for your PR. It looks like there's an issue with the segmentation models.

@Tanmaypatil123
Copy link
Contributor Author

@NielsRogge could you take a look at this PR again . I have made required changes for segmentation models.

@NielsRogge
Copy link
Contributor

Thanks, LGTM.

Copy link
Collaborator

@amyeroberts amyeroberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this @Tanmaypatil123!

Just a comment on the imports

Comment on lines 23 to 24
from accelerate import PartialState
from accelerate.utils import reduce
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be behind a protected import:

if is_accelerate_available():
    from accelerate import PartialState
    from accelerate.utils import reduce

Copy link
Collaborator

@amyeroberts amyeroberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating! Changes look good to me.

Before merging could you:

  • Run the slow tests for these models to confirm everything still passes? e.g. RUN_SLOW=1 pytest tests/models/conditional_detr/
  • Test at least one of these models in a single and multi-GPU environment with the loss calculation i.e. labels passed in?
  • Resolve conflicts

@Tanmaypatil123
Copy link
Contributor Author

hey @amyeroberts removed conflicts .but whenever I tried to run test cases by RUN_SLOW=1 pytest tests/models/conditional_detr/ I am getting following error :
image

@amyeroberts
Copy link
Collaborator

@Tanmaypatil123 - Not sure exactly what's causing this. The function is_g2p_en_available was added in #23439. However, we haven't seen this import error in our CI runs which validate slow tests and I'm able to run RUN_SLOW=1 pytest tests/models/conditional_detr/ locally on main.

In a python session, are you about to run:

from transformers.utils import is_g2p_en_available

?

Could you try rebasing on main to make sure all updates/fixes are included in this branch?

Copy link

github-actions bot commented Feb 2, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@ArthurZucker
Copy link
Collaborator

The issue is related to evaluate I think: huggingface/peft#1351

@ArthurZucker
Copy link
Collaborator

cc @Tanmaypatil123 let's rebase this and I think we could merge this!

@Tanmaypatil123
Copy link
Contributor Author

@ArthurZucker Changes are approved but there is issue with segmentation model; test cases fails for it and i could not figure out what the problem is.

@NielsRogge
Copy link
Contributor

@Tanmaypatil123 you just need to fix the upstream conflicts, like so:

git remote add upstream https://github.com/huggingface/transformers.git
git fetch upstream
git merge upstream/main

@Tanmaypatil123
Copy link
Contributor Author

@ArthurZucker I have resolved the conflicts

@amyeroberts
Copy link
Collaborator

@Tanmaypatil123 Great!

For the failing tests, there's two which look like they might be related to this PR which will needed to be addressed before merge:

FAILED tests/models/yolos/test_modeling_yolos.py::YolosModelTest::test_pipeline_image_feature_extraction - TypeError: forward() got an unexpected keyword argument 'pixel_mask'
FAILED tests/models/yolos/test_modeling_yolos.py::YolosModelTest::test_pipeline_object_detection - TypeError: forward() got an unexpected keyword argument 'pixel_mask'

For the build documentation tests, there was a recent fix merged into main. Rebasing to include this should resolve.

@Tanmaypatil123
Copy link
Contributor Author

@Tanmaypatil123 Great!

For the failing tests, there's two which look like they might be related to this PR which will needed to be addressed before merge:

FAILED tests/models/yolos/test_modeling_yolos.py::YolosModelTest::test_pipeline_image_feature_extraction - TypeError: forward() got an unexpected keyword argument 'pixel_mask'
FAILED tests/models/yolos/test_modeling_yolos.py::YolosModelTest::test_pipeline_object_detection - TypeError: forward() got an unexpected keyword argument 'pixel_mask'

For the build documentation tests, there was a recent fix merged into main. Rebasing to include this should resolve.

Don't know how that test case is failing. I didn't make any changes to the parameters that are provided to YolosForObjectDetection. should we add parameters or is there any change in the test case?

@amyeroberts
Copy link
Collaborator

@Tanmaypatil123 Indeed. It's quite weird these tests are failing as they don't seem related to this PR. They aren't however failing on any other PRs. What I think is happening is that some tests weren't collected by the test fetched (cc @ydshieh for reference) when the image feature extraction pipeline was added. And now, because this PR touches yolos they're being run now. I'm going to look into it and let you know asap.

@Tanmaypatil123
Copy link
Contributor Author

@amyeroberts Done 😀 Thanks for the help. Huggingface has the best open source maintainers.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@amyeroberts
Copy link
Collaborator

@Tanmaypatil123 Thanks for this great contribution and patience with our failing CI tests!

@amyeroberts amyeroberts merged commit 0eb4085 into huggingface:main Feb 16, 2024
18 checks passed
@amyeroberts amyeroberts mentioned this pull request Feb 19, 2024
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request Feb 19, 2024
huggingface#28312)

* made changes for object detection models

* added support for segmentation models.

* Made changes for segmentaion models

* Changed import statements

* solving conflicts

* removed conflicts

* Resolving commits

* Removed conflicts

* Fix : Pixel_mask_value set to False
itazap pushed a commit that referenced this pull request May 14, 2024
#28312)

* made changes for object detection models

* added support for segmentation models.

* Made changes for segmentaion models

* Changed import statements

* solving conflicts

* removed conflicts

* Resolving commits

* Removed conflicts

* Fix : Pixel_mask_value set to False
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants