Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add filter parameters to list_models() #7718

Merged
merged 14 commits into from
Jul 5, 2023

Conversation

MateuszGuzek
Copy link
Contributor

Addressing Issue / comment: #6365 (comment)

We want to add a filtering mechanism to retrieve only a subset of models.

Includes include and exclude models functionalities.

[x] Documentation updated
[x] Test created

@pytorch-bot
Copy link

pytorch-bot bot commented Jul 3, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/7718

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures, 1 Unrelated Failure

As of commit 7ca68d9:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but were present on the merge base 2d4484f:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the PR @MateuszGuzek . I made some comments and suggestions below. The implementation looks great! For the tests I have some reservations at this point: the main one being that the tests seem to be a lot more complex than the code they're testing. I made some suggestions which hopefully will simplify a few things, we'll see how it goes. Definitely LMK what you think.

Thanks!

torchvision/models/_api.py Outdated Show resolved Hide resolved
torchvision/models/_api.py Outdated Show resolved Hide resolved
torchvision/models/_api.py Outdated Show resolved Hide resolved
test/test_extended_models.py Outdated Show resolved Hide resolved
test/test_extended_models.py Outdated Show resolved Hide resolved
test/test_extended_models.py Outdated Show resolved Hide resolved
k for k, v in BUILTIN_MODELS.items() if module is None or v.__module__.rsplit(".", 1)[0] == module.__name__
]
if include_filters is not None:
Copy link
Member

@NicolasHug NicolasHug Jul 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest

Suggested change
if include_filters is not None:
if include_filters:

So that cases like list_models(include_filters=[]) are equivalent to list_models(). (same below).

Most people won't be writing list_models(include_filters=[]) directly but it's possible to write list_models(include_filters=my_filter) where my_filter ends up being an empty list, or an empty string

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this one I am not entirely sure, it is an edge case. Default is None, and it means include all. But if not None is passed, it is an explicit Iterable of what should be included.

The more strict the pattern, the less models you should return, with extreme cases of exact match, which result in a single model, and then one step further is to have an empty list, which for the sake of consistency could return nothing.

I imagine use case where you produce a list of include filters using set operations, and some of them may produce empty lists. Then I think it may be safer to fail fast (by ending up with no models), instead of suddenly flipping to accepting all possible models.

Please advise :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @MateuszGuzek , that makes sense. I guess there are cases to be made for both options.

@pmeier @vfdev-5 any thought on that? Basically we're trying to decide whether list_models(include=[]) should be interpreted as A) "there's no filter, so don't filter anything, return the whole list" or B) as "there's no filter, so there's nothing to include, so we return nothing".

Note that timm uses A (which IMHO is a good reason to go for that):

https://github.com/huggingface/pytorch-image-models/blob/c241081251323dfc5e8dc799d49740c48cc9096f/timm/models/_registry.py#L205

Copy link
Collaborator

@pmeier pmeier Jul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the feature request in #6365 (comment) specifically mentions

timm has this functionality and it really boosts productivity.

we should align with what they have, i.e. go for A

Copy link
Contributor Author

@MateuszGuzek MateuszGuzek Jul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should align with what they have, i.e. go for A

This is a fair point, I will go for this implementation.

test/test_extended_models.py Outdated Show resolved Hide resolved
@MateuszGuzek MateuszGuzek requested a review from NicolasHug July 4, 2023 14:20
Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @MateuszGuzek, made another pass!

test/test_extended_models.py Outdated Show resolved Hide resolved
test/test_extended_models.py Outdated Show resolved Hide resolved
test/test_extended_models.py Outdated Show resolved Hide resolved
test/test_extended_models.py Outdated Show resolved Hide resolved
test/test_extended_models.py Outdated Show resolved Hide resolved
test/test_extended_models.py Outdated Show resolved Hide resolved
torchvision/models/_api.py Outdated Show resolved Hide resolved
test/test_extended_models.py Outdated Show resolved Hide resolved
MateuszGuzek and others added 5 commits July 5, 2023 09:36
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
@MateuszGuzek
Copy link
Contributor Author

Thanks for the comments, I applied all of them, it looks better now. :)

@MateuszGuzek MateuszGuzek requested a review from NicolasHug July 5, 2023 08:19
Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @MateuszGuzek , LGTM! I just pushed some minor cosmetics changes in the last commit (it's easier to address directly than to make another round of comments).

We still have a mypy error:

(pt) ➜  vision git:(filter_list_models) ✗ mypy torchvision/models/_api.py
torchvision/models/_api.py:231: error: Need type annotation for "models" (hint: "models: Set[<type>] = ...") 
[var-annotated]
            models = set()
            ^~~~~~
torchvision/models/_api.py:237: error: Incompatible types in assignment (expression has type "List[str]", variable has
type "Set[Union[Any, str]]")  [assignment]
            models = all_models
                     ^~~~~~~~~~

It is apparently too difficult for mypy to understand that sorted(some_set) returns a list. I'd suggest to silence mypy on this entire file unless @pmeier has a better solution.

Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the mypy fix @pmeier I'll merge when green

@NicolasHug NicolasHug changed the title Filter list models Add filter parameters to list_models() Jul 5, 2023
@pmeier pmeier merged commit cbc36eb into pytorch:main Jul 5, 2023
@MateuszGuzek MateuszGuzek deleted the filter_list_models branch July 5, 2023 12:41
facebook-github-bot pushed a commit that referenced this pull request Aug 25, 2023
Reviewed By: matteobettini

Differential Revision: D48642263

fbshipit-source-id: 7dd986c91115b47383dfa69af070626a85b8bf07

Co-authored-by: Mateusz Guzek <matguzek@meta.com>
Co-authored-by: Nicolas Hug <contact@nicolas-hug.com>
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants