Improving the performance of check_datasets_active, modifying unit test #980
Looks good 👍 only two minor remarks
openml/datasets/functions.py
     Returns
     -------
     dict
         A dictionary with items {did: bool}
     """
-    dataset_list = list_datasets(status="all")
+    dataset_list = list_datasets(
+        dataset_ids=dataset_ids,
Suggested change:
-        dataset_ids=dataset_ids,
+        data_id=dataset_ids,
Actually this is a rather important one that hadn't caught my eye first time around 😅
Strange, I got a suggestion for that and the function had no argument like that at all :S. Nice catch.
Likely because of the **kwargs (used to allow all filters).
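A minimal sketch of why the wrong keyword could go unnoticed: a function that collects extra filters through `**kwargs` accepts any keyword name, so a mistyped one raises no `TypeError` at call time. `list_items` and its filter handling are hypothetical stand-ins for illustration, not the actual openml function.

```python
def list_items(status="active", **kwargs):
    """Hypothetical listing function that treats every extra keyword as a filter."""
    filters = {"status": status}
    filters.update(kwargs)  # any keyword name is accepted here
    return filters


# Both calls succeed; Python cannot tell that only one name is a real filter.
wrong = list_items(status="all", dataset_ids=[1, 2])
right = list_items(status="all", data_id=[1, 2])
```

Any validation of filter names therefore has to happen inside the function or on the server side.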
LGTM (pending CI) 👍 edit: oops, that would be an accept
Reference Issue
Fixes #671
What does this PR implement/fix? Explain your changes.
Improves the performance of the function check_datasets_active by querying only for the given datasets instead of all datasets. Furthermore, the error-throwing behavior can be controlled by the user.
How should this PR be tested?
With the existing unit test, which was modified.
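The change described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: the function name `check_datasets_active_sketch`, the parameter name `raise_error_if_not_exist`, and the in-memory `fake_list_datasets` are all hypothetical, and the listing function is passed in so the example runs without a server connection.

```python
from typing import Callable, Dict, List


def check_datasets_active_sketch(
    dataset_ids: List[int],
    list_datasets: Callable[..., Dict[int, dict]],
    raise_error_if_not_exist: bool = True,  # assumed parameter name
) -> Dict[int, bool]:
    """Return {did: is_active} for the requested dataset IDs only."""
    # Query only the requested IDs instead of listing every dataset.
    dataset_list = list_datasets(status="all", data_id=dataset_ids)
    active = {}
    for did in dataset_ids:
        dataset = dataset_list.get(did)
        if dataset is None:
            if raise_error_if_not_exist:
                raise ValueError(f"Could not find dataset {did}.")
            continue  # skip unknown IDs when errors are disabled
        active[did] = dataset["status"] == "active"
    return active


def fake_list_datasets(status: str, data_id: List[int]) -> Dict[int, dict]:
    """Hypothetical in-memory stand-in for the server-backed listing call."""
    catalog = {2: {"status": "active"}, 17: {"status": "deactivated"}}
    return {did: catalog[did] for did in data_id if did in catalog}


result = check_datasets_active_sketch(
    [2, 17, 999], fake_list_datasets, raise_error_if_not_exist=False
)
```

With the error flag disabled, the unknown ID 999 is simply left out of the result; with the default setting, the same call would raise a `ValueError`.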