-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TST Refactor: clean persistence tests to avoid duplicate testing #202
TST Refactor: clean persistence tests to avoid duplicate testing #202
Conversation
Signed-off-by: Benjamin Bossan <benjamin.bossan@gmail.com>
@skops-dev/maintainers ready for review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, otherwise LGTM.
return self | ||
|
||
|
||
def test_dump_to_and_load_from_disk(tmp_path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe as an alternative, or another test, we could save/load from disk and memory, and compare the objects' hashes to make sure they're the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I can add another test. To be sure I understand correctly, you mean comparing the hashes of both objects in memory, one obtained through dump/load
and one for dumps/loads
? If so, what method would you use to hash them, joblib.hash
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, exactly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Resolves #198
After introducing
dumps
andloads
, we ran all persistence tests twice, once with those functions and once withdump
andload
. This is inefficient, since they're basically testing the same thing but running 400+ tests twice is costly. With this PR, we only use the more efficientdumps
andloads
. On top of that, there is one test that very specifically testsdump
andload
.