Fix deterministic group_shuffle_split #1839
Let's find a way to test this so the same bug doesn't happen again.
@nilsleh Would love to get this in the 0.5.2 release!
Thanks for the reminder :)
@adamjstewart and @isaaccorley I'm not sure how to simulate repeated calls to the function after restarting the script or kernel. I thought running it in separate processes might be a way to go, but I'm not sure that's right either.
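For reference, the separate-process idea could look roughly like the sketch below. It is not the test added in this PR: the snippet, the group labels, and the helper `run_with_hash_seed` are all made up for illustration. It relies on the `PYTHONHASHSEED` environment variable, which changes string hashes (and so set iteration order) per interpreter process, which is what a script or kernel restart does.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for the split logic; the group labels are invented.
SNIPPET = """
import random
groups = ['a', 'a', 'b', 'c', 'c', 'c', 'd', 'e', 'e', 'f']
unique = sorted(set(groups))  # sorting removes the dependence on hash order
train = set(random.Random(0).sample(unique, 4))
print([i for i, g in enumerate(groups) if g in train])
"""


def run_with_hash_seed(seed: int) -> str:
    """Run SNIPPET in a fresh interpreter with a fixed string-hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Each seed simulates a separate "restart"; a deterministic split prints the
# same indices every time, so the set of outputs collapses to one element.
outputs = {run_with_hash_seed(seed) for seed in (1, 2, 3)}
assert len(outputs) == 1
```

Dropping the `sorted(...)` call in the snippet is the case where the outputs may start to differ between seeds.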
This feels like overkill. Depending on the size of our fake dataset, can we just run the test once, print the order, then hardcode that in the test code? As long as it is always the same, it's deterministic.
Much better, thanks!
* order sets
* suggestion
* add unit test
* fix
* updated test
* fix
* indices from file
* test util update
* path
* no file
* no file
* comment
* i cannot spell
Sets are unordered, so repeated calls were yielding different train and val sizes for the Cyclone dataset.
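To illustrate the idea, here is a minimal sketch of a group-wise split that sorts the unique groups before sampling. It is a simplified stand-in, not the TorchGeo implementation: the name `group_shuffle_split_sketch`, its parameters, and the example labels are assumptions made for this example. Since groups can contain different numbers of samples, picking different groups on each run also changes the resulting train and val sizes.

```python
import random


def group_shuffle_split_sketch(groups, train_size=0.8, random_state=0):
    """Hypothetical simplified group split (not the TorchGeo function).

    Returns (train_indices, val_indices) such that no group appears in both.
    """
    # A bare set of strings can iterate in a different order in every
    # interpreter process, so sampling from it is not reproducible even with
    # a fixed seed. Sorting gives the sampler a stable input order.
    unique_groups = sorted(set(groups))
    n_train = round(train_size * len(unique_groups))

    rng = random.Random(random_state)
    train_groups = set(rng.sample(unique_groups, n_train))

    train_idx = [i for i, g in enumerate(groups) if g in train_groups]
    val_idx = [i for i, g in enumerate(groups) if g not in train_groups]
    return train_idx, val_idx


# Example with made-up group labels of unequal size.
example_groups = ["a", "a", "a", "b", "c", "c", "d", "d", "d", "d"]
train, val = group_shuffle_split_sketch(example_groups, train_size=0.5)
print(len(train), len(val))
```

With the sorted input, the same `random_state` selects the same groups on every run, so the printed sizes stay fixed across restarts.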