-
Notifications
You must be signed in to change notification settings - Fork 750
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MP DataLoader Improvements #742
MP DataLoader Improvements #742
Conversation
@vafl maybe you could check wether you run into problems, or see improvements? |
Codecov Report
@@ Coverage Diff @@
## master #742 +/- ##
==========================================
+ Coverage 84.45% 85.51% +1.05%
==========================================
Files 171 171
Lines 10817 10862 +45
==========================================
+ Hits 9136 9289 +153
+ Misses 1681 1573 -108
|
[Outdated] So for WaveNetEstimator i get 4x performance if The question is whether to leave the default 8 or 1. So far this arguments default was 8, on the other hand, there was a bug in the previous data loader code (before multiprocessing), namely:
which means that only And the model training seems to have worked that way fine too. However, I can imagine fixing this bug could yield better performance now in some cases. Any preference @vafl ? |
[Outdated] Regarding the read speeds, if I use
I consistently get the following results:
However, while the difference in speed is measurable, it's also negligible. |
[Outdated] And setting For example, for this configuration on the electricity dataset:
I get ~17 vs 24 it/sec uncached vs cached. |
With the last commit I changed the functionality of
In order to achieve this, I had to switch back to the cache aligned data iteration, because in the none cached FileDataset the difference between the techniques is significantly noticeable (since now |
With this I think every feature is added that was missing (that I am aware of), and I tested every design choice that I though needed validation. |
I have some minor comments, see inline |
…pieler/gluon-ts into mp_data_loader_updates_V2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks!
I merged upstream changes to let the tests run again, there was a timeout in on test which I want to make sure is not occurring consistently. @AaronSpieler any idea where these time outs come from? |
The windows timeout error? IDK: It should be non mp: "You have set |
Should be fixed now. The error came from mp evaluation being enabled on windows. I think this was shadowed by the continual windows errors we were getting so far from other commits. |
So windows is still failing, however, it has nothing to do with multiprocessing. So I think we could at some point reenable multiprocessing evaluation for windows, however, how the error is now more clearly visible because its not happening in the forked processes anymore:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Thanks!
Description of changes:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.