Improve the performance of create_virtual_dataset for h5py 3 #232
Using VirtualLayout and VirtualSource directly is slow for a few reasons. The main reason is that VirtualSource does a deepcopy of itself every time you slice it. The high-level code also calls the internal helper _convert_space_for_key(), which is unnecessary for our given slices (it only does anything for unbounded slices), and recreates a selection object each time, which we can do just once. Any further performance improvements here would have to come from speeding up the h5py select() function, which is where most of the time is now spent.
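To illustrate why a deepcopy on every slice adds up, here is a minimal toy sketch. `ToySource`, `map_chunks_naive`, and `map_chunks_direct` are hypothetical names invented for this example. They are not h5py APIs and not the code in this PR; they only mimic the copy-per-slice pattern described above.

```python
import copy


class ToySource:
    """Hypothetical stand-in for a sliceable source (not h5py's VirtualSource)."""

    copies = 0  # class-level counter of deep copies made

    def __init__(self, shape):
        self.shape = shape
        self.selection = None

    def __getitem__(self, key):
        # Mimic the costly pattern: every slice deep-copies the whole object.
        ToySource.copies += 1
        tmp = copy.deepcopy(self)
        tmp.selection = key
        return tmp


def map_chunks_naive(source, n_chunks, chunk_size):
    """Slice the source once per chunk, paying for one deepcopy each time."""
    return [
        source[i * chunk_size:(i + 1) * chunk_size].selection
        for i in range(n_chunks)
    ]


def map_chunks_direct(n_chunks, chunk_size):
    """Build the same selections directly, with no per-chunk copy."""
    return [slice(i * chunk_size, (i + 1) * chunk_size) for i in range(n_chunks)]


source = ToySource(shape=(1000,))
naive = map_chunks_naive(source, n_chunks=10, chunk_size=100)
direct = map_chunks_direct(n_chunks=10, chunk_size=100)
```

Both functions produce the same selections, but the naive version makes one deep copy per chunk, so its cost grows with the number of chunks being mapped.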
Thanks, this looks much better. On my machine the example from #226 now runs in 1.81s instead of 2.7s.
So now the latest version of h5py is causing the tests to segfault. I'm not sure what the h5py 3.6 failure is about. I'm not able to reproduce it locally.
What's the error you are seeing? We noticed that there's a bug in h5py version 3.7 with fillvalue handling which is fixed by this PR: h5py/h5py#2111
The errors are on the CI builds here. h5py 3.6 is giving this error (which I cannot reproduce locally):
h5py 3.7 is segfaulting (which I can reproduce).
Looks like that might be the same issue but I'll check to be sure.
Hi @asmeurer: did you get to the bottom of those segfaults?
I think the issues are upstream h5py problems, so we should be OK to release. I wasn't able to take a look at them.
Most of the time is now spent in the h5py select() function, so any further performance improvements would have to come from speeding it up. I have not yet investigated whether there are any obvious gains to be made there.
I am still investigating whether any of these improvements can be upstreamed to h5py itself, so that these workarounds aren't necessary and we can just use the normal high-level APIs.
This takes the example in #226 from 3.66 seconds to 2.77 seconds on my machine. In h5py 2.10, the example takes 2.52 seconds with the old optimizations.