[yt-4.0] optimize accessing data through ds.all_data() for SPH #2146
This PR improves the performance of scripts like this one:
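The original script is not embedded in this extract. A minimal benchmark of the same shape might look like the following sketch (the dataset path and the field name are assumptions for illustration, not taken from the PR; substitute your own SPH snapshot):

```python
import time

import yt

# Hypothetical SPH snapshot path; replace with a real dataset.
ds = yt.load("snapshot_033/snap_033.0.hdf5")

t0 = time.time()
ad = ds.all_data()
density = ad["PartType0", "Density"]  # triggers the chunked read path
print(time.time() - t0)
```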
Before this PR, on my Mac laptop with an SSD, this script prints 1.6224758625030518; after, it prints 0.17824316024780273, roughly a 10x improvement from skipping many unnecessary calls to the selection machinery. We are still not as fast as calling h5py directly, because the chunking system must first load the data and then copy it into a result array, so there are extra copies and array allocations that the pure h5py path avoids.
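The extra-copy overhead described above can be illustrated schematically. This is not yt's actual chunking code, just a sketch with plain NumPy standing in for h5py reads, and a made-up chunk size:

```python
import numpy as np

rng = np.random.default_rng(0)
on_disk = rng.random(1_000_000)  # stands in for an on-disk HDF5 dataset

# "Pure h5py" style: one read straight into the result array.
direct = on_disk.copy()

# Chunked style: each chunk is read into a temporary buffer,
# then copied into its slot in a preallocated result array,
# costing one extra allocation and one extra copy per chunk.
chunk_size = 100_000
chunked = np.empty_like(on_disk)
for start in range(0, on_disk.size, chunk_size):
    stop = start + chunk_size
    buf = on_disk[start:stop].copy()  # temporary per-chunk allocation
    chunked[start:stop] = buf         # second copy into the result

assert np.array_equal(direct, chunked)
```

Both paths produce identical data; the chunked path simply does more allocation and memory traffic, which is why it cannot quite match raw h5py even after the selection overhead is removed.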
This optimization was requested by @saethlin on Slack a few days ago. It would be great if we could get some testing on this from users of yt-4.0 besides Ben, perhaps @qobilidop or @chummels? I would also appreciate review by @matthewturk, particularly of my change to the `RegionSelector`. It might also be nice to refactor things to avoid some of the copy/paste across the frontends.