ENH: reading speed optimisation for AMRVAC frontend #3508
Changes from all commits: 846e250, 8feeca0, ae63d0e, d202029
```diff
@@ -151,17 +151,12 @@ def get_single_block_data(istream, byte_offset, block_shape):
 
 def get_single_block_field_data(istream, byte_offset, block_shape, field_idx):
     """retrieve a specific block (ONE field) from a datfile"""
-    istream.seek(byte_offset)
-
     # compute byte size of a single field
     field_shape = block_shape[:-1]
     fmt = ALIGN + np.prod(field_shape) * "d"
     byte_size_field = struct.calcsize(fmt)
 
-    # Read actual data
-    istream.seek(byte_size_field * field_idx, 1)  # seek forward
-    d = struct.unpack(fmt, istream.read(struct.calcsize(fmt)))
-
-    # Fortran ordering
-    block_field_data = np.reshape(d, field_shape, order="F")
-    return block_field_data
+    istream.seek(byte_offset + byte_size_field * field_idx)
+    data = np.fromfile(istream, "=f8", count=np.prod(field_shape))
```

Review thread on the `np.fromfile` dtype:

> Are you sure you want …

> This module was adapted from a python script that was meant to be manually edited by users, so the endianness is actually hardcoded globally as …

> Good point..
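For context on the endianness discussion: `"=f8"` is NumPy's dtype string for an 8-byte float in *native* byte order, mirroring `struct`'s `"="` prefix. A standalone illustration (example values only, not the PR's module):

```python
import numpy as np

# Six doubles serialized in native byte order.
buf = np.arange(6.0).tobytes()

native = np.frombuffer(buf, dtype="=f8")  # native order, like struct's "="
little = np.frombuffer(buf, dtype="<f8")  # explicit little-endian
big = np.frombuffer(buf, dtype=">f8")     # explicit big-endian

# Reading back with the native order always round-trips.
assert np.array_equal(native, np.arange(6.0))
# Exactly one of the explicit orders matches the native one on any machine.
assert np.array_equal(native, little) != np.array_equal(native, big)
```

Hardcoding `"=f8"` therefore assumes the datfile was written on a machine with the same endianness as the reader, which matches the module-level `ALIGN` convention mentioned above.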
```diff
+    data.shape = field_shape[::-1]
+    return data.T
```

Review thread on the reshape and transpose:

> One trick we've used for hdf5 files has been to create a destination array and read directly into that. To be honest, this did make me stop and think about what I was seeing.

> I'm also not super happy with how this looks.

> I also couldn't find a good one. All I could come up with was doing some kind of …
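The reversed-shape-plus-transpose idiom in the new code is equivalent to the old explicit Fortran-order reshape, and `.T` returns a view rather than a copy. A minimal standalone check (example values, not the PR's data):

```python
import numpy as np

# A flat buffer of 24 doubles, standing in for one field read from a datfile.
flat = np.arange(24.0)
field_shape = (2, 3, 4)

# Old approach: explicit Fortran-order reshape.
old = np.reshape(flat, field_shape, order="F")

# New approach: assign the reversed (C-order) shape, then transpose.
new = flat.copy()
new.shape = field_shape[::-1]
new = new.T  # a view; no data is copied

assert new.shape == (2, 3, 4)
assert np.array_equal(old, new)
```

This works because a C-ordered array of shape `s[::-1]`, once transposed, walks memory in exactly the order a Fortran-ordered array of shape `s` would.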
Review thread on reducing seeks:

> Reducing the number of seeks is really helpful. It might be possible to sort the calls to `get_single_block_of_data` by `offset` and `field_idx` (even sorting by that tuple should work) to make sure we're reading them in the right order, too.

> sounds like a good idea, I'll try that!
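The sorting suggestion can be sketched as follows; the `read_requests` list and its values are hypothetical, not part of the PR:

```python
# Hypothetical sketch of the reviewer's suggestion: sort pending block reads
# by (byte_offset, field_idx) so the file is traversed monotonically forward
# instead of seeking back and forth.
read_requests = [
    (4096, 2),  # (byte_offset, field_idx) pairs, arbitrary example values
    (1024, 0),
    (4096, 0),
    (1024, 1),
]

# Tuples compare lexicographically, so a plain sort orders reads by offset
# first, then by field index within each block.
ordered = sorted(read_requests)
print(ordered)  # [(1024, 0), (1024, 1), (4096, 0), (4096, 2)]
```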
> So I was able to sort by field, but it may not have a huge impact because of how AMRVAC dump files represent data (fields are on the inner loop of the writing routine, so individual fields are never contiguous). I've tried sorting grids too and got no significant gain. The dataset I'm using for benchmarking also happens to use a single "data chunk", so there's no way to measure whether sorting chunks would help. In conclusion, I don't think there's anything more I can do here.