-
-
Notifications
You must be signed in to change notification settings - Fork 264
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement optimized support for vector I/O in Subfiling VFD #3896
Conversation
Vector I/O requests are now processed within a single set of I/O call batches, rather than each I/O vector entry (tuple constructed from the types, addrs, sizes and bufs arrays) being processed individually. This allows I/O to be more efficiently parallelized among the I/O concentrator processes during large I/O requests.
3f60cb9
to
dac72d3
Compare
src/H5FDsubfiling/H5FDioc.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in this file mostly just add the size-extending feature for vector I/O to the IOC VFD. The types
parameter is not used here, so that extending feature wasn't added.
src/H5FDsubfiling/H5FDioc_int.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that vector I/O is supported, vector I/O requests could get passed directly to the IOC VFD if there is only 1 subfile since it doesn't make since to split up the I/O requests across subfiles in that case. The change in this file addresses a new problem that came up with the new support for vector I/O.
src/H5FDsubfiling/H5FDioc_threads.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment here about addressing vector I/O requests getting passed directly down to the IOC VFD
@@ -40,7 +40,7 @@ | |||
#define PATH_MAX 4096 | |||
#endif | |||
|
|||
#define DEFAULT_DEFLATE_LEVEL 9 | |||
#define DEFAULT_DEFLATE_LEVEL 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change the compression level from 9 to something a bit less aggressive since the tests were spending ~60% of the time just doing compression and that isn't the main focus for the tests.
@@ -2360,11 +2360,33 @@ main(int argc, char **argv) | |||
if (MAINPROCESS) | |||
puts(""); | |||
|
|||
if (MAINPROCESS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move the block that sets compression up so that the tests run once with compression before setting environment variables and once after setting them. Previously, the compression tests were running twice only after the environment variables had been set.
assert(buf); | ||
|
||
/* Check for overflow conditions */ | ||
if (!H5_addr_defined(addr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the logic between read and write was mostly duplicated, so I moved it into H5FD__subfiling_io_helper
if (file_ptr->fa.require_ioc) { | ||
|
||
bool extend_sizes = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All this logic is now moved into the part of H5FD__subfiling_io_helper
that handles the vector I/O requests.
* Generate the types, addrs, sizes and bufs I/O vectors for | ||
* this I/O request. | ||
*/ | ||
status = generate_io_vectors( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The old init_indep_io
function has been replaced by generate_io_vectors
, which can now handle translating more than one (offset, I/O size, type, buffer) tuple into a set of I/O vectors that spans the subfiles. The new function is also a bit more efficient in that it directly generates the I/O vectors rather than generating lists of offset, size, types, buffer arrays that are used to populate I/O vectors later like before.
*------------------------------------------------------------------------- | ||
*/ | ||
static herr_t | ||
translate_io_req_to_iovec(subfiling_context_t *sf_context, size_t iovec_idx, size_t iovec_len, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much of the logic in this function is unchanged from the previous function it was in (init_indep_io
). The main difference is that the buffer indexing had to change since the I/O vectors used for the final I/O requests are populated directly here rather than generating arrays that the vectors would have later been populated with.
iovec_fill_first(subfiling_context_t *sf_context, int64_t iovec_depth, int64_t target_datasize, | ||
int64_t start_mem_offset, int64_t start_file_offset, int64_t first_io_len, | ||
int64_t *mem_offset_out, int64_t *target_file_offset_out, int64_t *io_block_len_out) | ||
iovec_fill_first(subfiling_context_t *sf_context, size_t iovec_len, int64_t cur_iovec_depth, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in the iovec_fill_
functions below are just changes to the buffer indexing for generating I/O vectors directly.
} | ||
|
||
H5_CHECK_OVERFLOW(size, size_t, int); | ||
if (MPI_SUCCESS != MPI_Bcast(bufs[i].vp, (int)size, MPI_BYTE, 0, file_ptr->comm)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not the focus of this PR but how hard would it be to accumulate these into a single bcast?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, this was mostly just copied over from the previous code, but it could probably be handled better with a derived type instead of multiple bcasts.
if (!extend_types) { | ||
if ((i > 0) && (types[i] == H5FD_MEM_NOLIST)) { | ||
extend_types = true; | ||
type = types[i - 1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems silly to use the type variable for type here when it can only be one thing, though I guess it makes the code more similar to other places. Not that it makes a huge difference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, I've just left this the same since it matches other places and was easier for me to reason about.
Done reviewing, looks good aside from my comments. We probably don't need to implement the more efficient rank 0 bcast right now but we should make a note of it. |
f20ddf9
to
0670176
Compare
@fortnern I believe I've addressed the most important comments from your review. I've added test cases that test for all the cases discussed in the comments; the newly-added max number of subfiles calculation was indeed problematic, but the test passes after your suggested change. The "thin uniform section" calculation wasn't exactly problematic, but I added a test for it anyway just to be sure. I haven't implemented the performance improvement for the rank0 bcast strategy yet, but I can add a comment to the code about that or document it somewhere else if you think that's better. |
I think we can just create an issue for the rank 0 bcast improvement for now |
…#3896) Vector I/O requests are now processed within a single set of I/O call batches, rather than each I/O vector entry (tuple constructed from the types, addrs, sizes and bufs arrays) being processed individually. This allows I/O to be more efficiently parallelized among the I/O concentrator processes during large I/O requests. * Fixed some calculations and add test cases for issues spotted from review * Removed a variable that was compensating for previous miscalculations
* Fix build error on freebsd (#3883) Fixes: checking for config freebsd12.1... no checking for config freebsd... found compiler '/home/svcpetsc/petsc-hash-pkgs/39f577/bin/mpicc' is GNU gcc-9.2.0 compiler '/home/svcpetsc/petsc-hash-pkgs/39f577/bin/mpif90' is GNU gfortran-9.2.0 stdout: .: cannot open ./config/classic-fflags: No such file or directory * Correct CMake command and example packaging (#3888) * Feat: Hashpin sensitive dependencies on GitHub Actions and enable Dependabot to update them monthly (#3892) * feat: hashpin sensitive dependencies on GHAs Signed-off-by: Diogo Teles Sant'Anna <diogoteles@google.com> * feat: enable dependabot for monthly updates on GHA Signed-off-by: Diogo Teles Sant'Anna <diogoteles@google.com> --------- Signed-off-by: Diogo Teles Sant'Anna <diogoteles@google.com> * Some changes to portal links when they could be found on docs.hdfgroup.org, and changed the helpdesk link to help.hdfgroup.org (#3893) * Updated some portal links to go directly to docs.hdfgroup. * Fixed some portal and help desk links * Add variable option syncing for examples (#3885) * Add period(.) at the end of the sentence for consistency. (#3897) * Remove redundant backslash character from comment. (#3899) * Disable doxygen as errors for netcdf (#3900) * disable building doxygen for netcdf test * Doc versions (#3903) * Added missing \since tags to H5D. * Committing clang-format changes * Fixed H5T version info. * Committing clang-format changes * Added missing version info to H5E. * Committing clang-format changes * Added version info to H5F public APIs. * Committing clang-format changes * Added missing H5Z public API version info. * Added missing version info to H5G public APIs * Added missing version info to H5I public API. * Added missing version info to H5 public APIs * Committing clang-format changes * Added missing version info to H5P public APIs * Added missing version info to H5R public APIs * Fix comment error. * Committing clang-format changes --------- Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> * Change Trouble Shooting to Troubleshooting (#3905) * Implement optimized support for vector I/O in Subfiling VFD (#3896) Vector I/O requests are now processed within a single set of I/O call batches, rather than each I/O vector entry (tuple constructed from the types, addrs, sizes and bufs arrays) being processed individually. This allows I/O to be more efficiently parallelized among the I/O concentrator processes during large I/O requests. * Fixed some calculations and add test cases for issues spotted from review * Removed a variable that was compensating for previous miscalculations * Add 'warning density' computation to the warnhist script (#3910) * Add 'warning density' computation to the warnhist script, along with several cleanups to it. Add "--enable-show-all-warnings" configure (and CMake) option to disable compiler diagnostic suppression (and therefore show all the otherwise suppressed compiler diagnostics), disabled by default. Clean up a buncn of misc. warnings. Signed-off-by: Quincey Koziol <qkoziol@amazon.com> * Added H5Fdelete_f with test (#3912) * New Fortran Examples added (#3916) * added subfiling example * Added filtered writes with no selection example * Version and space corrections. * Restore H5_VERSION definition in configure.ac. * renamed defined H5_VERS* to avoid conflicts (#3926)
Vector I/O requests are now processed within a single set of I/O call batches, rather than each I/O vector entry (tuple constructed from the types, addrs, sizes and bufs arrays) being processed individually. This allows I/O to be more efficiently parallelized among the I/O concentrator processes during large I/O requests.