HDF5 MPI-collective I/O. #309

1uc · 2023-11-10T16:02:27Z

This PR contains the code required to implement the MPI-collective reader. This will likely be moved to its own repository later.

We need the ability to read general hyperslabs into a 2D array.

The point is to split reading of the dataset into a separate function, and then make `resolve` safe for collective IO (assuming the newly introduced function is). The overload for reading a single `nodeID` is removed as it's unused now.

This commit refactors `_readSelection` in such a manner that:

1. Canonical selections don't require post-read shuffling.
2. The reading of the dataset is moved to a separate function.

As a consequence, there's no need for an optimization for `std::string`, since those only matter when reading from the `"@library"`, which should only happen indirectly. Hence, we are free to ensure that those are always canonical.
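The idea behind the refactoring can be sketched in a few lines of Python. This is not the libsonata implementation: it assumes that "canonical" means sorted, non-overlapping, non-empty half-open ranges, it uses a plain list as a stand-in for the HDF5 dataset, and the names `is_canonical`, `canonicalize` and `read_selection` are hypothetical.

```python
from typing import List, Sequence, Tuple

Range = Tuple[int, int]  # half-open [start, stop), non-negative indices


def is_canonical(ranges: Sequence[Range]) -> bool:
    """True if the ranges are non-empty, sorted and non-overlapping (assumed definition)."""
    previous_stop = 0
    for start, stop in ranges:
        if start >= stop or start < previous_stop:
            return False
        previous_stop = stop
    return True


def canonicalize(ranges: Sequence[Range]) -> List[Range]:
    """Drop empty ranges, sort the rest, and merge ranges that overlap or touch."""
    merged: List[Range] = []
    for start, stop in sorted(r for r in ranges if r[0] < r[1]):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], stop))
        else:
            merged.append((start, stop))
    return merged


def read_selection(dataset: Sequence[int], ranges: Sequence[Range]) -> List[int]:
    """Return the selected values in the order the caller asked for them."""
    if is_canonical(ranges):
        # File order equals requested order: read once, no shuffling needed.
        return [x for start, stop in ranges for x in dataset[start:stop]]

    # Otherwise read the canonical form once, then shuffle into the requested order.
    position = {}  # dataset index -> position in the bulk buffer
    buffer: List[int] = []
    for start, stop in canonicalize(ranges):
        for i in range(start, stop):
            position[i] = len(buffer)
            buffer.append(dataset[i])
    return [buffer[position[i]] for start, stop in ranges for i in range(start, stop)]
```

Callers that already hold a canonical selection take the first branch and pay nothing for reordering; only non-canonical selections need the extra bookkeeping.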

This commit introduces the API for an Hdf5Reader. This reader abstracts the process of opening HDF5 files and reading a `libsonata.Selection` from a dataset. The default reader calls the existing `_readSelection`.
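To illustrate the shape of such an abstraction, here is a minimal Python sketch. It is not the libsonata API: the method names `open_file` and `read_selection`, the class `InMemoryReader`, the helper `load_attribute` and the file name are all hypothetical, and a dict of lists stands in for real HDF5 files.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence, Tuple

Range = Tuple[int, int]  # half-open [start, stop)
File = Dict[str, List[float]]  # stand-in for an open HDF5 file: dataset name -> values


class Hdf5Reader(ABC):
    """Hypothetical interface: decides how files are opened and how selections are read."""

    @abstractmethod
    def open_file(self, path: str) -> File:
        ...

    @abstractmethod
    def read_selection(self, dataset: Sequence[float], ranges: Sequence[Range]) -> List[float]:
        ...


class InMemoryReader(Hdf5Reader):
    """Plays the role of a default reader; a collective reader would implement the same two methods."""

    def __init__(self, files: Dict[str, File]) -> None:
        self._files = files

    def open_file(self, path: str) -> File:
        return self._files[path]

    def read_selection(self, dataset: Sequence[float], ranges: Sequence[Range]) -> List[float]:
        return [x for start, stop in ranges for x in dataset[start:stop]]


def load_attribute(reader: Hdf5Reader, path: str, name: str, ranges: Sequence[Range]) -> List[float]:
    """Calling code only talks to the reader interface, never to a concrete backend."""
    file = reader.open_file(path)
    return reader.read_selection(file[name], ranges)
```

Because calling code depends only on the interface, a different reader can be swapped in without touching the code that requests the data.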

This reader implements MPI-collective reading of datasets.

1uc · 2023-12-06T12:51:25Z

Has been moved to a separate repo.

## Context

When using `WholeCell` load-balancing, the access pattern when reading parameters during synapse creation is extremely poor. It is the main reason why we see long (10+ minutes) periods of severe performance degradation of our parallel filesystem when running slightly larger simulations on BB5.

Using Darshan and several PoCs we established that the time required to read these parameters can be reduced by more than 8x and IOps can be reduced by over 1000x when using collective MPI-IO. Moreover, the "waiters" were reduced substantially as well. See BBPBGLIB-1070.

Following those findings we concluded that neurodamus would need to use collective MPI-IO in the future. We've implemented most of the required changes directly in libsonata, allowing others to benefit from the same optimizations should the need arise. See BlueBrain/libsonata#309 and BlueBrain/libsonata#307, and the preparatory work in BlueBrain/libsonata#315, BlueBrain/libsonata#314 and BlueBrain/libsonata#298.

By instrumenting two simulations (SSCX and reduced MMB) we concluded that neurodamus was almost collective. However, certain attributes were read in a different order on different MPI ranks. This may be due to hashes being salted differently on different MPI ranks.

## Scope

This PR enables neurodamus to use collective IO for the simulation described above.

## Testing

We successfully ran the reduced MMB simulation, but since SSCX hasn't been converted to SONATA, we can't run that simulation.

## Review

* [x] PR description is complete
* [x] Coding style (imports, function length, New functions, classes or files) are good
* [ ] Unit/Scientific test added
* [ ] Updated Readme, in-code, developer documentation

---------

Co-authored-by: Luc Grosheintz <luc.grosheintz@gmail.ch>
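The ordering problem described under Context can be illustrated with a small Python sketch. Python salts `str` hashes per process by default, so iterating over a set of strings may yield a different order in each process, which is a problem when every MPI rank must issue the same collective reads in the same sequence. One way to avoid this, shown here as an assumption rather than as the change made in neurodamus, is to sort the names before reading. The function `read_attributes_in_fixed_order` and the attribute names are hypothetical.

```python
from typing import Callable, Dict, Iterable, List


def read_attributes_in_fixed_order(
    names: Iterable[str],
    read_one: Callable[[str], List[float]],
) -> Dict[str, List[float]]:
    """Read each attribute exactly once, in an order that is identical on every rank."""
    result: Dict[str, List[float]] = {}
    # sorted() depends only on the strings themselves, not on their salted hashes,
    # so every process visits the names in the same sequence.
    for name in sorted(set(names)):
        result[name] = read_one(name)
    return result


# Usage: record the order in which reads are issued.
call_log: List[str] = []


def fake_read(name: str) -> List[float]:
    call_log.append(name)
    return [float(len(name))]


values = read_attributes_in_fixed_order({"weight", "delay", "tau"}, fake_read)
```

With a real collective reader, `read_one` would be the collective call, and the sorted iteration ensures that matching calls line up across ranks.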

1uc force-pushed the 1uc/hdf5-collective-reader branch from 248f364 to d6bf8a9 · November 13, 2023 16:35

1uc added 8 commits November 14, 2023 09:58

No 'brew update'.

dccc395

Update HighFive to v2.8.0.

5b803d1

We need the ability to read general hyperslabs into a 2D array.

Refactor edge_index::resolve.

be7ba48

The point is to split reading of the dataset into a separate function, and then make `resolve` safe for collective IO (assuming the newly introduced function is). The overload for reading a single `nodeID` is removed as it's unused now.

Implement Hdf5Reader API and default.

261a317

This commit introduces the API for an Hdf5Reader. This reader abstracts the process of opening HDF5 files and reading a `libsonata.Selection` from a dataset. The default reader calls the existing `_readSelection`.

Implement collective Hdf5Reader.

cc56c89

This reader implements MPI-collective reading of datasets.

fixup: collective hdf5

01f1045

fixup: make collective reader more independent.

8695431

1uc force-pushed the 1uc/hdf5-collective-reader branch from d6bf8a9 to 8695431 · November 14, 2023 13:10

1uc mentioned this pull request Nov 23, 2023

Read synapse parameters in a collective safe manner. BlueBrain/neurodamus#85

Merged

4 tasks

1uc closed this Dec 6, 2023
