Parallelize reading LORs #23

Open
jacg opened this issue Jul 15, 2022 · 2 comments

Comments

@jacg
Owner

jacg commented Jul 15, 2022

This is the item which was not addressed in the work that closed #18.

After #18, loading the LORs from file takes about 25% of the total reconstruction time on frontend1petalo when the number of cores is not limited. With 25 threads, it takes around 15% of the total time.

Seems unlikely that we'll be able to do much about it, unless the hdf5 crate provides something that helps.

Overall, I suspect that further optimizations with the current design are probably not worth the effort. More significant gains will probably come from switching to data-oriented design (#19) and Apache Arrow/Parquet.
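To put a rough number on "probably not worth the effort": a minimal sketch of the Amdahl's-law bound, assuming the LOR-reading stage is the only thing being optimized and that the percentages above are representative. The function name `max_speedup` is made up for illustration.

```rust
/// Upper bound on the overall speedup (Amdahl's law) if a stage that takes
/// `fraction` of the total run time were made infinitely fast.
/// Hypothetical helper, not part of the project.
fn max_speedup(fraction: f64) -> f64 {
    assert!((0.0..1.0).contains(&fraction), "fraction must be in [0, 1)");
    1.0 / (1.0 - fraction)
}
```

With the read stage at 25% of the total, `max_speedup(0.25)` gives about 1.33; at 15%, `max_speedup(0.15)` gives about 1.18. So even a perfect fix to LOR reading would buy less than a factor of 1.4 overall.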

@jacg
Owner Author

jacg commented Jul 15, 2022

Currently we take thousands of files produced by MC jobs and reconstruct LORs into a single, enormous LORs file. Reading LORs from this single file is the bottleneck.

Reconstructing LORs into a smaller number of separate files, rather than one enormous file, would probably make parallelizing LOR reading very easy.
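A rough sketch of what that could look like with only the standard library, assuming the per-file reader can genuinely run concurrently (i.e. it is not serialized by some library-wide lock). `read_all_parallel` and the `read_one` callback are hypothetical names, standing in for whatever reads the LORs out of a single file.

```rust
use std::path::{Path, PathBuf};
use std::thread;

/// Reads every file in `paths` on its own thread and concatenates the
/// results in the order of `paths`.
/// `read_one` is a hypothetical stand-in for the single-file LOR reader.
fn read_all_parallel<T, F>(paths: &[PathBuf], read_one: F) -> Vec<T>
where
    T: Send,
    F: Fn(&Path) -> Vec<T> + Sync,
{
    // Share the reader by reference so each thread can copy the reference.
    let read_one = &read_one;
    thread::scope(|s| {
        // One scoped thread per file; scoped threads may borrow `paths`.
        let handles: Vec<_> = paths
            .iter()
            .map(|p| s.spawn(move || read_one(p.as_path())))
            .collect();
        // Join in spawn order so the output order matches `paths`.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("reader thread panicked"))
            .collect()
    })
}
```

One thread per file only makes sense for a modest number of files; with many files a bounded thread pool would be the more natural choice.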

@jacg
Owner Author

jacg commented May 11, 2023

The Rust HDF5 crate uses a version of the underlying C library that contains a global lock, so HDF5 calls are serialized within a process and we're limited to reading only one HDF5 file at a time per process. Parallel IO in HDF5 seems to require MPI. Yet another reason to replace HDF5 with some 21st century technology such as Arrow/Parquet.
