-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option to limit number of BF estimators #47
Conversation
A few things to do before we can merge:
|
Hi Luke, I have made a few of the changes you have suggested. The code, for the most part, appears ready to merge into the main branch; once I have implemented the size_t changes, I agree some performance testing should be done. However, I have a question. For some reason, the code seemed to fall over on one of the Kilonova tests. However, when I reran the tests, it passed without causing any errors. Is this behaviour expected? |
We need to add a new test for this functionality to ensure that it doesn't get broken in the future. I suggest just making a slightly modified version of the nebularonezone test. Do you want me to do that part or do you want to try it? |
Yes, I have seen this before on the kilonova test. I'm not sure exactly how it happens, but I suspect that we occasionally get assigned a runner with a slightly different CPU or software environment and this changes the checksums. Re-running makes sense in that situation. |
I'm happy to try to make a new nebularonezone test that tests the functionality, as it's probably valuable for me to be a bit more competent at the code development side of things. |
Ok, great. It's currently only in the develop branch, but take a look at how kilonova_2d_2dgrid_expansionopac is derived from kilonova_2d_2dgrid for an example. |
- Add VPKT_WRITE_CONTRIBS option to save all virtual packets contributions for later time and wavelength binning, with emission/absorption types by artistools. - Improve performance by enabling fast-math compiler options, constexpr radfield bins, improving bound-free estimator accumulation, removing storage for empty cells, and using node shared memory for initmassfracstable, elem_meanweight. - 3D AD2 (Shingles et al. 2023) 1e8pkt 1920 core JUWELS runtime decreased from 210k core hrs to 134k core hrs (1.56x faster). Memory per core decreased from 2.8GB to 1.5GB. - W7 (Shingles et al. 2022) nebular 960 core 2e9pkt 150-410d JUWELS runtime decreased from 13.6k core hrs (20230811 version) to 10.4k core hrs (1.31x faster). Compared to the 2022 published version with 18.6k core hrs, this is 1.79x faster. - With a Simpson rule integrator in place of GSL's adaptive method (producing identical observables) runtime decreases to 6.83k core hours, 1.99x faster than 20230811 and 2.72x faster than the 2022 published version. - subMch Shen+2018 nebular 960 core 1e9pkt 150-410d runtime decreased from 13.14k core hrs (20210417 version) to 7.30 kCore hours (1.80x faster). With Simpson rule integrator, runtime decreased to 6.07 k core hrs (2.16x faster) - sim2010 classic mode 2-120 days 2e7 960 core JUWELS runtime decreased from 3.81k core hrs (20230526 version) to 1.87k core hrs (2.04x faster). Compared to ARTIS classic with 11.45k core hrs, this version is 6.13x faster. - Reduce memory per core to enable future GPU support. All threads now share a single cell cache and operate on packets within the same cell. This may later be expanded to multiple cell caches to increase thread occupancy as memory allows. - Fix virtual packets for models with internal empty cells. Passing into an empty cell previously triggered instant escape. - Fix a 1e5 factor in free-free opacity for opacity case 5 (Tanaka Ye-dependent grey mode and line-by-line non-grey mode for kilonovae). This increased the 3D AD2 (Shingles et al. 2023) early luminosity by about 20%. The spectra are almost unchanged (except for being brighter) - Reduce noise in particle deposition rate estimators by contributing partial deposition of packet energy prior to Monte Carlo transition event (similar to gamma deposition rates from path estimators). - Add support for calculating and using expansion opacities. - Add option for #52 by Gerrit Leck - #47 by Josh Pollin - Add some fixes to hybrid LTE-NLTE mode by Christine Collins --------- Co-authored-by: gleck97 <86471143+gleck97@users.noreply.github.com> Co-authored-by: Gerrit Leck <gleck@lxbuild07.gsi.de> Co-authored-by: Josh Pollin <joshuapollin222@gmail.com> Co-authored-by: Christine Collins <ccollins22@qub.ac.uk>
No description provided.