Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce celer-sim memory usage and improve its CELER_LOG transport output #1550

Merged
merged 13 commits into from
Jan 6, 2025

Conversation

stognini
Copy link
Member

@stognini stognini commented Dec 20, 2024

For large simulations the SimulationResult::events (which is a std::vector<TransporterResult>) continuously grows in-memory, being flushed to disk only at the end of celer-sim. This can lead to huge memory usage---in one test, celer-sim reached >25 GB. Another effect that slows down the run and clutters the I/O is the CELER_LOG message issued at every new event. To address these, this PR adds two options to RunnerInput:

  • New bool transporter_result, which only accumulates event data on request.
  • New size_type print_progress, which if set to N (N>0), will only issue the CELER_LOG message after N events.

@stognini stognini requested a review from sethrj December 20, 2024 17:58
@stognini stognini added enhancement New feature or request app Application front ends labels Dec 20, 2024
Copy link

github-actions bot commented Dec 20, 2024

Test summary

 4 215 files   6 481 suites   15m 2s ⏱️
 1 669 tests  1 663 ✅  6 💤 0 ❌
22 844 runs  22 764 ✅ 80 💤 0 ❌

Results for commit 013edc0.

♻️ This comment has been updated with latest results.

@stognini stognini requested a review from amandalund December 20, 2024 19:39
Copy link
Member

@sethrj sethrj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good thought: we definitely don't want people with 20GB memory usage by accident 😬

app/celer-sim/Transporter.hh Outdated Show resolved Hide resolved
app/celer-sim/Transporter.cc Outdated Show resolved Hide resolved
app/celer-sim/Transporter.cc Outdated Show resolved Hide resolved
app/celer-sim/RunnerInputIO.json.cc Outdated Show resolved Hide resolved
app/celer-sim/celer-sim.cc Outdated Show resolved Hide resolved
@pcanal
Copy link
Contributor

pcanal commented Dec 20, 2024

being flushed to disk only at the end of celer-sim.

Why do we not flush it every events? (I.e. store in the result in a TTree/RNTuple, this seems to be the typical use case for that)?

@stognini
Copy link
Member Author

stognini commented Dec 21, 2024

Why do we not flush it every events? (I.e. store in the result in a TTree/RNTuple, this seems to be the typical use case for that)?

Oh, the ROOT output is fine. This is dumped at the end via JsonPimpl, at celer-sim.cc::274

output->output(&cout);

and is necessary if CELERITAS_USE_ROOT=OFF.

app/celer-sim/celer-sim.cc Outdated Show resolved Hide resolved
app/celer-sim/Transporter.cc Show resolved Hide resolved
app/celer-sim/Transporter.cc Outdated Show resolved Hide resolved
app/celer-sim/Transporter.cc Outdated Show resolved Hide resolved
app/celer-sim/Transporter.cc Outdated Show resolved Hide resolved
app/celer-sim/Transporter.cc Outdated Show resolved Hide resolved
Copy link
Member

@sethrj sethrj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last change I think :)

app/celer-sim/celer-sim.cc Outdated Show resolved Hide resolved
Copy link
Contributor

@amandalund amandalund left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more from me too :) thanks @stognini!

app/celer-sim/celer-sim.cc Outdated Show resolved Hide resolved
@pcanal
Copy link
Contributor

pcanal commented Dec 23, 2024

Oh, the ROOT output is fine. This is dumped at the end via JsonPimpl, at celer-sim.cc::274

So turning on ROOT 'solves' the problem. The issue is only if the output is JSON. correct? If it is correct, do we really store 25GB in a text file !? (humm okay that was the total memory, was it the file size then?)

@stognini
Copy link
Member Author

So turning on ROOT 'solves' the problem. The issue is only if the output is JSON. correct? If it is correct, do we really store 25GB in a text file !?

Using ROOT wouldn't solve it, since we'd end up having both ROOT and json outputs. So I proposed the option to minimize the json info, which is also useful if you just want to time the execution.

I did not check how large a final file would have been since I did not worry about directing the cout to a file. But when the 25-ish GB run finished, it took probably 10+ minutes to flush the result to the terminal...

@pcanal
Copy link
Contributor

pcanal commented Dec 23, 2024

, it took probably 10+ minutes to flush the result to the terminal...

Yikes :(

Using ROOT wouldn't solve it, ...So I proposed the option to minimize the json info

Indeed text based information should be minimized. What kind of information is in the json file? What information is being dropped? What information is being kept? How much really needs to be in a file that is directly human? What is the maximum sizes where the JSON is still actually readable? (few MB?). How much of the information that is not strictly needed in the human readable file can be stored in (yet another) ROOT file?
(sorry if there is already answers to some of those questions in other places)

@sethrj
Copy link
Member

sethrj commented Dec 24, 2024

@pcanal This is the "diagnostic" JSON output which wasn't meant to have much data at all. It's primarily meant to help us analyze performance data and do simple postprocessing. Because it's performance related, we want it to be easy/automatic to stand up, which rules ROOT out. For output that scales with increasing problem difficulty (e.g. the transport results) maybe we should hardcode/use a backdoor environment variable for the maximum number of iterations before we stop saving and writing.

@amandalund
Copy link
Contributor

@stognini out of curiosity, how many events were you running and was this with both write_track_counts and write_step_times off?

One final modification you could make in the Transporter is to only allocate memory for these guys if store_track_counts_ is true.

@stognini
Copy link
Member Author

@stognini out of curiosity, how many events were you running and was this with both write_track_counts and write_step_times off?

If I recall correctly that was a 10M events, 1 primary each. And yes both of those options were off. I turned off all json options I could, related to writing info.

@pcanal
Copy link
Contributor

pcanal commented Dec 26, 2024

Because it's performance related, we want it to be easy/automatic to stand up, which rules ROOT out.

humm ... 'easy/automatic' is clearly doable in ROOT (delta the learning curve). Things that a human readable (json) format would give you are ability to grep through random-ish text, quick visual inspection (vi/less) [albeit if was remembers the structure RDataFrame::Display might work as well). and web display (github actions). Most of which disappear if the file is too large.

Example of 'quick' textual display of a ROOT file content:

root.exe -b -l tutorials/hsimple.root -e 'ROOT::RDataFrame df(*ntuple); df.Range(100,200).Display({"px","py"},100)->Print();' -q | less

See also RDataFrame's user manual and documentation. (This is a bit more typing but much clearer/flexible that the previous TTree::Scan version).

maybe we should hardcode/use a backdoor environment variable for the maximum number of iterations before we stop saving and writing.

This makes some sense. (Although the next request is also a start (i.e. capture from the 100th events to the 150th events instead of the first 50 events).

@sethrj
Copy link
Member

sethrj commented Dec 26, 2024

I mean easy for other people to install, or (like our external) automatic.

@sethrj
Copy link
Member

sethrj commented Jan 2, 2025

@stognini @pcanal I think we got all the major kinks worked out, please take a look at the discussions and resolve as needed.

@stognini
Copy link
Member Author

stognini commented Jan 3, 2025

Thought I answered this thread, just to notice that I probably never hit the comment button during the holidays:

Example of 'quick' textual display of a ROOT file content

For the celer-sim I ruled out any textual output that we get from ROOT, despite it having some nice features (such as TBufferJSON::ConvertToJSON) to avoid making it a necessary dependency for I/O, since we want to be able to test it on any system (e.g. summit's powerpc was a limitation).

@amandalund
Copy link
Contributor

@stognini one last update that would also slightly improve the memory usage:

One final modification you could make in the Transporter is to only allocate memory for these guys if store_track_counts_ is true.

@sethrj sethrj enabled auto-merge (squash) January 6, 2025 20:35
@sethrj sethrj merged commit d743f0b into celeritas-project:develop Jan 6, 2025
35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
app Application front ends enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants