Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ROOT-based event exporter #900

Merged
merged 28 commits into from
Aug 24, 2023

Conversation

stognini
Copy link
Member

@stognini stognini commented Aug 17, 2023

This PR adds the option to export Celeritas' primaries to a ROOT file and read those primaries back into Celeritas.

  • RootOffloadWriter writes data to a ROOT file using basic types to avoid the use of dictionaries.
  • RootOffloadReader reads back the data in an event-by-event basis.

The reader can be used by a future class RootPrimaryGenerator final : public G4VPrimaryGenerator { };.

The use of a ROOT file considerably decreases the output file size, as expected. For the standard celer-g4 test using celeritas/app/data/ttbarbits-12evt-118prim.hepmc3:

1.9M  simple-cms.offloaded.hepmc3
357K  simple-cms.offloaded.root

- Missing PDG from writer; EventReader has to be updated for that
- Revert back to 1 root entry == 1 primary
- celer-g4 has test code that has to be cleaned
- Refactor class names
- Move to accel/detail
- Writer uses a vector<Primary> as input
- WIP: refactor reader to return vector<Primary> from ROOT
@stognini stognini requested review from sethrj and pcanal August 17, 2023 20:56
@stognini stognini added enhancement New feature or request io labels Aug 17, 2023
// See the top-level COPYRIGHT file for details.
// SPDX-License-Identifier: (Apache-2.0 OR MIT)
//---------------------------------------------------------------------------//
//! \file accel/detail/RootOffloadReader.hh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be celeritas/io/EventReader since we will need to use it from the celer-sim app.

*
* One TTree entry represents one primary.
*/
class RootOffloadWriter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise since this class doesn't depend on anything in accel and we can use it for writing out whatever primaries, put this in celeritas/io as RootEventWriter. The only thing that OffloadWriter did was to wrap a class for thread safety.

{
CELER_EXPECT(!primaries.empty());

std::scoped_lock{write_mutex_};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should remove this. The OffloadWriter is the only thing that needs to be thread safe. If we need thread safety outside what ROOT provides we can use the existing OffloadWriter and make its EventWriter a generic std::function<void(vector<Primary>)>. Also note that the way the reader is structured, you MUST lock out the file when writing an event, otherwise primaries from different events could be interleaved.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be sure I got your point: so you're going to lock the operator() operation itself, whether that's a HepMC3 or ROOT writer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now the OffloadWriter in accel has a single purpose: to wrap an EventWriter with a thread safe lock. I think we'll use that same mechanism to wrap the RootWriter, since for the reading to work, we have to write all tracks consecutively for a single event.

src/celeritas/ext/RootUniquePtr.root.cc Show resolved Hide resolved
src/celeritas/ext/RootUniquePtr.root.cc Outdated Show resolved Hide resolved
{
return root_offload_writer_;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete all this code; we should be using the same offload writer interface: either HepMC3 or ROOT, we don't need separate copies of both. I just wanted you to add the RootEvent{Reader,Writer} class and I was going to add in a separate PR the extra bit of std::functional and "make a different writer based on the output extension" code needed to hook it in.

- Refactor RootOffload{Writer, Reader} -> RootEvent{Writer, Reader}
- Move to celeritas/io
- Remove root offload option from SharedParams/SetupOptionMessenger

if (CELERITAS_USE_ROOT)
list(APPEND PRIVATE_DEPS ROOT::Tree)
endif()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be reverted since the ROOT code is no longer in accel.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stognini You might have missed this comment earlier?

@@ -112,6 +112,8 @@ struct SetupOptions
std::string physics_output_file;
//! Filename to dump a HepMC3 copy of offloaded tracks as events
std::string offload_output_file;
//! Filename to dump a ROOT copy of offloaded tracks as events
std::string offload_root_output_file;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this; we'll reuse offload_output_file and switch format based on the file extension.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this one?

Copy link
Member Author

@stognini stognini Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh crap. Thanks for noticing this. I checked out back a few files from a previous commit just to test a celer-g4 run and undid my fixes. I'll clean this up.

@@ -16,8 +16,7 @@ namespace celeritas
{
//---------------------------------------------------------------------------//
/*!
* Call `TObject->Write()` before deletion. Used by TFile and TTree writer
* classes.
* Call `TObject->Write()` before deletion.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a good place to mention that Write doesn't do what one would expect 😒

src/celeritas/ext/RootUniquePtr.root.cc Show resolved Hide resolved

// Set the registered ID of the particle
primary.particle_id = particle_id;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the use case for having an event reader that can't read the particle type. What can we do with primaries that are unknown particles? In all of the applications I can imagine, we'll be loading events into a simulation (or a test) that defines particle types already. I think the changes to EventReader.* should be reverted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stognini I think you missed this comment too.

src/celeritas/io/RootEventReader.cc Show resolved Hide resolved
*/
auto RootEventReader::operator()() -> result_type
{
CELER_EXPECT(entry_count_ <= num_entries_);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add another ScopedRootErrorHandler here too.

src/celeritas/io/RootEventReader.cc Outdated Show resolved Hide resolved
Comment on lines +74 to +98
//---------------------------------------------------------------------------//
/*!
* Helper function to fetch leaves.
*/
template<class T>
auto RootEventReader::from_leaf(char const* leaf_name) -> T
{
CELER_EXPECT(ttree_);
auto const leaf = ttree_->GetLeaf(leaf_name);
CELER_ASSERT(leaf);
return static_cast<T>(leaf->GetValue());
}

//---------------------------------------------------------------------------//
/*!
* Helper function to fetch leaves containing `std::array<double, 3>`.
*/
Real3 RootEventReader::from_array_leaf(char const* leaf_name)
{
CELER_EXPECT(ttree_);
auto const leaf = ttree_->GetLeaf(leaf_name);
CELER_ASSERT(leaf);
CELER_ASSERT(leaf->GetLen() == 3);
return {leaf->GetValue(0), leaf->GetValue(1), leaf->GetValue(2)};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on how often we write a non-dictionary ROOT file it might be h

class RootLeafReader
{
explicit RootLeafReader(TTree* tree);

template<class T>
void operator()(char const* leaf_name, T* value) { /* default implementation */ }

template<class T, class N>
void from_leaf(char const* leaf_name, Array<T, N>* value) { /* call GetValue on each item */ }

template<class T>
void from_leaf(char const* leaf_name, OpaqueId<T>* value) {...}

so that you can make a local RootLeafReader read_leaf{ttree_.get()}; read_leaf("pos", p.pos); etc.

Probably not worth it for this small file but something to keep in mind.

@sethrj
Copy link
Member

sethrj commented Aug 19, 2023

@stognini I refactored the event IO test (originally used for HepMC3 only) so that it can be reused for testing the ROOT reader and writer. Please take a look at the output: I think we're missing (1) the check to ensure that all primaries being written at once have the same event ID and (2) resetting the track IDs when reading back in. In fact I think we can entirely delete the track IDs from the ROOT output since we have to reset them sequentially anyway.

@stognini
Copy link
Member Author

stognini commented Aug 21, 2023

I refactored the event IO test (originally used for HepMC3 only) so that it can be reused for testing the ROOT reader and write

Thanks!

In fact I think we can entirely delete the track IDs from the ROOT output since we have to reset them sequentially anyway.

Agreed. removed track_id branch from ROOT and manually set them during reading.

Now there is one detail that I am not sure how to proceed with the test: The event_id in EventWriter is set manually and sequentially, whereas the RootEventWriter initially simply assigned whatever event_id is stored in celeritas::Primary at the time, leading to the different results shown in the test. I have added an event id counter to the RootEventWriter to manually match the exact behavior of EventWriter. Let me know if that is the way you prefer, or if I should amend the test and keep the Root{reader/writer} as previously (i.e. simply accepting that whatever information in vector<Primary> is correct to begin with).

- Update RootEventWriter constructor to use a SPRootFileManager, as the
  RootFileManager needs to be initialized or have a default constructor.
- Fix missing bits when CELERITAS_USE_ROOT=OFF
@sethrj
Copy link
Member

sethrj commented Aug 22, 2023

The event_id in EventWriter is set manually and sequentially, whereas the RootEventWriter initially simply assigned whatever event_id is stored in celeritas::Primary at the time, leading to the different results shown in the test. I have added an event id counter to the RootEventWriter to manually match the exact behavior of EventWriter. Let me know if that is the way you prefer, or if I should amend the test and keep the Root{reader/writer} as previously (i.e. simply accepting that whatever information in vector is correct to begin with).

One big problem with the previous method is that interleaving the event IDs breaks the reader. Keeping the exact same behavior as the HepMC3 writer limits us, but it ensures no surprises when switching between offload_output_file = "foo.hepmc3"; and offload_output_file = "foo.root";


if (CELERITAS_USE_ROOT)
list(APPEND PRIVATE_DEPS ROOT::Tree)
endif()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stognini You might have missed this comment earlier?

@@ -112,6 +112,8 @@ struct SetupOptions
std::string physics_output_file;
//! Filename to dump a HepMC3 copy of offloaded tracks as events
std::string offload_output_file;
//! Filename to dump a ROOT copy of offloaded tracks as events
std::string offload_root_output_file;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this one?

src/celeritas/io/EventReader.hh Outdated Show resolved Hide resolved

// Set the registered ID of the particle
primary.particle_id = particle_id;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stognini I think you missed this comment too.

src/celeritas/io/RootEventReader.cc Outdated Show resolved Hide resolved
src/celeritas/io/RootEventReader.hh Outdated Show resolved Hide resolved
Copy link
Member

@sethrj sethrj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, sorry this took so long. I'll hook it up to accel next.

@sethrj sethrj merged commit 5e0e42c into celeritas-project:develop Aug 24, 2023
@sethrj sethrj changed the title Add capability to export Celeritas primaries as a ROOT file Add ROOT-based event exporter Aug 24, 2023
@stognini stognini deleted the dump-primaries branch August 24, 2023 13:16
@sethrj sethrj added external Dependencies and framework-oriented features and removed io labels Nov 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request external Dependencies and framework-oriented features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants