Make the I/O functionality use the podio provided functionality #69
Comments
Hi Thomas, thanks for this comprehensive issue. I think in the end it's simpler than the tables make it seem: the UserData functionality can completely replace what is used now for writing out vectors etc. The only thing that I see missing on the podio side is something to allow the reader to ignore certain collections in the event store when writing, as was done here with the KeepDropSwitch, but any implementation/interface for that is fine.
Hi Valentin,
That could be achieved by only registering the collections that should be kept with the writer (
This should be done with #100
For historical reasons the I/O implementations of standalone podio and the one that is present here have diverged a bit, so now they offer in principle the same but still slightly different functionality. I think the framework should use the podio facilities as far as possible, and I originally thought that this would be a somewhat mechanical but in the end straightforward thing to do. However, I have realized that there is a bit more work involved, so I am recording my observations here. In the end, I think that changes to podio are also necessary and that it might be best to first stabilize the interfaces podio offers before we actually start to work on this here.
High level functionality differences
The following table gives an overview of where podio and k4FWCore differ in high level (i.e. user perceivable) functionality:
vector user data
podio: UserDataCollection<T> (since v00-14, compile time limited T)
k4FWCore: DataHandle<std::vector<T>> (dictionary limited T, may fail silently(?) on I/O). There is #25 that might impact the usefulness of this feature(?)

k4FWCore: DataHandle<T> (dedicated handling of ints and floats, but in principle again ROOT dictionary limited)

I am not entirely sure how widespread the usage of these features is throughout the Key4hep components. Hence, it is also hard to gauge whether some of the functionality could be easily removed from k4FWCore (e.g. the possibility to store single int/float values per event). This is something that probably needs discussing.
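To make the "compile time limited T" remark more concrete, here is a minimal sketch of how a collection template can reject unsupported element types at compile time. All names (`isOneOf`, `isSupportedUserType`, `UserDataSketch`) and the list of allowed types are hypothetical and chosen only for illustration; this is not podio's actual implementation or its actual list of supported types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical helper: true if T is exactly one of the Allowed types.
template <typename T, typename... Allowed>
constexpr bool isOneOf = (std::is_same_v<T, Allowed> || ...);

// Hypothetical allow-list (assumption); the real set of types is defined by podio.
template <typename T>
constexpr bool isSupportedUserType =
    isOneOf<T, float, double, std::int32_t, std::int64_t, std::uint64_t>;

// Hypothetical stand-in for a user data collection with a compile-time type check.
template <typename T>
class UserDataSketch {
  static_assert(isSupportedUserType<T>,
                "T is not in the compile-time list of supported types");

public:
  void push_back(const T& value) { m_data.push_back(value); }
  std::size_t size() const { return m_data.size(); }
  const T& operator[](std::size_t i) const { return m_data[i]; }

private:
  std::vector<T> m_data;
};
```

With such a check, `UserDataSketch<float>` compiles, while an unsupported type such as `UserDataSketch<bool>` fails at compile time with the given message. A dictionary-based approach, in contrast, can only detect a missing type when the I/O is actually attempted.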
Technicalities
In k4FWCore the PodioDataSvc handles the actual reading of the collections; it holds a podio::ROOTReader and a podio::EventStore as members that do the heavy lifting in this regard. The k4DataSvc is in essence a very thin wrapper around the PodioDataSvc that exposes the filename(s) as a property to be configured from the options file. The PodioInput algorithm is responsible for actually triggering the reading of the collections (that are specified as a property) in its execute method. For this it just loops over the list of collections to read and calls PodioDataSvc::readCollection. For writing collections there is the PodioOutput algorithm, which basically re-implements the functionality of the podio::ROOTWriter. It holds a KeepDropSwitch to control which collections to actually write to file. In all of this there are a few subtle differences between podio and k4FWCore that make a "trivial" switch to podio facilities impossible. The following table provides a (probably incomplete) overview of them:

podio: EventStore::create and then simply cleared in the event loop after writing.
k4FWCore: EventStore in PodioDataSvc never gets to know about them

podio: ROOTWriter, no outside access
k4FWCore: PodioDataSvc. Possible to get access to it

podio: ROOTWriter::registerForWrite collects a list of collection names to write. Checks in EventStore if collections are actually available before adding them to the list. In event loop simply take this list and write (i.e. set branch addresses) and fill the event data tree.
k4FWCore: PodioOutput::execute get the complete list of collections from PodioDataSvc and check via the KeepDropSwitch which collections to write, before setting branch addresses and filling the event data tree.

podio: UserDataCollections are handled the same as other collections
k4FWCore: DataHandle creates necessary branches as it also has access to the PodioDataSvc (and the event data tree therein). The DataHandle also makes sure to do the proper branch address re-setting.

k4FWCore: PodioOutput writes the options file config into a separate branch of the meta data tree

podio: IReader interface for reading. Separate writer implementations (with equal interfaces)
k4FWCore: IReader interface should enable reading SIO out of the box

In the end, to get everything working the same and using the same facilities, some discussion is required to decide which functionality needs to be supported by podio, which functionality can be built on top of podio here, and, most importantly, what the interfaces have to look like to enable all this functionality.
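The difference between the two ways of selecting collections for writing described above can be sketched with standard library types only. This is a simplified illustration under stated assumptions, not the actual podio or k4FWCore code: `RegisteredWriterSketch`, `KeepDropCommand`, `matches` and `selectCollections` are hypothetical names, and the matching rules (exact name or trailing `*`, later commands override earlier ones, keep by default) are assumptions that may differ from the real KeepDropSwitch.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Approach 1 (registerForWrite-like): names are registered once, up front,
// and only names known to the store are accepted.
class RegisteredWriterSketch {
public:
  explicit RegisteredWriterSketch(std::set<std::string> available)
      : m_available(std::move(available)) {}

  // Returns false and registers nothing if the collection is not available.
  bool registerForWrite(const std::string& name) {
    if (m_available.count(name) == 0) return false;
    m_toWrite.push_back(name);
    return true;
  }

  // In the event loop the fixed list is used as is.
  const std::vector<std::string>& collectionsToWrite() const { return m_toWrite; }

private:
  std::set<std::string> m_available;
  std::vector<std::string> m_toWrite;
};

// Approach 2 (keep/drop-like): take the complete list of collections and
// filter it with an ordered list of keep/drop commands.
struct KeepDropCommand {
  bool keep;            // true = keep, false = drop
  std::string pattern;  // exact name, or a prefix followed by '*'
};

// Simplified matching (assumption): exact match or trailing-'*' prefix match.
inline bool matches(const std::string& pattern, const std::string& name) {
  if (!pattern.empty() && pattern.back() == '*') {
    const auto n = pattern.size() - 1;
    return name.compare(0, n, pattern, 0, n) == 0;
  }
  return pattern == name;
}

// Later commands override earlier ones; unmatched collections are kept (assumption).
inline std::vector<std::string> selectCollections(
    const std::vector<std::string>& all,
    const std::vector<KeepDropCommand>& commands) {
  std::vector<std::string> selected;
  for (const auto& name : all) {
    bool keep = true;
    for (const auto& cmd : commands) {
      if (matches(cmd.pattern, name)) keep = cmd.keep;
    }
    if (keep) selected.push_back(name);
  }
  return selected;
}
```

In this sketch, the result of `selectCollections` could be fed into `registerForWrite` once at initialization, which is roughly the idea raised in the comments of registering only the collections that should be kept with the writer.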