Add TrkStrawHitMC provenance and guard against empty StrawDigiMCs #1291

Merged
merged 13 commits into from
Jul 1, 2024

Conversation

edcallaghan
Contributor

This PR addresses fallout from PR #1245. In particular, that PR implemented functionality which can produce dummy StrawDigiMC objects that contain no useful information. Issue #1272 is the place to discuss structural replacements for that situation, but as it stands these dummy objects can be produced and must be accounted for in downstream analysis, which is partially addressed and partially enabled here. This PR enables the use of the SelectRecoMC module on events which contain StrawDigis of purely external origin.

Changes to existing data structures:
-TrkStrawHitMC: a provenance data member is added, in sync with and acting to the same effect as the newly-introduced provenance data member of StrawDigiMC. This makes it possible to guard against inspecting the underlying information of a TrkStrawHitMC associated with a StrawDigiMC for which no meaningful truth information exists. This is not only desirable for correct analysis, but structurally necessary to short-circuit dereferences of invalid pointers.

Changes to existing functionality:
-SelectRecoMC: Loops over StrawDigiMCs which may attempt to inspect the truth information contained therein now check the provenance of the StrawDigiMC and do not proceed with such inspection if the provenance is External. TrkStrawHitMC objects which are instantiated inherit their provenance from the preexisting StrawDigiMC.
-TrkMCTools::findMCTrk: A loop over StrawDigiMCs now checks the provenance field before inspecting truth information, as above. This is a situation where it appears that one STL vector is being constructed in sync with another, but such synchronization is not actually assumed or required anywhere.
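
The guard described in the list above can be sketched roughly as follows. This is a minimal illustration with hypothetical stand-in types (`Provenance`, `ToyStrawDigiMC`, `ToyTrkStrawHitMC`, `makeHitMCs`), not the actual Mu2e classes or the code in this PR: the hit-level object always inherits the digi's provenance, and the truth payload is only read when the provenance is not External.

```cpp
#include <vector>

// Hypothetical stand-ins; the real Mu2e classes carry much more state.
enum class Provenance { unknown, Simulation, Mixed, External };

struct ToyStrawDigiMC {
  Provenance provenance;
  double     truthTime;  // only meaningful when provenance != External
};

struct ToyTrkStrawHitMC {
  Provenance provenance = Provenance::Simulation;
  double     truthTime  = 0.0;
};

// Build one hit-level MC object per digi. The provenance is always copied;
// the truth payload is only read when the digi actually carries one.
std::vector<ToyTrkStrawHitMC> makeHitMCs(const std::vector<ToyStrawDigiMC>& digis) {
  std::vector<ToyTrkStrawHitMC> hits;
  hits.reserve(digis.size());
  for (const auto& digi : digis) {
    ToyTrkStrawHitMC hit;
    hit.provenance = digi.provenance;
    if (digi.provenance != Provenance::External) {
      hit.truthTime = digi.truthTime;
    }
    hits.push_back(hit);
  }
  return hits;
}
```

Downstream code can then test `hit.provenance` before touching any truth field, instead of relying on the payload being valid.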

@FNALbuild
Collaborator

Hi @edcallaghan,
You have proposed changes to files in these packages:

  • MCDataProducts
  • CommonMC
  • TrkDiag

which require these tests: build.

@Mu2e/fnalbuild-users, @Mu2e/write have access to CI actions on main.

⌛ The following tests have been triggered for 5ceb93b: build (Build queue has 4 jobs)

About FNALbuild. Code review on Mu2e/Offline.

@FNALbuild
Collaborator

☔ The build tests failed for 5ceb93b.

Test Result Details
test with Command did not list any other PRs to include
merge Merged 5ceb93b at 684bac4
build (prof) Log file. Build time: 04 min 11 sec
ceSimReco Log file. Return Code 1.
g4test_03MT Log file.
transportOnly Log file.
POT Log file.
g4study Log file.
cosmicSimReco Log file.
cosmicOffSpill Log file.
ceSteps Log file.
ceDigi Log file.
muDauSteps Log file.
ceMix Log file.
rootOverlaps Log file.
g4surfaceCheck Log file.
FIXME, TODO 🔶 TODO (1) FIXME (2) in 3 files
clang-tidy 🔶 0 errors 527 warnings
whitespace check no whitespace errors found

N.B. These results were obtained from a build of this Pull Request at 5ceb93b after being merged into the base branch at 684bac4.

For more information, please check the job page here.
Build artifacts are deleted after 5 days. If this is not desired, select Keep this build forever on the job page.

@kutschke
Collaborator

@edcallaghan Does this need to be tested with your PR in Production? or are the failures a different issue?

@kutschke kutschke self-assigned this Jun 25, 2024
@FNALbuild
Collaborator

📝 The HEAD of main has changed to 825dc59. Tests are now out of date.

@edcallaghan
Contributor Author

edcallaghan commented Jun 25, 2024

The failure is unrelated to the PR in Production. This has something to do with the root dictionaries (I think) for a struct which has been extended in this PR. I didn't see any equivalent failure in my local testing, so I'll have to look into it.

Collaborator

@brownd1978 brownd1978 left a comment

The intent here is clear, I suggest a few minor cleanups.

CommonMC/src/SelectRecoMC_module.cc Outdated
CommonMC/src/SelectRecoMC_module.cc Outdated
CommonMC/src/SelectRecoMC_module.cc Outdated
@@ -95,6 +104,7 @@ namespace mu2e {
float _wireTau; // threshold cluster distance to the wire along the perpendicular particle path
float _strawDOCA; // signed doca to straw
float _strawPhi; // cylindrical phi from -pi to pi with 0 in Z direction
TrkStrawHitProvenance _provenance; // TODO default read value == Sim?
Collaborator

I support a default value, perhaps 'unknown' to flag errors downstream

Collaborator

The EnumToStringSparse class template requires that the detail class contain an enumerator named "unknown", and it default-constructs its objects to have a value of "unknown".

You will also need to add the implementation for the two functions declared in the detail class.

I suggest you move TrkStrawHitProvenance to its own .hh and .cc, then use them in KalSeedMC.
If you prefer to keep them in KalSeedMC, then create KalSeedMC.cc to hold the missing implementation.

Collaborator

Of course, making the initialization manifest does make the code more readable.

Collaborator

... or add a comment that it defaults to unknown.

Contributor Author

@edcallaghan edcallaghan Jun 25, 2024

A default constructor is now explicitly defined for the TrkStrawHitMC struct, which initializes the provenance to unknown.

Contributor Author

The default provenance has been updated to Simulation, so that existing simulation will be deserialized correctly.

Contributor Author

@kutschke The common DigiProvenance now inhabits a standalone src/inc pair.

Re default values: maybe I misunderstand your commentary, but the discussion here is about how to correctly initialize the provenance data member of TrkStrawHitMC when deserializing preexisting simulation in which the provenance field did not exist. Since the default value for the actual provenance enum will be unknown, as you say, this needs to be explicitly overridden in the TrkStrawHitMC constructor to mark the preexisting simulation as having been produced as such.

// if mc info is not meaningful, then skip this digi.
// this looks sketchy, but nowhere is an implicit association
// between the sct and mcdigis collection actually assumed
if (mcdigi.provenance() == StrawDigiProvenance::External){
Collaborator

Please restructure logic to avoid 'continue'

Contributor Author

Done.

  - change default provenance to Simulation, to accommodate presimulated data
  - finish implementation of associated provenance class
  - squash SimParticle index bound calculation in SelectRecoMC
Collaborator

@kutschke kutschke left a comment

A few followups

@@ -95,6 +105,7 @@ namespace mu2e {
float _wireTau; // threshold cluster distance to the wire along the perpendicular particle path
float _strawDOCA; // signed doca to straw
float _strawPhi; // cylindrical phi from -pi to pi with 0 in Z direction
TrkStrawHitProvenance _provenance; // origin/validity of MC info object
Collaborator

If it's possible to do the default initialization here rather than in the c'tor initializer list, that would be better. I don't know if that's possible.

Contributor Author

AFAIK the default values have to be designated in the constructor (maybe in the definition for argument-supplied fields, but definitely in the initializer list otherwise).

Collaborator

What I was getting at is that for sufficiently simple types, you can initialize it in the same line as the declaration

_provenance = (static_cast(DigiProvenanceDetail::Simulation);

Then it is implicitly present in the initializer list of all c'tors. What I am not sure about is whether this type is simple enough. I bet it will be if you use the typedef I suggested in my recent review.
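
As a rough illustration of the point about initializing on the declaration line, here is a minimal sketch using C++11 default member initializers. The names `ToyProvenance` and `ToyHitMC` are hypothetical stand-ins, and a plain scoped enum is used in place of the real provenance type. Whether the real type is "simple enough" for this, and how it interacts with ROOT deserialization, is not settled by this sketch.

```cpp
// Hypothetical stand-in for the provenance type discussed above.
enum class ToyProvenance { unknown, Simulation, Mixed, External };

struct ToyHitMC {
  ToyHitMC() = default;
  // This constructor does not mention provenance, so the default below applies.
  explicit ToyHitMC(float doca) : strawDOCA(doca) {}
  // A constructor that does name the member overrides the default.
  ToyHitMC(float doca, ToyProvenance p) : strawDOCA(doca), provenance(p) {}

  float         strawDOCA  = 0.0f;
  ToyProvenance provenance = ToyProvenance::Simulation;  // default for all c'tors
};
```

Any constructor that omits `provenance` from its initializer list picks up `Simulation`, so the default is stated once, next to the declaration.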

CommonMC/src/SelectRecoMC_module.cc Outdated
@edcallaghan
Contributor Author

@kutschke Sorry, I may have been using a stale page yesterday, which is why I seemed to have ignored your comments! I agree with you that having multiple provenance classes is starting to look like the wrong pattern (and we haven't even gotten beyond tracker yet), so I've consolidated into a single DigiProvenance class. There's an argument that this "sounds" wrong in the TrkStrawHitMC context, but the reality is that it is a one-to-one pairing. @brownd1978 and I chatted a bit about how this will look outside of the tracker, and I think a good option is that we recommend the DigiProvenance class be a common structure for this book-keeping, shared between subsystems.

Collaborator

@brownd1978 brownd1978 left a comment

Thanks a lot for the cleanups; the structure now looks really good. I have one last suggestion, then I think this is good for merging.

TrkDiag/src/TrkMCTools.cc Outdated
Collaborator

@kutschke kutschke left a comment

Thanks for making the changes. It's great that you and Dave are thinking ahead to the other subsystems - I had not got that far yet.

I just noticed one more thing - see the inline comment. Then I think we are done.

MCDataProducts/inc/StrawDigiMC.hh Outdated
Collaborator

@brownd1978 brownd1978 left a comment

Thanks for making the requested changes, this looks good.

@edcallaghan edcallaghan reopened this Jun 27, 2024
@edcallaghan
Contributor Author

FYI, I snuck in one extra provenance check to make the standard Digitize production workflow compatible with mixed samples.

@edcallaghan
Contributor Author

...and also a higher-level but similar-purpose interface KalSeedMC::ContainsSimulation which is necessary to iron out downstream issues in TrkAna. I do not foresee adding any more functionality to this PR.

};
using StringedDigiProvenance = EnumToStringSparse<DigiProvenanceDetail>;

class DigiProvenance: public StringedDigiProvenance{
Collaborator

Three things. First, EnumToStringSparse does not have a virtual d'tor so we should not inherit from it. That's easy to fix, we can give it a virtual d'tor and also add the other 4 rule of 5 functions as =default. ( My style is if the compiler will do the right thing for all rule of 5 functions I just leave them out and add a one line comment to that effect - I see that EnumToStringSparse is so old that the rule of 5 was then the rule of 3.) Go ahead and edit EnumToStringSparse.

The second issue is that the CSAID C++ experts strongly advise against using inheritance when containment will do. We do violate this in some places. I think we are OK here.

Third, I suggest adding
typedef StringedDigiProvenance::enum_type enum_type;

(or, better, the equivalent using declaration).

Then downstream code can avoid needing the DigiProvenanceDetail:: and you may also be able to avoid the static casts. I am not 100% sure that I have all details right here but I am pretty sure I am close.
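
The first point above (a virtual destructor plus the remaining rule-of-five functions as `=default`) might look roughly like the following. This is a sketch with hypothetical classes `ToyEnumBase` and `ToyDerived`; it does not reproduce the real EnumToStringSparse interface.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical base class made safe to inherit from and delete through.
class ToyEnumBase {
public:
  ToyEnumBase() = default;
  explicit ToyEnumBase(int id) : id_(id) {}

  // Virtual d'tor so that deleting through a base pointer is well defined.
  virtual ~ToyEnumBase() = default;

  // Declaring the d'tor suppresses the implicit moves, so the other four
  // rule-of-five functions are restored explicitly as defaults.
  ToyEnumBase(const ToyEnumBase&) = default;
  ToyEnumBase& operator=(const ToyEnumBase&) = default;
  ToyEnumBase(ToyEnumBase&&) = default;
  ToyEnumBase& operator=(ToyEnumBase&&) = default;

  int id() const { return id_; }

private:
  int id_ = 0;
};

// Hypothetical derived class that adds its own state.
class ToyDerived : public ToyEnumBase {
public:
  ToyDerived(int id, std::string label) : ToyEnumBase(id), label_(std::move(label)) {}
  const std::string& label() const { return label_; }

private:
  std::string label_;
};
```

With this in place, a `std::unique_ptr<ToyEnumBase>` holding a `ToyDerived` destroys the derived part correctly, which is the hazard the review comment is pointing at.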

Collaborator

After sleeping on it, there is a different solution that avoids introducing the third type and avoids the issue with inheritance: just use a free function:

using DigiProvenance = EnumToStringSparse<DigiProvenanceDetail>;

bool containsSimulation(  DigiProvenance::enum_type  id ){
   return (  id == DigiProvenance::Simulation || id == DigiProvenance::Mixed );
}
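
For readers without the Mu2e headers at hand, the snippet above can be made self-contained along these lines. `ToyProvenanceDetail` and `ToyEnumWrapper` are hypothetical stand-ins, written under the assumption that the wrapper template inherits from its detail class so that the enumerators are reachable through the alias (as the snippet above implies); the real EnumToStringSparse also provides string conversion, which is omitted here.

```cpp
// Hypothetical detail class: holds only the enumerators.
struct ToyProvenanceDetail {
  enum enum_type { unknown, Simulation, Mixed, External };
};

// Minimal stand-in for an enum-wrapper template that inherits from its
// detail class and default-constructs to 'unknown'.
template <class Detail>
class ToyEnumWrapper : public Detail {
public:
  using enum_type = typename Detail::enum_type;

  ToyEnumWrapper() : id_(Detail::unknown) {}
  ToyEnumWrapper(enum_type id) : id_(id) {}

  enum_type id() const { return id_; }

private:
  enum_type id_;
};

// The alias replaces a dedicated subclass.
using ToyProvenance = ToyEnumWrapper<ToyProvenanceDetail>;

// Free function: true when the object carries at least some simulated truth.
inline bool containsSimulation(ToyProvenance::enum_type id) {
  return id == ToyProvenance::Simulation || id == ToyProvenance::Mixed;
}
```

Because the predicate is a free function, no third type is introduced and the wrapper template does not need to be made inheritable.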

Contributor Author

@edcallaghan edcallaghan Jun 28, 2024

The free-function approach appears to be the path of least resistance --- it is preferable, IMO, to the extra baggage needed to make another distinct class pass muster. (FWIW, I went with inheritance over containment here because otherwise there are extra implementations to add to the list (i.e. forwarding all of the member functions of EnumToStringSparse<>)).

That being said, I'm seeing issues with serialization of that simpler implementation (using DigiProvenance = EnumToStringSparse<DigiProvenanceDetail>) manifesting as data corruption after deserializing, in which the actual enum_type is not being set correctly. Since this should be the simpler scenario, I'm not really sure what's going wrong. I'll have another try at figuring it out, but if I can't then I may need to just push updates to EnumToStringSparse<> to make it "inheritable."

Collaborator

Sorry for the slow response.

Can you point me to your working code that is failing and tell me what the error message is?

Contributor Author

No worries. I have the changes in a branch at https://github.com/edcallaghan/mu2e-Offline/tree/update_TrkStrawHitMC_provenance_free_functions. If you compile that (should work at head) and run the contents of ~ejc3/public/DigiProvenance, you can reproduce for yourself. (This is a pair of jobs exemplified in .../run.sh, in which we first "physics-mix" [pm] a few NoPrimary [np] events, and then "digi-mix" [dm] a few CEs into them [ce].) You can see the error in .../dm-ce.log, the relevant bit pasted here:

---- EventProcessorFailure BEGIN
  EventProcessor: an exception occurred during current event processing
  ---- ScheduleExecutionFailure BEGIN
    Path: ProcessingStopped.
    ---- ProductNotFound BEGIN
      A request to resolve an Ptr to a product containing items of type: mu2e::StrawGasStep with ProductID 2189702261
      cannot be satisfied because the product cannot be found.
      The productGetter was not set -- are you trying to dereference a Ptr during mixing?
      The above exception was thrown while processing module StrawDigiMCFilter/Triggerable run: 1202 subRun: 0 event: 1
    ---- ProductNotFound END
    Exception going through path TriggerablePath
  ---- ScheduleExecutionFailure END
---- EventProcessorFailure END

But: my statement that the problem is in the serialization of the provenance was confused and, I think, incorrect. After another round of less-sloppy testing, the problem presents with the current version in this PR as well.

I suspect that the actual problem is with the construction of a new StrawDigiMC at https://github.com/Mu2e/Offline/blob/main/TrackerMC/src/StrawDigiBundleCollection.cc#L223, ultimately stemming from the same "scope constraint" on use of art data products as I mention in #1290. That has nothing to do with the implementation of DigiProvenance, so I can push the simpler implementation for that here. If I'm right, then another PR will need to be opened which addresses the invalid Ptrs.

Collaborator

I am looking at it now.

Just a wild guess and it's from memory. If you are making two data products, ACollection and BCollection, both in the same module, and if one contains Ptrs into the other, you cannot dereference the Ptrs until the module has returned. So they only work in downstream modules. The reason is that the pointee data product does not exist until after the return (the move and the return are like a database two-phase commit).

Collaborator

Please make dm-ce.fcl readable by me.

ls -l ~ejc3/public/DigiProvenance/dm-ce.fcl
-rw-------+ 1 ejc3 nobody 4177 Jun 30 15:46 /nashome/e/ejc3/public/DigiProvenance/dm-ce.fcl

Collaborator

It looks like my wild guess is irrelevant. The changes to Filters/src/StrawDigiMCFilter_module.cc look simple enough. So I agree it's likely that the bug is earlier in the workflow.

Thanks for making the change to the free function and fixing the typo. My reason for suggesting the free function was the same as in your reply.

Contributor Author

Sorry, ~ejc3/public/DigiProvenance/dm-ce.fcl is now public-readable.

But, that being said, I think the issue here is actually much dumber than what I was getting at. When "digi mixing," I am mixing in the StrawDigiMCs, each of which contains an art::Ptr<SimParticle> into a SimParticleCollection. But I don't actually mix the SimParticleCollection in, so of course the pointer can't be dereferenced.

Collaborator

Are all of the fixes in now?


bool DigiProvenance::ContainsSimulation() const{
DigiProvenanceDetail::enum_type id = this->id();
bool rv = ((id == DigiProvenanceDetail::Simulation) || (id == DigiProvenanceDetail::Simulation));
Collaborator

Is this a typo? Should one of them be Mixed?

Contributor Author

@edcallaghan edcallaghan Jul 1, 2024

Yes and yes --- thank you for catching this!

MCDataProducts/inc/StrawDigiMC.hh Outdated
@kutschke
Collaborator

... and I like that you added ContainsSimulation() - it should reduce pilot error.

… and revert subclassing of EnumToStringSparse<>
@kutschke
Collaborator

kutschke commented Jul 1, 2024

@FNALbuild run build test

@FNALbuild
Collaborator

⌛ The following tests have been triggered for 4ead963: build (Build queue has 4 jobs)

@edcallaghan
Contributor Author

edcallaghan commented Jul 1, 2024

@kutschke To checkpoint the status of this, DigiProvenance is now implemented as an alias of an EnumToStringSparse<> specialization, in the usual pattern, and there are no issues using this for the "intended" workflow of mixing external data with current simulation. So, barring further comments, this is ready to merge.

But something we've learned here is that, because StrawDigiMC contains an art::Ptr<> into a SimParticleCollection, which is itself not considered in the current digi mixing implementation, trying to digi-mix preexisting simulation will not work as expected, because the information in the preloaded StrawDigiMCs is incomplete, as art::Ptr<SimParticle>s cannot be resolved. We'll need to decide on the preferable way to deal with this: either by allowing the full extent of the MC information necessary for complete MC truth inspection to be mixed, or by dropping support for the MC truth information when digi-mixing preexisting simulation.
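
The failure mode described above can be illustrated with a toy model. This is not art's API: `ToyPtr`, `ToyEvent` and `resolve` are hypothetical names, and where art throws a ProductNotFound exception (as in the log earlier in this thread) this sketch simply returns an empty optional. The point is only that a persistent pointer is a (product id, index) pair, and it can be resolved only if the pointee collection is present in the event.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <vector>

struct ToySimParticle {
  int pdgId;
};

// Hypothetical stand-in for a persistent pointer: a product id plus an index.
struct ToyPtr {
  unsigned    productId;
  std::size_t key;
};

// Toy "event": only the collections that were actually read or mixed in.
using ToyEvent = std::map<unsigned, std::vector<ToySimParticle>>;

// Resolve a pointer against the event; empty if the pointee collection is
// absent or the index is out of range.
std::optional<ToySimParticle> resolve(const ToyEvent& event, const ToyPtr& ptr) {
  const auto it = event.find(ptr.productId);
  if (it == event.end() || ptr.key >= it->second.size()) {
    return std::nullopt;
  }
  return it->second[ptr.key];
}
```

Mixing in the digi-level objects without their particle collection corresponds to calling `resolve` on an event that lacks the referenced product id, which is why the two options listed above are either to mix the pointee collection too or to stop relying on the pointer.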

@kutschke
Collaborator

kutschke commented Jul 1, 2024

I think you are saying that you plan to deal with the outstanding issues in a new PR. Is that correct? Will old workflows work correctly in the mean time?

@edcallaghan
Contributor Author

Yes, and yes. No existing workflows are affected, but the projected workflow for mixing simulation onto simulation needs to be adjusted, which can be a separate PR.

@kutschke
Collaborator

kutschke commented Jul 1, 2024

but the projected workflow for mixing simulation onto simulation needs to be adjusted, which can be a separate PR.

That works for me. Presuming the CI comes in cleanly, I will approve and merge.

@FNALbuild
Collaborator

☀️ The build tests passed at 4ead963.

Test Result Details
test with Command did not list any other PRs to include
merge Merged 4ead963 at 465477c
build (prof) Log file. Build time: 04 min 10 sec
ceSimReco Log file.
g4test_03MT Log file.
transportOnly Log file.
POT Log file.
g4study Log file.
cosmicSimReco Log file.
cosmicOffSpill Log file.
ceSteps Log file.
ceDigi Log file.
muDauSteps Log file.
ceMix Log file.
rootOverlaps Log file.
g4surfaceCheck Log file.
FIXME, TODO 🔶 TODO (0) FIXME (9) in 13 files
clang-tidy 🔶 4 errors 1256 warnings
whitespace check no whitespace errors found

N.B. These results were obtained from a build of this Pull Request at 4ead963 after being merged into the base branch at 465477c.


@kutschke kutschke merged commit 34a99ea into Mu2e:main Jul 1, 2024
14 checks passed