Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

39 add surface id member to edm4eictrackpoint #41

Merged
merged 5 commits into from
Aug 31, 2023

Conversation

tylerkutz
Copy link
Contributor

@tylerkutz tylerkutz commented Aug 11, 2023

Briefly, what does this PR introduce?

This adds two uint16_t members to TrackPoint to identify the detector and surface on which the TrackPoint lies (implemented for track projections but could be used more generally).

This change replaces the original change which added a single 32-bit integer. This was done following a discussion in the S&C meeting on August 23, 2023. Summary of proposed options can be found here.

What kind of change does this PR introduce?

@tylerkutz tylerkutz linked an issue Aug 11, 2023 that may be closed by this pull request
@tylerkutz tylerkutz requested a review from a team as a code owner August 11, 2023 20:56
Copy link

@ruse-traveler ruse-traveler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable! Just for my own understanding: this is to help the data model play nicely with ACTS, right? So then each tracking surface you can project to has a unique index, which is recorded by this member?

@tylerkutz
Copy link
Contributor Author

tylerkutz commented Aug 15, 2023

@ruse-traveler The motivation was more to keep track of which calorimeter track projections belong to, but there's no reason it couldn't be used for the tracking detectors as well.

After discussion with Wouter and Dmitry, the plan was to use the detector IDs defined here. At least for the calorimeter projections...

@wdconinc
Copy link
Contributor

Detector IDs don't define surfaces. One detector can have multiple surfaces.

@tylerkutz
Copy link
Contributor Author

For my purposes, a single detector ID is sufficient information. In that case I would simply re-name the member to clarify.

Unless other people have applications that would use this member and actually need to keep track of surfaces (not just detectors).

@veprbl
Copy link
Member

veprbl commented Aug 15, 2023

Perhaps, renaming "surface" -> "system" (to be named after the readout field), that would be helpful enough? Also, it's safe to assume it's uint8_t.

@veprbl
Copy link
Member

veprbl commented Aug 15, 2023

@wdconinc @sly2j What do you think?

@wdconinc
Copy link
Contributor

I think surface is fine. But you have to store a surface ID in it, likely from Acts in tracking. Think of other use cases: how does a dRICH distinguish between the different surface they want to project to (front of radiator, back of radiator, front of gas volume, mirror): you can't just all give them a system value that is the same.

@veprbl
Copy link
Member

veprbl commented Aug 15, 2023

I think, it's not contested that tracking/projection is done via/to points on individual surfaces, not detectors. Perhaps ACTS surface id is a way to go, however I am not sure how those are assigned for surfaces that are created dynamically. I don't think it's obvious that their creation should be moved to geometry (don't we want to be able to configure depth and what not in the algorithm?). If that's resolvable, the mapping of ACTS surface to DD4hep detector/readout should be trivial to do, I assume.

@tylerkutz
Copy link
Contributor Author

The motivation for this was simply to associate a track projection with a calorimeter so we know which calorimeter clusters to match to the track. For this purpose, it doesn't need to be more complicated than a "detector ID".

Even if this needs to remain a "surface ID" to be generally useful to others, it would not work to require that the projection surfaces be actual detector surfaces. We would like the ability to adjust arbitrarily where the projection surfaces are (to account for e.g. average cluster depth).

@veprbl
Copy link
Member

veprbl commented Aug 15, 2023

The motivation for this was simply to associate a track projection with a calorimeter so we know which calorimeter clusters to match to the track. For this purpose, it doesn't need to be more complicated than a "detector ID".

We are seeking to arrive to a meaningful EDM. Adding fields with a minute needs in mind is known to lead to complications in the long run (bloated, convoluted structures).

One alternative to surface id's (for all I know, they might be memory pointers that can't go into EDM) is to serialize the whole ACTS surfaces and have one-to-one relationships to them.

@jdbrice
Copy link
Contributor

jdbrice commented Aug 16, 2023

Hi @veprbl and @wdconinc I had a discussion with Tyler about this so that we can make some decisions and move forward.

A couple of things:

  • I completely agree with @wdconinc that we need (at least in principle) the ability to have more than one projection per detector
  • We should solve this by doing what was originally discussed with S&C people - to use the 32 bit integer efficiently, so in this case that would mean e.g. 24 bits for a detector id and 8 bits for the surface id. However, we could resize those as details require (detector id currently needs only 8 bits)
  • I share @veprbl concern about using ACTS surface ids and their potential dynamic nature. I think there are several acceptable solutions here, but we do not need to decide now.

I want to strongly push back on this perspective that these are "fields with a minute needs in mind"
As someone who has built large swathes of the software for a running experiment - I can assure you that robust track projections are a crucial piece of info used by almost every other detector subsystem - essentially every calorimeter and every PID detector at a minimum. More often than you might expect this info is even used in physics analysis level decisions.

So lets discuss more, but I want to stress: We can agree on this PR and move forward without having to decide the above specific details, including what exactly the surface id is - I'd like to understand a few more details about the ACTS surface ids (how they are generated and what flexibility we have) before we agree on that. For instance, a sequential surface id would be acceptable for current needs (and simply zero for the general case where we have 1 surface / detector).

@wdconinc
Copy link
Contributor

robust track projections are a crucial piece of info

I think no one is disputing this. But we should think about of how this new surface field allows us to do this better that what is possible right now. Right now, you would have to loop over all TrackPoints in a TrackSegment and check their position at every step, and break on the first TrackPoint in your detector. With the proposed solution, you would loop over all TrackPoints in a TrackSegment and check their surface ID at every step, and break on the first surface ID in your detector. For this change we are adding a new field in a component. We cannot assume that a TrackPoint is on a surface (it is just a point on a fitted track), so the surface field must allow for being unused. Since TrackPoint is a component, it can also not be stored in a collection, but only be part of other datatypes.

If we are interested in evaluating efficiently which tracks have endpoints on a specific calorimeter, then it makes sense to allow this selection without looping over the TrackPoints because the information has been available before. I haven't looked at the TrackProjector in a while (well, I just looked), but in the past the idea was that there would be algorithms that would project tracks to e.g. the calorimeters, and write a single TrackSegment that ends on that calorimeter, as obtained from the Acts geometry in the form of a surface id. The back() of the TrackPoint collection then is on that calorimeter. It changes finding from O(n) to O(1), and removes the need for storing Acts surface ids in the data model.

That's why this needs more discussion, I think.

@tylerkutz
Copy link
Contributor Author

tylerkutz commented Aug 16, 2023

First, even if both require a loop, matching based on a detector ID is an improvement over matching based on position, as communicating a list of defined IDs is much easier and less error-prone than expecting people to have complete knowledge of the detail detector geometries.

Second (which somewhat informs my first point), is that we want this information to be usable outside of EICrecon. Perhaps within EICrecon there are more efficient ways of grabbing a point projected to a certain detector, without a loop. But if I want to hand someone a EICrecon output ROOT file with track projections and ask them to analyze something, this solution is likely not applicable. Detector and/or surface IDs would be very useful in that case.

@tylerkutz
Copy link
Contributor Author

Also, the current implementation of a TrackPropagation_factory (was planning to submit PR soon) has a TrackSegment for each track, with each TrackPoint representing an intersection with a calorimeter. So for each event the projections from all tracks are stored in a TrackSegmentCollection.

@jdbrice
Copy link
Contributor

jdbrice commented Aug 16, 2023

Hi @wdconinc. OK glad to hear that we are in agreement. What I meant was that every track point should have the detector id defined, but in the general case surface id would be zero (unused) as you point out.

I am not sure I understand you second half of message. Are you suggesting we'd have a TrackSegmentCollection for all tracks the end on a specific calorimeter? and a separate collection for each detector? It seems that would be better as you would have things grouped in the most usable way. I don't see how getting a particular track projection is O(1) but maybe you mean effectively since you would use every one as you loop over in a normal use-case, so effectively O(1)?
I am certainly open to any idea that give O(1) lookup

@jdbrice
Copy link
Contributor

jdbrice commented Aug 17, 2023

I agree with the above points made by @tylerkutz and @wdconinc but these are questions that should be discussed on the eicrecon PR, not on this PR. I want to avoid a chicken and egg situation and we need this in so that we can get the EICrecon PR in before the next cutoff.

Based on all of the comments we all agree that we need track projections and that we need an ID associated with them to identify their "location" when used as a projection. There is really only two ways to solve this situation, either the ID is added as we have here, or a new structure is added to provide lookup between the track point and the ID (e.g. an Association type with a oneToMany relationship). I don't think anyone is suggesting the latter here, thus my earlier suggestion to move forward and have additional discussion with the next PRs (on EICrecon).

@veprbl
Copy link
Member

veprbl commented Aug 17, 2023

I did some research on what kind of id an ACTS surface can have. Apparently, it encodes quite specific information about GeometryObject's place in a geometry hierarchy:
https://acts.readthedocs.io/en/latest/core/geometry.html#Acts::GeometryIdentifier
That made me doubt that it could be meaningfully initialized for our ad-hoc surfaces. Looking further into the bit fields, it looks like an extra is allocated for experiment use and even suggested to be used for subsystem id:
https://github.com/acts-project/acts/blob/1c459a94eab4fbe52303bd1ba1437f72b318a1a8/Core/include/Acts/Geometry/GeometryIdentifier.hpp#L59-L60
Given that it's in the least significant byte, that pretty much matches the proposed PRs and @jdbrice's suggestion above. Unless we are allowed to play with other fields (like layer), that still means we only allow ourselves one ad-hoc plane per detector. It is not clear how one could register a surface made with simple Acts::Surface::makeShared to be visible to a ACTS for lookup (https://acts.readthedocs.io/en/latest/api/class/class_acts_1_1_tracking_geometry.html#_CPPv4NK4Acts16TrackingGeometry11findSurfaceE18GeometryIdentifier - this one?). I think, if we add a documentation comment, this is a fine compromise.

Serializing Acts surfaces is not an option, probably. We don't have a way to write out-of-band frames to store collections (or so, I think, we need).

I would agree with @jdbrice that we need to look at the code to be put in EICrecon, but I don't think that's a pass on this review. Would be nice to start a PR for https://github.com/eic/EICrecon/compare/480-writing-out-track-propagation-information, but looking at eic/EICrecon@876d6e5, I have a question regarding the use of the proposed field. So, @tylerkutz, do I understand correctly that your plan is to have a separate factory down the line pick up the projected points and try to match them to some close-by calorimeter cells? I'm asking because if it was the same factory you would not need it, and if it was one projecting factory per calorimeter you also would not need it. Please forgive me if this was already thought of or even discussed.

@jdbrice
Copy link
Contributor

jdbrice commented Aug 17, 2023

@veprbl thanks for the additional information. There is one part I am not following.

"Unless we are allowed to play with other fields (like layer), that still means we only allow ourselves one ad-hoc plane per detector."

I don't see how this follows. Any particular trackPoint can only be associated with one detector + surface yes, but I don't see how the above limits us to only one surface per detector. Can you elaborate?

So if I understand it seems to me the pushback is modifying the existing TrackPoint definition, and trying to come up with a data model that will never change (i.e. if we decide later we need a different type of surface ID).

As I mentioned earlier we have only a few proposed options here:

  • original suggestion: exactly what we have here, i.e. add a member to TrackPoint
  • a new type that contains a TrackPoint (or TrackPoint ID) and a surface ID. In this second case it acts more like an association container.
  • @veprbl also suggested ephemeral track projections, making them on the fly and using them immediately

Regardless of the above, the other point is how do we identify surfaces. So far it sounds like there is really nothing gained (and a lot of complexity added) by trying to use ACTS surfaces. It seems like that leaves only a basic numerical type ID as the only option - no?

@veprbl
Copy link
Member

veprbl commented Aug 17, 2023

"Unless we are allowed to play with other fields (like layer), that still means we only allow ourselves one ad-hoc plane per detector."

I don't see how this follows. Any particular trackPoint can only be associated with one detector + surface yes, but I don't see how the above limits us to only one surface per detector. Can you elaborate?

To put it simply, if you enumerate your surface's with detector id in the LSB and all other bits stay zero, you can only assign only one surface with unique ID to a given detector. If you want to assign a non-unique ID, you'd need to touch other bits, which might (and might not) violate some of ACTS'es internal assumptions (surface belonging to an ACTS layer, belonging to an tracking volume, etc).

So if I understand it seems to me the pushback is modifying the existing TrackPoint definition, and trying to come up with a data model that will never change (i.e. if we decide later we need a different type of surface ID).

off-topic: I would not frame it as primarily a pushback. We are looking to gain some wholistic understanding of how all the reconstruction pieces fit together. Immutability of EDM is not a goal at all. We should be capable to always modify it to serve us better, although there is a penalty in terms of support for every change. That, IMO, slightly outweighs the penalty of an upfront discussion.

As I mentioned earlier we have only a few proposed options here:

  • original suggestion: exactly what we have here, i.e. add a member to TrackPoint
  • a new type that contains a TrackPoint (or TrackPoint ID) and a surface ID. In this second case it acts more like an association container.

Yeah, this a good one, but rightfully not in the top of our list.

  • @veprbl also suggested ephemeral track projections, making them on the fly and using them immediately

My guess is that this is essentially how DRICH dodges this question.

One important piece of context that I've learned today, is that neither edm4eic::TrackPoint nor edm4eic::TrackSegment are referenced any actual tracking code, it's all for RICH support currently. With the scope reduced, it is easier to decide.

Regardless of the above, the other point is how do we identify surfaces. So far it sounds like there is really nothing gained (and a lot of complexity added) by trying to use ACTS surfaces. It seems like that leaves only a basic numerical type ID as the only option - no?

Tentatively, I'm in favor of the current approach. Let's review the EICrecon PR.

@ruse-traveler
Copy link

Hi all,

Thank you for the discussion, and thank you @tylerkutz for the clarification! Based on the discussion, I too am in favor of the current approach. If there are no further objections, I'll approve this PR and we can move the discussion to the EICrecon PR.

@tylerkutz
Copy link
Contributor Author

@ruse-traveler, thanks for following up. I might suggest we wait another day or two before approving the PR...@sly2j asked that we present the proposed change in tomorrow's software meeting.

@tylerkutz
Copy link
Contributor Author

As discussed in the S&C meeting this morning, I have uploaded to the Indico meeting page (in addition to the original presentation) some followup slides summarizing the proposed options and our (Daniel's and my) recommendations for proceeding.

I hope this will start a discussion on converging this issue so we can move forward in a timely fashion.

@wdconinc
Copy link
Contributor

Since the top summary of the PR becomes the commit message, can you edit the top summary to reflect the current consensus and potentially outstanding points of discussion? You can keep the original text but type this summary above it.

…16-bit integers for TrackPoint detector and surface ID
@tylerkutz
Copy link
Contributor Author

tylerkutz commented Aug 25, 2023

@wdconinc I've fulfilled your request, thanks for the suggestion. (And I deleted my previous response, I was not aware that I could commit more changes to the same PR).

@veprbl
Copy link
Member

veprbl commented Aug 30, 2023

I must admit, after the meeting last week it's not clear to me what the resolution was.

The latest version by proponents proposed

uint16_t detectorID;
uint16_t surfaceID;

I have three suggestions:

  • use expected byte widths (you can order by descending size to avoid padding, if so desired)
  • drop ID suffix, as it's not part of our current naming scheme (I guess, exception is cellID, named by DD4hep)
  • rename detector -> system to match the naming convention in the DD4hep readout
uint8_t system;
uint64_t surface;

I think it's not unreasonable to try to finish discussing this before we are to tag for September. I urge people who want to contribute to the discussion to speak up ASAP.

@jdbrice
Copy link
Contributor

jdbrice commented Aug 31, 2023

Hi @veprbl, I am fine with the naming changes suggested. I can do that now - can someone grant me push access @wdconinc @tylerkutz @ruse-traveler?

I think the path forward was pretty clear so I won't comment on that.

"I think it's not unreasonable to try to finish discussing this before we are to tag for September."
Good. As agreed, this MUST be in by the Sept sim campaign (we discussed this point also in the convener meeting). If something needs more discussion I'd be happy to continue this in the next month given that we have this initial version in for the Sept campaign.

@tylerkutz
Copy link
Contributor Author

This proposal is fine with me. I will implement changes now and push them (hopefully within an hour).

@tylerkutz
Copy link
Contributor Author

uint8_t is not allowed. I am proceeding as proposed above but with uint16_t for system.

@wdconinc
Copy link
Contributor

can someone grant me push access

Invited you to eic organization, and when you join you should be member of ePIC Devs, which will give you push access..

@tylerkutz
Copy link
Contributor Author

@veprbl @jdbrice

I've just pushed what we discussed above, changing the type/names of the members to:
uint64_t surface
uint16_t system

I could not use an 8-bit integer for system as uint8_t is not allowed in the data model.

I will push the corresponding changes to the EICrecon track propagator factory in a few minutes.

veprbl
veprbl previously approved these changes Aug 31, 2023
Copy link
Member

@veprbl veprbl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sly2j
Copy link
Contributor

sly2j commented Aug 31, 2023

Hi everyone. Glad to see this making progress. However, I'm a bit confused as to why we now have a 64-bit + a 16-bit member rather than two 16-bit members. What is the reasoning behind this?

@veprbl
Copy link
Member

veprbl commented Aug 31, 2023

64 bit allows to fit the ACTS surface id for use in tracking or RICH projection codes that rely on surfaces defined in the detector geometry.

Copy link
Contributor

@sly2j sly2j left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the changes I pointed out are addressed, we can move on and get the PR merged in.

edm4eic.yaml Outdated Show resolved Hide resolved
@tylerkutz
Copy link
Contributor Author

Just pushed changes requested by @sly2j

@sly2j sly2j merged commit de6035f into main Aug 31, 2023
3 checks passed
@sly2j sly2j deleted the 39-add-surface-id-member-to-edm4eictrackpoint branch August 31, 2023 19:48
@wdconinc
Copy link
Contributor

wdconinc commented Sep 1, 2023

This is schedule to be included in the eic-shell nightly upgrade, https://eicweb.phy.anl.gov/containers/eic_container/-/merge_requests/717.

github-merge-queue bot pushed a commit to eic/EICrecon that referenced this pull request Sep 6, 2023
### Briefly, what does this PR introduce?
This PR introduces a TrackPropagation_factory that uses the
TrackPropagation algorithm to project tracks to calorimeters.

Propagated track points are stored in a TrackSegmentCollection that is
saved in the PODIO output.

Have not yet added association between each edm4eic::TrackSegment and
the underlying edm4eic::Track.

### What kind of change does this PR introduce?
- [X] Bug fix (issue #480)
- [X] New feature (issue #480)
- [ ] Documentation update
- [ ] Other: __

### Does this PR introduce breaking changes? What changes might users
need to make to their code?
Requires new int32_t member to edm4eic::TrackPoint to store
detector/surface ID (int32_t) (see
eic/EDM4eic#41).

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Dmitry Kalinkin <dmitry.kalinkin@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add surface ID member to edm4eic::TrackPoint
6 participants