Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: interpolation out of bounds with ATLAS B-field map #1484

Closed
timadye opened this issue Sep 1, 2022 · 22 comments
Closed

bug: interpolation out of bounds with ATLAS B-field map #1484

timadye opened this issue Sep 1, 2022 · 22 comments

Comments

@timadye
Copy link
Contributor

timadye commented Sep 1, 2022

Using the propagator with the ATLAS B-field map can give error messages:

TrackFinding ERROR Step failed with MagneticFieldError:1: Interpolation out of bounds was requested

This can give a lot of messages: ~259 MagneticFieldErrors per ttbar+μ=200 event, each of which produces 2 errors, and 4 warnings (1.6 GB logfile for 1000 events)!

This error is due to InterpolatedBFieldMap::getField() called for a position outside the mapped ATLAS B-field

  • The ATLAS B-field is mapped in (-10000,-10000,-15000) to (10000,10000,15000)
  • but I see, calls for e.g. getField({-2849, -2486, 54450})
    • These are presumably for extension of seeds that will eventually be cut or rejected

#1467 removes these error messages by trying to return the edge-bin values instead of giving an error. That is still WiP because I couldn't yet get it to pass the unit tests: seems to work with B=(0,0,0) at the edges (as for ATLAS), but not otherwise.

Even if that can be resolved, the question remains: why is getField() called for points so far outside the detector? @Corentin-Allaire suggested (ACTS-ITk meeting) that the world volume is too large. If so, it may be an ATLAS-specific error, but I hope it's OK to discuss here.

@paulgessinger
Copy link
Member

I think the underlying issue is that the propagation even gets to z=54450. This is outside of the bounds of the tracking geometry, so is presumably a navigation / propagation failure.

@andiwand
Copy link
Contributor

andiwand commented Sep 1, 2022

since it happens in TrackFinding I assume it is due to CKF? I think the CKF does not use the end of world aborter and that might be the reason. the problem in the CKF is that you cannot use classic aborters because they would quite all branches at once. so I think this would be an additional step inside the actor? no clue tho how the particle sneaked through the geometry and where the navigator thinks he is going. since the z component is really large it might have slipped out through the beampipe?

@paulgessinger
Copy link
Member

That's possible yeah. I guess you're right that EndOfWorld can't be used directly, that's a good point.

It might have slipped through beampipe, but I guess the normal abort mechanism would have / should have caught it in this case.

@stale
Copy link

stale bot commented Oct 1, 2022

This issue/PR has been automatically marked as stale because it has not had recent activity. The stale label will be removed if any interaction occurs.

@stale stale bot added the Stale label Oct 1, 2022
@paulgessinger
Copy link
Member

Ok coming back from this: should we discuss this in the developers meeting to try to agree on a solution?

I personally think we should stick to explicit warnings and extend the magnetic field map(s) where this occurs. At the very least we should make this configurable and default to an explicit error.

@andiwand
Copy link
Contributor

since it happens in TrackFinding I assume it is due to CKF? I think the CKF does not use the end of world aborter and that might be the reason. the problem in the CKF is that you cannot use classic aborters because they would quite all branches at once. so I think this would be an additional step inside the actor? no clue tho how the particle sneaked through the geometry and where the navigator thinks he is going. since the z component is really large it might have slipped out through the beampipe?

So I think this issue is related to what I tried to fix here #1454. Currently CKF does not support standard aborters and will propagate particles much further than necessary.

@stale stale bot removed the Stale label Oct 11, 2022
@timadye
Copy link
Contributor Author

timadye commented Oct 11, 2022

I see #1454 has been merged. Thanks!
Unfortunately I get as many
TrackFinding ERROR Step failed with MagneticFieldError:1: Interpolation out of bounds was requested
errors as before.

@paulgessinger
Copy link
Member

Ok. Could you try to get a verbose navigation output when the error occurs? This would confirm if this is in fact due to a navigation failure.

@andiwand
Copy link
Contributor

I see #1454 has been merged. Thanks!
Unfortunately I get as many
TrackFinding ERROR Step failed with MagneticFieldError:1: Interpolation out of bounds was requested
errors as before.

interesting. when I saw getField({-2849, -2486, 54450}) I thought these errors could get caught by the path limit aborter. maybe we would also need an end of world aborter in the CKF?

@stale
Copy link

stale bot commented Nov 12, 2022

This issue/PR has been automatically marked as stale because it has not had recent activity. The stale label will be removed if any interaction occurs.

@stale stale bot added the Stale label Nov 12, 2022
@timadye
Copy link
Contributor Author

timadye commented Jun 14, 2023

I have removed the workaround for this issue from #1467 and confirmed that the errors still occur with the latest main (ITk ttbar+PU200):

19:34:43    Propagator     ERROR     Step failed with MagneticFieldError:1: Interpolation out of bounds was requested

So back to square one. What next?

@andiwand
Copy link
Contributor

From my perspective there are two solutions

  1. Adding an "out of world/detector" aborter to the propagation which has the same bounds as our magnetic field map
  2. Embed the magnetic field map into an otherwise constant field. As soon as we step out of the known field we will force it to zero

@paulgessinger
Copy link
Member

paulgessinger commented Jun 14, 2023

But this is a navigation failure isn't it?. I still don't think we should be hiding this by patching the B field interpolation.

Concretely:

  1. Would result in an abnormal termination which has the same effect as this out of bounds error
  2. Would result in a silent invalid propagation that we wouldn't be able to detect later on, which I think is worse than the warning.

@andiwand
Copy link
Contributor

But couldn't the propagation in principle leave the detector without the track finding noticing it? In this case you will always reach the point without any valid magnetic field, no?

I think we did the same with the stopped particle no? We abort the CKF propagation and continue as if the propagation is complete.

ad 2:

I don't think we should change our B field interpolation implementation but the user might want to have a zero field outside the interpolation so they could just decorate our interpolation with a default value outside the defined region.

@paulgessinger
Copy link
Member

In principle this should happen when the outermost volume is hit, i.e. there's no destination linked to the boundary surface.

@andiwand andiwand removed the Stale label Jun 15, 2023
@andiwand
Copy link
Contributor

@timadye could you try to isolate the particle/seed which is causing this failure? If the propagation should stop at the end of the detector and it does not I agree that we have an underlying propagation/navigation problem that needs to be fixed.

I am happy to help. Just need to setup ITk locally.

tagging @noemina because of ITk relevance

@timadye
Copy link
Contributor Author

timadye commented Jun 15, 2023

We get 3518 interpolation errors per ttbar+PU200 event, so it's not difficult to reproduce 😄.

Most errors print the initial parameters, eg.

15:23:18    Propagator     ERROR     Step failed with MagneticFieldError:1: Interpolation out of bounds was requested
15:23:18    TrackFinding   ERROR     Propapation failed: MagneticFieldError:1 Interpolation out of bounds was requested with the initial parameters:
    11.7714
   -18.7911
-0.00954445
     3.0233
   0.108137
    682.678
15:23:18    TrackFinding   WARNING   Track finding failed for seed 38 with errorMagneticFieldError:1

I'll try to narrow it down in a smaller case and print more debug. What information do you want?

@andiwand
Copy link
Contributor

I would hope all of them are caused by the same underlying problem but lets see. I fear I also need the surface but at this point it might be easier if I run the ttbar PU200 myself. is this plain full chain itk?

@timadye
Copy link
Contributor Author

timadye commented Jun 15, 2023

yes, default full_chain_itk.py just enabling ttbar_pu200=True. One event is enough! You even see it with ttbar npileup=1 - but with a manageable number of errors.

@paulgessinger
Copy link
Member

I don't think we see this in ODD ttbar pu200, which would point to a navigation issue in the ITk geometry

@andiwand
Copy link
Contributor

andiwand commented Jul 12, 2023

Looked into this today. The underlying bug looks quite significant to me and makes me wonder how the CKF produced any results at all...

Will create a fix and link it here.

#2299

andiwand added a commit that referenced this issue Jul 13, 2023
Apparently our CKF was not able to smooth multiple tracks for a single
seed in some cases. The problem was that we expected the CKF actor to be
called a second time after the first smoothing was done but there is a
Stepper step in between which could kick you out of the detector because
the step length is basically unconstrained.

fixes
- #1484

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
@timadye
Copy link
Contributor Author

timadye commented Jul 19, 2023

This bug is largely fixed by #2299, and completely by #2300. There are no remaining interpolation out of bounds messages from full_chain_itk.py with the ttbar_pu200 option.

Those fixes each improve the full_chain_itk.py tracking efficiency with ttbar_pu200:
trackeff_vs_eta
(I assume the improvement labelled as #2300 is due to that PR, but I actually ran over a recent main that includes both PRs. The original and #2299 points are from running without and with the draft #2299.)

The #2300 improvement is by filtering out particles from the beampipe (or outside), so doesn't indicate any improvement in the Core reco, but still good to see! 😀

@timadye timadye closed this as completed Jul 19, 2023
paulgessinger pushed a commit to paulgessinger/acts that referenced this issue Jul 24, 2023
Apparently our CKF was not able to smooth multiple tracks for a single
seed in some cases. The problem was that we expected the CKF actor to be
called a second time after the first smoothing was done but there is a
Stepper step in between which could kick you out of the detector because
the step length is basically unconstrained.

fixes
- acts-project#1484

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants